Jeff and I were talking about trac 2035 and the handling of mpirun command-line options. While most mpirun options have long, multi-character names prefixed with a double dash, OMPI had originally also wanted to support combinations of short names (e.g., "mpirun -hvq", even if we don't document such combinations) as well as legacy single-dash names (e.g., "-host"). To improve the diagnosability of error messages and to simplify the source code and user documentation, some simplifications seemed in order. Since the command-line parsing is shared not only by mpirun but by all OMPI command-line interfaces, however, Jeff suggested an RFC. So, here goes.
Title: RFC: Drop mpirun Short-Name Combinations

RFC: Drop mpirun Short-Name Combinations

WHAT: No longer support the combination of multiple mpirun short-name (single-character) options into a single argument. E.g., do not allow users to combine mpirun -h -q -v into mpirun -hqv.

Also, no longer describe separate single-dash and double-dash names such as -server-wait-time and --server-wait-time. Simply give one name per option and indicate that it can be prefixed with either a single or a double dash.

WHY: To improve the diagnosability of error messages and simplify the description and support of mpirun options.

WHERE: Basically, in opal/util/cmd_line.c.

WHEN: Upon acceptance.

TIMEOUT: May 7, 2010.


WHY (details)

Definitions

There are three kinds of mpirun option names:

kind of name      prefix  length            example
long name         --      multi-character   --verbose
short name        -       single-character  -v
single-dash name  -       multi-character   -np

Background

We had wanted to support long and short names.

Short names were supposed to be combinable. E.g., instead of ls -l -t -r, just write ls -ltr.

To support backwards compatibility with options that had become well-known from other MPI implementations, we also wanted to support certain single-dash names, such as -np or -host. That is, even though the option starts with a single dash, we would first check whether it is a specially recognized "single-dash" option name. Only if that check failed would we expand the argument further to parse it as a combination of short names.

Obfuscates Error Messages

Unfortunately, the resulting, more complicated grammar leads to misleading error messages. E.g., consider this example from trac 2035:

% mpirun -tag-output -np 4 -nperslot 1 -H saem9,saem10 hostname
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: -p
Node: saem9

while attempting to start process rank 0.
--------------------------------------------------------------------------

The point of the ticket was mostly that a misspelled option is handled as an unfound executable, but it also points out that we end up reporting on an option (-p) that, from the user's perspective, isn't even on the command line in the first place. What has happened is that the option (-nperslot) was not recognized as a whole, but its first character (n) was recognized as a short name, so the argument was parsed as a combination of short names, and one of those short names (-p) was not recognized.
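The fallback described above can be sketched in Python. This is a simplified model, not the logic in opal/util/cmd_line.c: the option tables are hypothetical, every option is treated as a flag that takes no argument, and the function name classify is made up for illustration.

```python
# Hypothetical option tables for illustration only; not the real mpirun set.
SINGLE_DASH_NAMES = {"np", "host", "tag-output"}
SHORT_NAMES = {"n", "h", "q", "v"}


def classify(arg):
    """Return a list of (status, token) pairs for one single-dash argument."""
    body = arg[1:]
    # Step 1: try the argument as a whole name.
    if body in SINGLE_DASH_NAMES or body in SHORT_NAMES:
        return [("ok", arg)]
    # Step 2: if the first character is a known short name, expand the
    # argument into a combination of short names.
    if body[:1] in SHORT_NAMES:
        result = []
        for ch in body:
            if ch in SHORT_NAMES:
                result.append(("ok", "-" + ch))
            else:
                # The first unknown letter is what ends up being reported.
                result.append(("unrecognized", "-" + ch))
                break
        return result
    # Step 3: nothing matched; report the argument as typed.
    return [("unrecognized", arg)]


print(classify("-nperslot"))  # [('ok', '-n'), ('unrecognized', '-p')]
```

Under these assumptions the misspelled -nperslot is reported as -p, a token the user never typed, which mirrors the confusing message in the ticket.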

There are different ways of cleaning all of this up, but a simple solution is just not to support short-name combinations.
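As a rough illustration of that simpler rule, here is a minimal Python sketch in which an argument is looked up once, as a whole, and never split into letters. The option set and the name check_option are hypothetical, and argument handling is omitted.

```python
# Hypothetical option names for illustration only; not the real mpirun set.
KNOWN_NAMES = {"np", "host", "tag-output", "n", "h", "q", "v"}


def check_option(arg):
    """Look up a single-dash argument as a whole name; never split it."""
    name = arg[1:]  # drop the leading dash
    if name in KNOWN_NAMES:
        return ("ok", arg)
    # The error now names exactly what the user typed.
    return ("unrecognized", arg)


print(check_option("-nperslot"))  # ('unrecognized', '-nperslot')
print(check_option("-hqv"))       # ('unrecognized', '-hqv')
```

The cost is that a combination such as -hqv is rejected; the benefit is that a misspelled option is always reported in the form the user entered.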

Fringe Functionality

The ability to combine short names into a single "-" option is fringe functionality for mpirun anyhow.

We don't document this ability in the first place.

Further, we don't have that many short names -- 10, out of a total of 82 options -- and many combinations don't make much sense. The ability to combine options makes most sense for utilities that mostly use short option names, and then only when those options take no arguments. E.g., ls -ltr in place of ls -l -t -r. The mpirun options just aren't like that.

Simplify Single-/Double-Dash Usage

We were going to support single-dash (multi-character) names only sparingly and only for backwards compatibility with well-established options from other MPIs. In reality, we routinely add a single-dash name for each new option we introduce.

We end up having both single-dash and double-dash names, making both source code and user documentation less readable.

However, ultimately the source code doesn't even check these distinctions when parsing the mpirun command line. For example, we go to the effort in our source code and user documentation to distinguish between -server-wait-time and --server-wait-time, and between -rf and --rankfile. When options are parsed, however, we disregard any such distinctions. E.g., --rf and -rankfile are recognized.
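The "one name per option, either prefix" idea could be modeled as follows. This is only a sketch: the table contents, the treatment of rf as an alias of rankfile, and the function name resolve are assumptions for illustration, not a description of the existing parser.

```python
# Hypothetical table: one canonical name per option, plus optional aliases.
OPTIONS = {
    "rankfile": ["rf"],
    "server-wait-time": [],
}

# Map every accepted spelling to its canonical option name.
LOOKUP = {}
for canonical, aliases in OPTIONS.items():
    for name in [canonical] + aliases:
        LOOKUP[name] = canonical


def resolve(arg):
    """Return the canonical option name, or None if arg is not a known option."""
    if arg.startswith("--"):
        name = arg[2:]
    elif arg.startswith("-"):
        name = arg[1:]
    else:
        return None  # no dash prefix, so not an option at all
    return LOOKUP.get(name)


print(resolve("--rf"))               # rankfile
print(resolve("-rankfile"))          # rankfile
print(resolve("-server-wait-time"))  # server-wait-time
```

With a lookup like this, the documentation only needs to list each name once, since the number of leading dashes never affects which option is selected.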

Other Issues

The command-line parser is not only for mpirun, however, but for all OMPI command-line interfaces. Hence, this RFC.
