WHAT:   Proposal to add two new command line options that eliminate the
        current need to separately launch a persistent daemon to support
        connect/accept operations

WHY:    Remove the problems caused by confusion between multiple
        allocations, provide a cleaner method for connect/accept between jobs

WHERE:  minor changes in orterun and orted, some code in rmgr and each pls
        to ensure the proper jobid and connect info is passed to each
        app_context as it is launched

TIMEOUT: 8/10/07

We currently do not support connect/accept operations in a clean way. Users
are required to first start a persistent daemon that operates in a
user-named universe. They then must enter the mpirun command for each
application in a separate window, providing the universe name on each
command line. This is required because (a) mpirun will not run in the
background (in fact, at one point in time it would segfault, though I
believe it now just hangs), and (b) we require that all applications using
connect/accept operate under the same HNP.

This is burdensome and appears to be causing problems for users as it
requires them to remember to launch that persistent daemon first -
otherwise, the applications execute, but never connect. Additionally, we
have the problem of confused allocations from the different login sessions.
This has caused numerous problems of processes going to incorrect locations,
allocations timing out at different times and causing jobs to abort, etc.

What I propose here is to eliminate the confusion in a manner that minimizes
code complexity. The idea is to utilize our so-painfully-developed multiple
app_context capability to have the user launch all the interacting
applications with the same mpirun command. This not only eliminates the
annoyance factor for users by eliminating the need for multiple steps and
login sessions, but also solves the problem of ensuring that all
applications are running in the same allocation (so we don't have to worry
any more about timeouts in one allocation aborting another job).

The proposal is to add two command line options, each associated with a
specific app_context (feel free to rename the options - I don't personally
care):

1. --independent-job - indicates that this app_context is to be launched as
an independent job. We will assign it a separate jobid, though we will map
it as part of the overall command (e.g., if mapping by slot with no other
directives provided, it will start mapping where the prior app_context left
off).

2. --connect x,y,z - only valid when combined with the above option,
indicates that this independent job is to be MPI-connected to app_contexts
x,y,z (where x,y,z are the numbers of the app_contexts, counting from the
beginning of the command - you choose if we start from 0 or 1).
Alternatively, we can default to connecting to everyone, and then use
--disconnect to indicate we -don't- want to be connected.
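
To make the two options concrete, here is a minimal Python sketch of how a
command line could be split into app_contexts and how the options might be
interpreted. It is only an illustration, not the orterun implementation. The
option names come from the proposal above; the AppContext class, the
parse_app_contexts function, the 0-based numbering of app_contexts, and the
jobid values (0 for the shared job, 1, 2, ... for independent jobs) are all
assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AppContext:
    index: int                 # position on the command line (0-based: an assumption)
    argv: list                 # arguments left over after the proposed options
    independent: bool = False  # True if --independent-job was given
    connect: list = field(default_factory=list)  # app_context numbers to connect to
    jobid: int = 0             # hypothetical jobid; 0 means the shared default job


def parse_app_contexts(args):
    """Split an mpirun-style argument list on ':' and read the two proposed options."""
    # One chunk of arguments per app_context.
    chunks, current = [], []
    for arg in args:
        if arg == ":":
            chunks.append(current)
            current = []
        else:
            current.append(arg)
    chunks.append(current)

    contexts = []
    next_jobid = 1  # assumption: independent jobs are numbered 1, 2, ...
    for index, chunk in enumerate(chunks):
        ctx = AppContext(index=index, argv=[])
        it = iter(chunk)
        for arg in it:
            if arg == "--independent-job":
                ctx.independent = True
            elif arg == "--connect":
                value = next(it, None)
                if value is None:
                    raise ValueError("--connect requires a list such as 0,2")
                ctx.connect = [int(v) for v in value.split(",")]
            else:
                ctx.argv.append(arg)
        # The proposal only allows --connect together with --independent-job.
        if ctx.connect and not ctx.independent:
            raise ValueError("--connect is only valid with --independent-job")
        if ctx.independent:
            ctx.jobid = next_jobid
            next_jobid += 1
        contexts.append(ctx)

    # Every --connect target must name another app_context on the same command.
    for ctx in contexts:
        for target in ctx.connect:
            if target == ctx.index or not 0 <= target < len(contexts):
                raise ValueError(
                    f"app_context {ctx.index}: bad --connect target {target}")
    return contexts
```

With this sketch, a hypothetical command such as
"-np 4 ./server : --independent-job --connect 0 -np 2 ./client" yields two
app_contexts: the first stays in the shared job, and the second gets its own
jobid and a connect list naming app_context 0.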

Note that this means the entire allocation for the combined app_contexts
must be provided. This helps the RTE tremendously to keep things straight,
and ensures that all the app_contexts will be able to complete (or not) in a
synchronized fashion.
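
The by-slot behavior described for --independent-job, together with the
requirement that one allocation covers everything, can be sketched as
follows. This is a simplified illustration, not the actual mapper: the
map_by_slot function and its arguments are hypothetical, and treating an
undersized allocation as an error (rather than oversubscribing) is an
assumption made to keep the example short.

```python
def map_by_slot(allocation, proc_counts):
    """Map each app_context's processes onto nodes by slot, in command-line order.

    allocation:  list of (node_name, slots) pairs covering the whole command
    proc_counts: number of processes requested by each app_context
    """
    # Expand the allocation into one entry per slot.
    slots = [node for node, count in allocation for _ in range(count)]

    # Assumption: the combined app_contexts must fit in the single allocation.
    if sum(proc_counts) > len(slots):
        raise ValueError("allocation is too small for the combined app_contexts")

    placements, cursor = [], 0
    for nprocs in proc_counts:
        # Each app_context starts mapping where the previous one left off.
        placements.append(slots[cursor:cursor + nprocs])
        cursor += nprocs
    return placements
```

For example, with an allocation of 2 slots on one node and 4 on another, an
app_context of 3 processes followed by an independent job of 2 processes
would fill the first node, spill one process onto the second, and place the
independent job on the next two slots of the second node.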

It also allows us to eliminate the persistent daemon and multiple login
session requirements for connect/accept. That does not mean we cannot have a
persistent daemon to create a virtual machine, assuming we someday want to
support that mode of operation. This simply removes the requirement that the
user start one just so they can use connect/accept.

Comments?

