Re: [OMPI devel] [RFC] New command line options to replace persistent daemon operations

Ralph Castain Fri, 27 Jul 2007 10:14:07 -0400

On 7/27/07 7:58 AM, "Terry D. Dontje" <terry.don...@sun.com> wrote:

> Ralph Castain wrote:
> 
>> WHAT:   Proposal to add two new command line options that will allow us to
>>        replace the current need to separately launch a persistent daemon to
>>        support connect/accept operations
>> 
>> WHY:    Remove problems of confusing multiple allocations, provide a cleaner
>>        method for connect/accept between jobs
>> 
>> WHERE:  minor changes in orterun and orted, some code in rmgr and each pls
>>        to ensure the proper jobid and connect info is passed to each
>>        app_context as it is launched
>> 
>>  
>> 
> It is my opinion that we would be better off attacking the issues of
> the persistent daemons described below than creating a new set of
> options to mpirun for process placement.  (more comments below on
> the actual proposal).

Non-trivial problems - we haven't figured them out in three years of
occasional effort. It isn't clear that they even -can- be solved when
considering the problem of running in multiple RM-based allocations.

I'll try to provide more detail on the problems when I return from my quick
trip...


> 
>> TIMEOUT: 8/10/07
>> 
>> We currently do not support connect/accept operations in a clean way. Users
>> are required to first start a persistent daemon that operates in a
>> user-named universe. They then must enter the mpirun command for each
>> application in a separate window, providing the universe name on each
>> command line. This is required because (a) mpirun will not run in the
>> background (in fact, at one point in time it would segfault, though I
>> believe it now just hangs), and (b) we require that all applications using
>> connect/accept operate under the same HNP.
>> 
>> This is burdensome and appears to be causing problems for users as it
>> requires them to remember to launch that persistent daemon first -
>> otherwise, the applications execute, but never connect. Additionally, we
>> have the problem of confused allocations from the different login sessions.
>> This has caused numerous problems of processes going to incorrect locations,
>> allocations timing out at different times and causing jobs to abort, etc.
>> 
>> What I propose here is to eliminate the confusion in a manner that minimizes
>> code complexity. The idea is to utilize our so-painfully-developed multiple
>> app_context capability to have the user launch all the interacting
>> applications with the same mpirun command. This not only eliminates the
>> annoyance factor for users by eliminating the need for multiple steps and
>> login sessions, but also solves the problem of ensuring that all
>> applications are running in the same allocation (so we don't have to worry
>> any more about timeouts in one allocation aborting another job).
>> 
>> The proposal is to add two command line options that are associated with a
>> specific app_context (feel free to redefine the name of the option - I don't
>> personally care):
>> 
>> 1. --independent-job - indicates that this app_context is to be launched as
>> an independent job. We will assign it a separate jobid, though we will map
>> it as part of the overall command (e.g., if by slot and no other directives
>> provided, it will start mapping where the prior app_context left off)
>> 
>>  
>> 
> I am unclear on what the --connect option really does. The MPI codes
> actually have to call MPI_Comm_connect to really connect to a process.
> Can we get away with just the above option?


You are right - connect doesn't need to exist. I was thinking it would just
minimize the startup message as I wouldn't bother sharing RTE info across
jobs that weren't "connected". However, for MPI users, this probably would
be confusing, so I would suggest just dropping it. With the routed rml, it
won't have that much impact anyway (I think).
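
To make the mapping behavior of option 1 concrete, here is a minimal sketch of
how by-slot mapping might continue across app_contexts while an independent job
receives its own jobid. This is not ORTE code: the names AppContext,
map_by_slot, slots_per_node and first_jobid are all hypothetical, and the rule
that ranks restart at 0 in each new job is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AppContext:
    """One app_context from a single mpirun command line (hypothetical model)."""
    name: str
    nprocs: int
    # Stands in for the proposed --independent-job option.
    independent_job: bool = False


def map_by_slot(apps: List[AppContext], slots_per_node: int,
                first_jobid: int = 1) -> List[Tuple[int, str, int, int]]:
    """Return (jobid, app name, rank within job, node index) per process.

    The slot cursor is shared by every app_context, so an independent job
    starts mapping where the prior app_context left off. Only the jobid
    and the rank counter are reset (assumed behavior, for illustration).
    """
    placements = []
    jobid = first_jobid
    slot = 0   # global slot cursor across the whole command
    rank = 0   # rank within the current job
    for index, app in enumerate(apps):
        if app.independent_job and index > 0:
            jobid += 1
            rank = 0
        for _ in range(app.nprocs):
            placements.append((jobid, app.name, rank, slot // slots_per_node))
            slot += 1
            rank += 1
    return placements


if __name__ == "__main__":
    apps = [AppContext("server", 3),
            AppContext("client", 2, independent_job=True)]
    for placement in map_by_slot(apps, slots_per_node=2):
        print(placement)
```

In this sketch the client processes land on the slots immediately following
the server processes, which is the "start mapping where the prior app_context
left off" behavior described in option 1, while still carrying a different
jobid.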


> 
>> 2. --connect x,y,z  - only valid when combined with the above option,
>> indicates that this independent job is to be MPI-connected to app_contexts
>> x,y,z (where x,y,z are the number of the app_context, counting from the
>> beginning of the command - you choose if we start from 0 or 1).
>> Alternatively, we can default to connecting to everyone, and then use
>> --disconnect to indicate we -don't- want to be connected.
>> 
>> Note that this means the entire allocation for the combined app_contexts
>> must be provided. This helps the RTE tremendously to keep things straight,
>> and ensures that all the app_contexts will be able to complete (or not) in a
>> synchronized fashion.
>> 
>> It also allows us to eliminate the persistent daemon and multiple login
>> session requirements for connect/accept. That does not mean we cannot have a
>> persistent daemon to create a virtual machine, assuming we someday want to
>> support that mode of operation. This simply removes the requirement that the
>> user start one just so they can use connect/accept.
>> 
>> Comments?
>> 
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>  
>> 
> 