Thank you, Doug, Ralph, and Mattijs, for the helpful input.  Some replies to 
Ralph's message and a question are inlined below. -- Brian

> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Monday, October 20, 2008 5:38 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI runtime-specific
> environment variable?
>
> It depends on what you are trying to do. If you intend to use
> this solely with OMPI 1.2.x, then you could use some of
> those. However, they are risky as they are in general
> internal to OMPI's infrastructure - and thus, subject to
> change from one release to another.

OK -- it sounds like the variables I called out aren't good choices.

> We do have some environmental variables that we guarantee to
> be "stable" across releases. You could look for
> OMPI_COMM_WORLD_SIZE, or OMPI_UNIVERSE_SIZE (there are a
> couple of others as well, but any of these would do).

Q: I just wrote a simple C++ program that includes mpi.h and uses getenv() to 
check for these two variables, compiled with the mpicxx wrapper (openmpi-1.2.5 
as distributed with RHEL5).  When I run this program with orterun, both 
variables come back NULL from the environment.  The same is true if I just 
orterun a shell script that dumps the environment to a file.  Am I making an 
obvious mistake here?
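
For concreteness, the test was essentially this (a minimal sketch; error 
handling omitted, and the variable names are the ones Ralph gave):

    // Minimal sketch of the check: print the two variables Ralph
    // named, then MPI's own view of the world size for comparison.
    #include <cstdio>
    #include <cstdlib>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        const char *world = std::getenv("OMPI_COMM_WORLD_SIZE");
        const char *univ  = std::getenv("OMPI_UNIVERSE_SIZE");
        std::printf("OMPI_COMM_WORLD_SIZE = %s\n", world ? world : "(null)");
        std::printf("OMPI_UNIVERSE_SIZE   = %s\n", univ ? univ : "(null)");

        MPI_Init(&argc, &argv);
        int size = 0;
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        std::printf("MPI_Comm_size reports %d\n", size);
        MPI_Finalize();
        return 0;
    }

Built with mpicxx and run under orterun, both getenv() calls come back NULL 
here.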

> However, these will only tell you that the job was launched
> via OMPI's mpirun - it won't tell you that it was a parallel
> job. It could be a serial job that just happened to be
> launched by mpirun. For example, we set the same
> environmental params when we execute "mpirun hostname" -
> mpirun has no way of knowing the parallel vs serial nature of
> the app it is launching, so it sets all the variables
> required by a parallel job just-in-case.

Understood -- we have some other logic to (hopefully) handle this case.

> Likewise, these variables will only tell you it is a parallel
> job launched by OMPI. If you use another MPI (e.g., MVAPICH),
> none of these would be set - yet it would still be a parallel job.

Also understood.  While ultimately we'll probably redesign the code base, right 
now we have tests specific to each MPI implementation for which we have known 
use cases.  So adding an OpenMPI-specific test is actually what I'm after in 
the short term.

> So it boils down to your particular mode of operation. If you
> only run with OMPI, and you would only launch via OMPI's
> mpirun if you wanted to execute in a parallel mode, then you
> could look for either of those two environmental params.
> Otherwise, you may have to do as Doug suggests and create
> your own "flag".

Doug is right that we could use an additional command-line flag to indicate MPI 
runs, but at this point we're trying to hide that from the user: all they 
should have to do is run the binary directly vs. orterun/mpirun the binary, and 
we detect whether it's a serial or parallel run.
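
Concretely, the detection I'm hoping for is something like this sketch (the 
helper name is ours; the variables are the ones Ralph listed, modulo the 1.2.5 
question above):

    #include <cstdlib>

    // Sketch: true if the process appears to have been launched by
    // Open MPI's mpirun/orterun. Which variables are actually
    // exported evidently varies by release, per my question above.
    static bool launched_by_open_mpi()
    {
        return std::getenv("OMPI_COMM_WORLD_SIZE") != NULL ||
               std::getenv("OMPI_UNIVERSE_SIZE") != NULL;
    }

The binary would then call MPI_Init only when this (combined with our other 
per-implementation tests) indicates a parallel run.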

As for parsing the command line (argv) before MPI_Init, I don't think it will 
help here.  While MPICH implementations typically left args like -p4pg and 
-p4amslave on the command line, I don't see anything similar coming from 
OpenMPI-launched jobs.
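
For reference, the MPICH-era check was roughly the following (a sketch only; 
the flag names are the ones mentioned above, and the helper is illustrative):

    #include <cstring>

    // Sketch of the old argv scan for MPICH (p4-style) launches.
    static bool looks_like_mpich_p4(int argc, char **argv)
    {
        for (int i = 1; i < argc; ++i)
            if (std::strcmp(argv[i], "-p4pg") == 0 ||
                std::strcmp(argv[i], "-p4amslave") == 0)
                return true;
        return false;
    }

Nothing comparable shows up in argv for OpenMPI-launched jobs, so the 
environment-variable route looks like the only option there.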

Brian

