Re: [O-MPI users] Configuring process startup on OS X

Brian Barrett Sat, 4 Feb 2006 12:46:13 -0500

On Jan 30, 2006, at 3:39 PM, Drew McCormack wrote:

Hello Brian,
I sent the email below several days ago. I thought maybe you had missed it, so I'm sending it again. Maybe you deliberately ignored it or didn't have time to answer. That's fine, feel free to ignore it again ;-)

Sorry about the slow response, but last week was our quarterly Open MPI meeting. Responses to mailing lists are (unfortunately) a bit slower when we're all spending really long days trying to improve Open MPI :).

Thanks for the information. I have been playing with OpenMPI and Xgrid a little this week, and hadn't had much luck. This email helps a lot.
The XGrid starter currently looks for a couple of environment
variables to decide if it can be used.  Currently, the XGrid process
starter only supports the basic password authentication to the
controller.  As such, the two environment variables the XGrid starter
looks for are XGRID_CONTROLLER_HOSTNAME and
XGRID_CONTROLLER_PASSWORD.  These are the same environment variables
that the 'xgrid' command-line submission process uses.
Do you mean on the client/submission machine, or the agent machines where the applications are run?
I guess you mean the client, right?
So, I guess I have to make sure I set these environment variables, rather than just using the -p and -h xgrid command options.

Correct, these should be set on the client / submission machine. The agent machines have their own authentication mechanism to connect to the controller.

The reason I am a little confused is that I am pretty sure with our other MPI implementations, that mpirun gets called on the computational node after the queueing system has started the job running. What you seem to be indicating is that mpirun replaces the queueing system call in this case, and is issued from the submission node.

Yes, this is one place that our XGrid support is a little different than other batch queuing systems. At the time that our XGrid support was written, XGrid did not provide a mechanism for running a script in the traditional sense (like PBS or SGE might do). We might eventually support a different submission mechanism to better allow integration of Open MPI into user applications, but this is still in the development phase.

We would love to hear any comments on how people want to use Open MPI and XGrid together. I can't promise we'll implement all the ideas, but it will help give us direction on where to spend our development time.

The restriction that Open MPI be installed on all nodes is a slightly
more difficult problem.  Open MPI usually builds as a shared library
with a bunch of dynamically loaded shared objects, complicating the
list of what must be migrated.  Even if statically linked, there is
still a helper process we have to migrate out with your application
(to deal with standard I/O in the expected way, along with some other
features that are much easier to implement with a helper daemon).
I am happy to install OpenMPI everywhere at this point, but in the long run, it would be great to be able to run OpenMPI/Xgrid apps without requiring preinstalled components, even if the daemon needs to be sent via the network.

Yes - this is probably a year or so out, but it is something I would like to implement. Of course, it would mean giving up the ability to build Open MPI as a set of shared libraries, so some flexibility would be lost in the process.

To use the XGrid system, make sure that the XGrid controller is
properly configured to use password-based authentication.  Then
issue the following commands (assuming tcsh):

     % setenv XGRID_CONTROLLER_HOSTNAME mycomputer.apple.com
     % setenv XGRID_CONTROLLER_PASSWORD pword
     % mpirun -np X ./myapp
I am assuming this is from the client/submission machine. So mpirun replaces the xgrid command. I guess I never need to use the xgrid command for OpenMPI/Xgrid jobs (?)


Correct - you use mpirun instead of the xgrid command to submit jobs.
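
For readers using a Bourne-style shell (sh, bash, zsh) rather than tcsh, the setup above might look roughly like the sketch below. The hostname and password are the placeholder values from this thread. The xgrid_env_ready helper is a hypothetical name introduced only for illustration and is not part of Open MPI or XGrid. The mpirun line is left commented out so the snippet can be tried on a machine without Open MPI.

```shell
# Bourne-style equivalent of the tcsh setenv lines quoted in this thread.
# Both values are placeholders; substitute your own controller and password.
export XGRID_CONTROLLER_HOSTNAME=mycomputer.apple.com
export XGRID_CONTROLLER_PASSWORD=pword

# Hypothetical helper (not part of Open MPI): succeeds only when both
# variables that the XGrid starter looks for are set and non-empty.
xgrid_env_ready() {
    [ -n "${XGRID_CONTROLLER_HOSTNAME:-}" ] && [ -n "${XGRID_CONTROLLER_PASSWORD:-}" ]
}

if xgrid_env_ready; then
    echo "XGrid controller: $XGRID_CONTROLLER_HOSTNAME"
    # On the client / submission machine with Open MPI installed:
    # mpirun -np 2 ./myapp
else
    echo "XGRID_CONTROLLER_HOSTNAME / XGRID_CONTROLLER_PASSWORD not set" >&2
fi
```

The check runs on the client / submission machine only, since that is where the variables are read.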

If this is the case, my next question is, how do I supply the usual xgrid options, such as working directory, standard input file, etc.? Or is that simply not possible? Do I simply have to have some other way (e.g. ssh) to get files to/from agent machines, like I would for a batch system like PBS?

It looks like I never implemented those options (shame on me). I've added that to my to-do list, although I can't give an accurate timetable for implementation at this point. One thing to note is that rather than using XGrid's standard input/output forwarding services, we use Open MPI's services. So if you do:


  mpirun -np 2 ./myapp < foo.txt

the result will be that rank 0 in the job will be able to read the contents of foo.txt from standard input. There is a small bug in this that will be fixed in 1.0.2 (that applies to all scenarios, not just XGrid). Also, all standard output / standard error is output through mpirun's standard output / standard error, so you can do something like:


  mpirun -np 2 ./myapp < foo.txt > bar.txt

to redirect foo.txt into rank 0's standard input and redirect all standard output from all ranks into the file bar.txt.
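
Because the ranks' standard error also arrives on mpirun's standard error, ordinary shell redirection should be able to keep the two streams in separate files. The sketch below assumes a Bourne-style shell (the 2> syntax does not work in tcsh). The run_redirected function and the err.txt file name are hypothetical, introduced only for illustration. cat stands in for mpirun so the snippet runs without Open MPI installed.

```shell
# Hypothetical wrapper: run a command with stdin taken from one file and
# stdout / stderr captured in two separate files.
run_redirected() {
    infile=$1
    outfile=$2
    errfile=$3
    shift 3
    "$@" < "$infile" > "$outfile" 2> "$errfile"
}

# With Open MPI installed, the call would look like:
#   run_redirected foo.txt bar.txt err.txt mpirun -np 2 ./myapp

# Stand-in demonstration using cat in place of mpirun.
demo_dir=$(mktemp -d)
printf 'sample input\n' > "$demo_dir/foo.txt"
run_redirected "$demo_dir/foo.txt" "$demo_dir/bar.txt" "$demo_dir/err.txt" cat
cat "$demo_dir/bar.txt"
```

In tcsh the usual idiom for the same split is (mpirun -np 2 ./myapp < foo.txt > bar.txt) >& err.txt.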

If I can get it all working, I will write up a few instructions on my web site, which may take the pressure off you to generate some docs.

Thanks. I don't think it gets me off the hook with my boss ;), but the more resources, the better for the Mac community.



Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/
