Re: [OMPI devel] MPI_Comm_spawn[_multiple] and orted

Pak Lui Wed, 31 May 2006 15:31:36 -0400

Ralph Castain wrote:

Hi Pak
I'm afraid I don't fully understand your question, so forgive me if I don't seem to address the problem adequately. As I understand it, you are asking about the scenario where someone wants to execute multiple calls of mpirun, with the applications executing on the same set of nodes. Your question is: why does OpenRTE spawn a new daemon (orted) on the node each time we execute mpirun - why doesn't it just use the existing one to launch the new application process(es)?
Assuming I have the question right, the short answers are "may not be permitted" and "not yet implemented". :-)

Yes, Ralph, that is precisely the question. Good thing that you've figured that out :)

First, the fact that an orted already exists on a node is not sufficient to allow us to use it again for another application. The orted must be persistent or else we do not allow a new application to re-use it. This is required because the existing orted will go away when its original application is done executing - if we use it as our parent to launch another child, then the new application process will "die" when the original one completes. Obviously, that isn't desirable.

Okay. I used to think that orted was able to stay around and fork other processes; I didn't realize that orted goes away once the parent process finishes.

Second, even though you can launch persistent orteds today, none of the current components in the resource management subsystems actually know how to use them yet. This is something we planned to implement in the future, but there simply hasn't been time to do so yet.
So the bottom line is that there really is no way around the need to launch a new orted on each node every time the user issues an mpirun command.
I hope that answers your question. If not, please don't hesitate to let me know.

Thanks for pointing out these issues. I was hoping that something I didn't know might solve my problem. I guess there may not be a good workaround for this limitation due to SGE slots. We could try to track and set some upper limit for the number of times that qrsh can exec, before the spawn program uses up all the available SGE slots and errors out.
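
As a rough illustration of the tracking idea above, here is a minimal C sketch of a per-node slot counter. It assumes that one qrsh invocation consumes exactly one SGE slot and that the slot becomes available again when the launched orted exits. The type `slot_budget` and its functions are hypothetical names made up for this example; they are not part of Open MPI or SGE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Open MPI or SGE: tracks how many
 * qrsh launches are still allowed on one node. Assumes one qrsh
 * invocation consumes exactly one SGE slot. */
typedef struct {
    int total_slots; /* slots the user allocated on this node */
    int used_slots;  /* qrsh launches currently holding a slot */
} slot_budget;

void slot_budget_init(slot_budget *b, int total_slots)
{
    assert(b != NULL && total_slots >= 0);
    b->total_slots = total_slots;
    b->used_slots = 0;
}

/* Returns true and reserves a slot if one is free; returns false
 * (reserving nothing) once the limit is reached, so the caller can
 * refuse the spawn instead of letting qrsh error out. */
bool slot_budget_try_acquire(slot_budget *b)
{
    assert(b != NULL);
    if (b->used_slots >= b->total_slots) {
        return false;
    }
    b->used_slots++;
    return true;
}

/* Gives a slot back; assumed to be called when the launched orted exits. */
void slot_budget_release(slot_budget *b)
{
    assert(b != NULL);
    if (b->used_slots > 0) {
        b->used_slots--;
    }
}
```

A launcher could call `slot_budget_try_acquire()` before each qrsh and report a clean "no slots left" error when it returns false, rather than failing partway through a spawn.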

Ralph



Pak Lui wrote:
Hi,

When I run a spawn program over rsh/ssh, I notice that each time the
child program gets spawned, it will need to establish a new rsh/ssh
connection to the remote node to launch orted on that node, even though
the parent executable and the orted are running on that node.

So I wonder if there is any way that we can use the parent orted to
launch the child program if they happen to be on the same node?
I tried to compare the spawn program to the scenario where I run multiple executables in one mpirun command. For this run, I establish only one connection to the remote node, and both executables share the same remote connection.
% ./mpirun -np 2 -host burl-ct-v440-5 -prefix `pwd`/.. sleep 12 : -np 2 sleep 10
Password:

15015 /workspace/paklui/ompi/trunk/builds/sparc32-g/bin/../bin/orted --bootprox
   15017 sleep 12
   15019 sleep 12
   15021 sleep 10
   15023 sleep 10
The reason that I want to find out whether orted can launch child executable(s) without having to establish a new connection is that the number of times that I can run 'qrsh' in SGE (or N1GE) actually depends on the number of slots that the user initially allocated. The slot number corresponds to the number of CPUs on a node. Each slot allows one 'qrsh' connection.
The issue is when I try to run a spawn job on a single node, or a cluster of many 1-cpu nodes under SGE. The number of times that the program can spawn is limited by 'qrsh', which forbids the child program from connecting to the same node where the parent executable's orted might already be running.
I am curious to see if I can find some solution to the problem here. I am also looking to see if there are some tricks in SGE to get around this issue, but the workarounds I can see aren't pretty. So I welcome your questions, comments or suggestions on this.



--

Thanks,

- Pak Lui
pak....@sun.com
