Hello,

Following advice on other branches of this thread, I've managed to get to the 
point where I can spawn the second program. I've also confirmed that I'm 
running in an Open MPI environment during the testing.

I would like to run my next steps by everyone in case anyone with more 
experience knows a better way:

1. Since MPI_COMM_SPAWN returns once the children are launched (it does not 
wait for them to complete), but I need the parent to wait until the child 
finishes, I am thinking the child should send a message to the parent just 
prior to its FINALIZE call, to signal that the parent can pick up the output 
files from the child. Am I right in assuming that this message from the child 
will go to the correct parent process? I ask because the value of "parent" in 
"CALL MPI_COMM_GET_PARENT(parent, ierr)" is the same in all spawned processes.
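To make point 1 concrete, here is a minimal sketch of the child side as I 
understand it (the tag value 99 and the variable names are my own choices; my 
understanding is that although MPI_COMM_GET_PARENT returns the same 
intercommunicator in every child, sends on an intercommunicator are addressed 
to ranks of the *remote* group, so rank 0 here means rank 0 of the parents):

```fortran
! Child: signal the parent just before shutting down.
program child
  use mpi
  implicit none
  integer :: parent, ierr, dummy

  call MPI_INIT(ierr)
  ! Intercommunicator back to the processes that spawned us.
  call MPI_COMM_GET_PARENT(parent, ierr)

  ! ... do the real work and write the output files ...

  ! On an intercommunicator, dest = 0 addresses rank 0 of the PARENT group.
  dummy = 0
  call MPI_SEND(dummy, 1, MPI_INTEGER, 0, 99, parent, ierr)
  call MPI_COMM_DISCONNECT(parent, ierr)
  call MPI_FINALIZE(ierr)
end program child
```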

2. By launching the parent with the "--mca mpi_yield_when_idle 1" option, the 
child should be able to take CPU time from any blocked parent process, thus 
avoiding the busy-poll problem mentioned below. If each host has 4 processors 
and I'm running on 2 hosts (i.e., 8 processors in total), am I also right to 
assume that each spawned child will launch on the same host as its parent?
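Rather than relying on that assumption, I was thinking of pinning the child 
explicitly with the "host" info key that the MPI_COMM_SPAWN man page mentions. 
A sketch of what I have in mind, assuming each parent spawns its own child 
over MPI_COMM_SELF and that MPI_GET_PROCESSOR_NAME returns a host name mpirun 
will accept:

```fortran
! Parent: spawn one child on this parent's own host.
program spawn_local
  use mpi
  implicit none
  integer :: info, child, namelen, ierr
  integer :: errcodes(1)
  character(len=MPI_MAX_PROCESSOR_NAME) :: host

  call MPI_INIT(ierr)

  ! Ask Open MPI to place the child on the same host as this process.
  call MPI_GET_PROCESSOR_NAME(host, namelen, ierr)
  call MPI_INFO_CREATE(info, ierr)
  call MPI_INFO_SET(info, "host", host(1:namelen), ierr)

  ! Each parent spawns its own single child (MPI_COMM_SELF, root 0).
  call MPI_COMM_SPAWN("/home01/group/Execute/DLPOLY.X", MPI_ARGV_NULL, 1, &
                      info, 0, MPI_COMM_SELF, child, errcodes, ierr)
  call MPI_INFO_FREE(info, ierr)

  ! ... wait for the child's completion message, then pick up its files ...

  call MPI_FINALIZE(ierr)
end program spawn_local
```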

Does anyone have any better suggestions? Since I'm quite new to this, I thought 
it might be best to check...
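For the waiting side, rather than a plain MPI_RECV (which Open MPI would 
busy-poll on), I could implement the sleep-and-probe idea from the quoted mail 
below roughly like this ("child" is the intercommunicator returned by my 
MPI_COMM_SPAWN call; SLEEP is a common but non-standard Fortran extension):

```fortran
! Parent fragment: wait for the child's "done" message without spinning hot.
! "child" is the intercommunicator from MPI_COMM_SPAWN.
logical :: done
integer :: status(MPI_STATUS_SIZE), dummy, ierr

done = .false.
do while (.not. done)
  ! Non-blocking check for the child's tag-99 completion message.
  call MPI_IPROBE(MPI_ANY_SOURCE, 99, child, done, status, ierr)
  if (.not. done) call SLEEP(1)   ! non-standard extension; one-second nap
end do
call MPI_RECV(dummy, 1, MPI_INTEGER, status(MPI_SOURCE), 99, child, &
              MPI_STATUS_IGNORE, ierr)
call MPI_COMM_DISCONNECT(child, ierr)
```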

Thanks!

> From: jsquy...@cisco.com
> Date: Fri, 5 Mar 2010 15:02:57 -0500
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] running external program on same processor (Fortran)
> 
> On Mar 5, 2010, at 2:38 PM, Ralph Castain wrote:
> 
> >> CALL SYSTEM("cd " // TRIM(dir) // " ; mpirun -machinefile ./machinefile 
> >> -np 1 /home01/group/Execute/DLPOLY.X > job.out 2> job.err ; cd - > 
> >> /dev/null")
> > 
> > That is guaranteed not to work. The problem is that mpirun sets 
> > environmental variables for the original launch. Your system call carries 
> > over those envars, causing mpirun to become confused.
> 
> You should be able to use MPI_COMM_SPAWN to launch this MPI job.  Check the 
> man page for MPI_COMM_SPAWN; I believe we have info keys to specify things 
> like what hosts to launch on, etc.
> 
> >> Do you think MPI_COMM_SPAWN can help?
> > 
> > It's the only method supported by the MPI standard. If you need it to block 
> > until this new executable completes, you could use a barrier or other MPI 
> > method to determine it.
> 
> I believe that the user said they wanted to use the same cores as their 
> original MPI job occupies for the new job -- they basically want the old job 
> to block until the new job completes.  Keep in mind that OMPI busy-polls 
> waiting for progress, so you might actually get hosed here (two procs 
> competing for time on the same core).
> 
> I'm not immediately thinking of a good way to avoid this issue -- perhaps you 
> could kludge something up such that the parent job polls on sleep() and 
> checking to see if a message has arrived from the child (i.e., the last thing 
> the child does before it calls MPI_FINALIZE is to send a message to its 
> parents and then MPI_COMM_DISCONNECT from its parents).  If the parent finds 
> that it has a message from the child(ren), it can MPI_COMM_DISCONNECT and 
> continue processing.
> 
> Kinda hackey, but it might work...?
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users