On Mar 5, 2010, at 8:52 AM, abc def wrote:

> Hello,
> From within the MPI fortran program I run the following command:
>
> CALL SYSTEM("cd " // TRIM(dir) // " ; mpirun -machinefile ./machinefile -np 1 /home01/group/Execute/DLPOLY.X > job.out 2> job.err ; cd - > /dev/null")

That is guaranteed not to work. The problem is that mpirun sets environmental variables for the original launch. Your system call carries over those envars, causing mpirun to become confused.

>
> where "dir" is a process-number-dependent directory, to ensure the processes
> don't over-write each other, and machinefile is written earlier by using
> hostname to obtain the node of the current process, so this new program
> launches using the same node as the one that launches it.
>
> In fortran, the program automatically waits until the system call is complete.
>
> Since you mentioned MPI_COMM_SPAWN, I looked into this. I read that it's
> non-blocking, so somehow I'd have to prevent the program from moving forwards
> until it was complete, and secondly, I need to "cd" into the directory I
> mentioned above, before running the external command, and I don't know how
> one would achieve this with this command.
>
> Do you think MPI_COMM_SPAWN can help?

It's the only method supported by the MPI standard. If you need it to block until this new executable completes, you could use a barrier or other MPI method to determine it.

> I appreciate your time.
>
> From: r...@open-mpi.org
> Date: Fri, 5 Mar 2010 07:55:59 -0700
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] running external program on same processor (Fortran)
>
> How are you trying to start this external program? With an MPI_Comm_spawn? Or
> are you just fork/exec'ing it?
>
> How are you waiting for this external program to finish?
>
> On Mar 5, 2010, at 7:52 AM, abc def wrote:
>
> Hello,
>
> Thanks for the comments. Indeed, until yesterday, I didn't realise the
> difference between MVAPICH, MVAPICH2 and Open-MPI.
>
> This problem has moved from mvapich2 to open-mpi now however, because I now
> realise that the production environment uses Open-MPI, which means my
> solution for mvapich2 doesn't work now.
> So if I may ask your kind assistance:
>
> Just to re-cap, I have an MPI fortran program, which runs on N nodes, and
> each node needs to run an external program. This external program was
> written for MPI, although I want to run it in serial as part of the process
> on each node.
>
> Is there any way at all to launch this external MPI program so it's treated
> simply as a (serial) extension of the current node's processes? As I say, the
> MPI originating program simply waits for the external program to finish
> before continuing, so it's essentially a bit like a "subroutine", in that
> sense.
>
> I'm having problems launching this external program from within my MPI
> program, under the open-mpi system, even without worrying about node
> assignment, and I think this might be because the system detects that I'm
> trying to launch another process from one of the nodes, and stops it. I'm
> guessing here, but it simply stops with an error saying the MPI process was
> stopped.
>
> Any help is very much appreciated; I have been going at this for more than a
> day now and don't seem to be getting anywhere.
>
> Thank you!
>
> From: r...@open-mpi.org
> Date: Wed, 3 Mar 2010 07:24:32 -0700
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] running external program on same processor (Fortran)
>
> It also would have been really helpful to know that you were using MVAPICH
> and -not- Open MPI as this mailing list is for the latter. We could have
> directed you to the appropriate place if we had known.
>
>
> On Mar 3, 2010, at 5:17 AM, abc def wrote:
>
> I don't know (I'm a little new to this area), but I figured out how to get
> around the problem:
>
> Using SGE and MVAPICH2, the "-env MV2_CPU_MAPPING 0:1....." option in mpiexec
> seems to do the trick.
>
> So when calling the external program with mpiexec, I map the called process
> to the current core rank, and it seems to stay distributed and separated as I
> want.
>
> Hope someone else finds this useful in the future.
>
> > Date: Wed, 3 Mar 2010 13:13:01 +1100
> > Subject: Re: [OMPI users] running external program on same processor (Fortran)
> >
> > Surely this is the problem of the scheduler that your system uses,
> > rather than MPI?
> >
> >
> > On Wed, 2010-03-03 at 00:48 +0000, abc def wrote:
> > > Hello,
> > >
> > > I wonder if someone can help.
> > >
> > > The situation is that I have an MPI-parallel fortran program. I run it
> > > and it's distributed on N cores, and each of these processes must call
> > > an external program.
> > >
> > > This external program is also an MPI program, however I want to run it
> > > in serial, on the core that is calling it, as if it were part of the
> > > fortran program. The fortran program waits until the external program
> > > has completed, and then continues.
> > >
> > > The problem is that this external program seems to run on any core,
> > > and not necessarily the (now idle) core that called it. This slows
> > > things down a lot as you get one core doing multiple tasks.
> > >
> > > Can anyone tell me how I can call the program and ensure it runs only
> > > on the core that's calling it? Note that there are several cores per
> > > node. I can ID the node by running the hostname command (I don't know
> > > a way to do this for individual cores).
> > >
> > > Thanks!
> > >
> > > ====
> > >
> > > Extra information that might be helpful:
> > >
> > > If I simply run the external program from the command line (ie, type
> > > "/path/myprogram.ex <enter>"), it runs fine. If I run it within the
> > > fortran program by calling it via
> > >
> > > CALL SYSTEM("/path/myprogram.ex")
> > >
> > > it doesn't run at all (doesn't even start) and everything crashes. I
> > > don't know why this is.
> > >
> > > If I call it using mpiexec:
> > >
> > > CALL SYSTEM("mpiexec -n 1 /path/myprogram.ex")
> > >
> > > then it does work, but I get the problem that it can go on any core.

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
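
For readers following up on the MPI_COMM_SPAWN suggestion in the reply at the top of this thread, here is a minimal Fortran sketch of how the "cd" and the waiting might be handled. It is an illustration under stated assumptions, not something posted or tested by anyone in the thread. The directory naming ("run_<rank>") is hypothetical and stands in for "dir" in the original question. "wdir" is an info key reserved by the MPI standard for the working directory of spawned processes, but an implementation is allowed to ignore info keys, so check that yours honours it. The barrier only returns if the spawned program also takes part in it, which would require a small change to the external program.

```fortran
! Sketch only: each rank spawns one copy of the external MPI program.
! Assumptions (not from the thread): the "run_<rank>" directory name, and
! that the spawned program joins the barrier on the parent intercommunicator.
program spawn_sketch
  use mpi
  implicit none
  integer :: ierr, rank, info, child
  integer :: errcodes(1)
  character(len=64) :: dir

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  ! Hypothetical per-process directory, standing in for "dir" in the thread.
  write(dir, '(A,I0)') 'run_', rank

  ! Ask for the child to start in that directory (replaces the "cd").
  call MPI_INFO_CREATE(info, ierr)
  call MPI_INFO_SET(info, 'wdir', trim(dir), ierr)

  ! MPI_COMM_SELF as the spawning communicator: every rank launches its own
  ! single child and gets its own intercommunicator back in "child".
  call MPI_COMM_SPAWN('/home01/group/Execute/DLPOLY.X', MPI_ARGV_NULL, 1, &
                      info, 0, MPI_COMM_SELF, child, errcodes, ierr)
  call MPI_INFO_FREE(info, ierr)

  ! Wait for the child. This hangs unless the child calls MPI_BARRIER on the
  ! intercommunicator it obtains from MPI_COMM_GET_PARENT.
  call MPI_BARRIER(child, ierr)
  call MPI_COMM_DISCONNECT(child, ierr)

  call MPI_FINALIZE(ierr)
end program spawn_sketch
```

On the child side, the matching step would be a call to MPI_COMM_GET_PARENT followed by MPI_BARRIER on the returned intercommunicator just before MPI_FINALIZE. The sketch does not reproduce the "> job.out 2> job.err" redirection from the original command; the spawned program would have to open its own output files.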