How are you trying to start this external program? With an MPI_Comm_spawn? Or 
are you just fork/exec'ing it?

How are you waiting for this external program to finish?
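
For reference, here is a minimal sketch of what "fork/exec'ing it and waiting" usually means at the POSIX level. It is an illustration only, not a statement about what your program does or about what Open MPI supports from inside an MPI process. The helper name run_and_wait is hypothetical, and all MPI calls are deliberately left out. A Fortran CALL SYSTEM typically ends up doing something similar through the C library, but that depends on the compiler.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of any MPI library): start argv[0] as a
 * child process and block until it exits. Returns the child's exit code,
 * or -1 if the child could not be started or did not exit normally. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the external program. */
        execvp(argv[0], argv);
        perror("execvp");      /* reached only if exec failed */
        _exit(127);
    }

    /* Parent: wait for that specific child, retrying if a signal
     * interrupts the call. */
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build an argument vector such as { "/path/myprogram.ex", NULL } and check the return value. Knowing which of these steps fails in your case (the child never starts, or the parent never sees it finish) would help narrow things down.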

On Mar 5, 2010, at 7:52 AM, abc def wrote:

> Hello,
> 
> Thanks for the comments. Indeed, until yesterday, I didn't realise the 
> difference between MVAPICH, MVAPICH2 and Open MPI.
> 
> However, the problem has now moved from MVAPICH2 to Open MPI: I now realise 
> that the production environment uses Open MPI, so my MVAPICH2 solution 
> doesn't work there. So if I may ask your kind assistance:
> 
> Just to recap, I have an MPI Fortran program, which runs on N nodes, and 
> each node needs to run an external program. This external program was 
> written for MPI, although I want to run it in serial as part of the process 
> on each node.
> 
> Is there any way at all to launch this external MPI program so it's treated 
> simply as a (serial) extension of the current node's processes? As I say, the 
> originating MPI program simply waits for the external program to finish 
> before continuing, so it's essentially a bit like a "subroutine" in that 
> sense.
> 
> I'm having problems launching this external program from within my MPI 
> program under Open MPI, even without worrying about node 
> assignment, and I think this might be because the system detects that I'm 
> trying to launch another process from one of the nodes, and stops it. I'm 
> guessing here, but it simply stops with an error saying the MPI process was 
> stopped.
> 
> Any help is very much appreciated; I have been going at this for more than a 
> day now and don't seem to be getting anywhere.
> 
> Thank you!
> 
> From: r...@open-mpi.org
> Date: Wed, 3 Mar 2010 07:24:32 -0700
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] running external program on same processor (Fortran)
> 
> It also would have been really helpful to know that you were using MVAPICH 
> and -not- Open MPI as this mailing list is for the latter. We could have 
> directed you to the appropriate place if we had known.
> 
> 
> On Mar 3, 2010, at 5:17 AM, abc def wrote:
> 
> I don't know (I'm a little new to this area), but I figured out how to get 
> around the problem:
> 
> Using SGE and MVAPICH2, the "-env MV2_CPU_MAPPING 0:1....." option in mpiexec 
> seems to do the trick.
> 
> So when calling the external program with mpiexec, I map the called process 
> to the current core rank, and it seems to stay distributed and separated as I 
> want.
> 
> Hope someone else finds this useful in the future.
> 
> > Date: Wed, 3 Mar 2010 13:13:01 +1100
> > Subject: Re: [OMPI users] running external program on same processor (Fortran)
> > 
> > Surely this is the problem of the scheduler that your system uses,
> > rather than MPI?
> > 
> > 
> > On Wed, 2010-03-03 at 00:48 +0000, abc def wrote:
> > > Hello,
> > > 
> > > I wonder if someone can help.
> > > 
> > > The situation is that I have an MPI-parallel fortran program. I run it
> > > and it's distributed on N cores, and each of these processes must call
> > > an external program.
> > > 
> > > > This external program is also an MPI program; however, I want to run it
> > > in serial, on the core that is calling it, as if it were part of the
> > > fortran program. The fortran program waits until the external program
> > > has completed, and then continues.
> > > 
> > > The problem is that this external program seems to run on any core,
> > > and not necessarily the (now idle) core that called it. This slows
> > > things down a lot as you get one core doing multiple tasks.
> > > 
> > > Can anyone tell me how I can call the program and ensure it runs only
> > > on the core that's calling it? Note that there are several cores per
> > > node. I can ID the node by running the hostname command (I don't know
> > > a way to do this for individual cores).
> > > 
> > > Thanks!
> > > 
> > > ====
> > > 
> > > Extra information that might be helpful:
> > > 
> > > > If I simply run the external program from the command line (i.e., type
> > > "/path/myprogram.ex <enter>"), it runs fine. If I run it within the
> > > fortran program by calling it via
> > > 
> > > CALL SYSTEM("/path/myprogram.ex")
> > > 
> > > it doesn't run at all (doesn't even start) and everything crashes. I
> > > don't know why this is.
> > > 
> > > If I call it using mpiexec:
> > > 
> > > CALL SYSTEM("mpiexec -n 1 /path/myprogram.ex")
> > > 
> > > then it does work, but I get the problem that it can go on any core. 
> > > 
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > 