Re: [OMPI users] MPI inside MPI (still)

2014-12-13 Thread Gilles Gouaillardet
Alex, are you calling MPI_Comm_disconnect in the 3 "master" tasks and with the same remote communicator? I also read the man page again, and MPI_Comm_disconnect does not ensure that the remote processes have finished or called MPI_Comm_disconnect, so it might not be what you need. George, c

Re: [OMPI users] MPI inside MPI (still)

2014-12-13 Thread George Bosilca
MPI_Comm_disconnect should be a local operation; there is no reason for it to deadlock. I looked at the code and everything is local with the exception of a call to PMIX.FENCE. Can you attach to your deadlocked processes and confirm that they are stopped in the pmix.fence? George. On Sat, Dec

Re: [OMPI users] MPI inside MPI (still)

2014-12-13 Thread Alex A. Schmidt
Hi, sorry, I was calling mpi_comm_disconnect on the group comm handle, not on the intercomm handle returned from the spawn call, as it should be. Well, calling the disconnect on the intercomm handle does halt the spawner side, but the wait is never completed since, as George points out, there is

Re: [OMPI users] MPI inside MPI (still)

2014-12-13 Thread Gilles Gouaillardet
George is right about the semantics. However, I am surprised it returns immediately... That should either work or hang, IMHO. The second point is no longer MPI related and is batch manager specific. You will likely find a submit parameter to make the command block until the job completes. Or you can w
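The "block until the job completes" idea can be sketched as a small wrapper that runs the submit command and waits for it to exit. This is only an illustration: the thread does not name the batch manager or the flag, so the helper name `submit_and_wait` and its `submit_cmd` argument are hypothetical, and the wrapper only blocks for the whole job if the scheduler's submit command itself is told to block.

```python
import subprocess
import sys


def submit_and_wait(submit_cmd, timeout=None):
    """Run a submit command and block until it exits.

    submit_cmd is a list such as ["qsub", <blocking flag>, "job.sh"]; the
    blocking flag is scheduler specific and is NOT given in the thread.
    Returns the exit code of the submit command.
    """
    # subprocess.run waits for the child process to terminate before returning.
    completed = subprocess.run(submit_cmd, timeout=timeout)
    return completed.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the real submit command on the command line, for example:
    #   python submit_wait.py qsub <blocking flag> job.sh
    sys.exit(submit_and_wait(sys.argv[1:]))
```

As an assumption to check against the site's documentation, Grid Engine's `qsub -sync y` and PBS's `qsub -W block=true` are flags of this kind. Without such a flag the submit command returns as soon as the job is queued, and so does the wrapper.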

Re: [OMPI users] MPI inside MPI (still)

2014-12-13 Thread Reuti
Hi, On 13.12.2014 at 02:43, Alex A. Schmidt wrote: > MPI_comm_disconnect seems to work, but not quite. > The call to it returns almost immediately while > the spawned processes keep piling up in the background > until they are all done... > > I think system('env -i qsub...') to launch the third part

Re: [OMPI users] MPI inside MPI (still)

2014-12-13 Thread George Bosilca
You have to call MPI_Comm_disconnect on both sides of the intercommunicator. On the spawner processes you should call it on the intercomm, while on the spawnees you should call it on the communicator returned by MPI_Comm_get_parent. George. > On Dec 12, 2014, at 20:43, Alex A. Schmidt wrote: > > Gilles, > > MPI_co
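A minimal sketch of the two-sided disconnect described above, written with the mpi4py bindings rather than the Fortran/C calls used in the thread (a swapped-in binding, chosen only for brevity). The function names, the `parent`/`child` command-line roles, and `N_CHILDREN` are hypothetical. As noted elsewhere in the thread, the disconnect does not by itself guarantee that the spawned processes have finished.

```python
import os
import sys

N_CHILDREN = 3  # hypothetical number of spawned processes


def run_parent():
    # mpi4py is imported here so the file itself loads without an MPI runtime.
    from mpi4py import MPI

    # Spawn returns the intercommunicator linking the spawner and the spawnees.
    intercomm = MPI.COMM_SELF.Spawn(
        sys.executable,
        args=[os.path.abspath(__file__), "child"],
        maxprocs=N_CHILDREN,
    )
    # ... exchange data with the children over intercomm here ...
    # Spawner side: disconnect the intercomm, not a group communicator.
    intercomm.Disconnect()


def run_child():
    from mpi4py import MPI

    # Spawnee side: the matching handle comes from MPI_Comm_get_parent.
    parent_comm = MPI.Comm.Get_parent()
    if parent_comm == MPI.COMM_NULL:
        raise RuntimeError("this process was not started by a spawn call")
    # ... do the child's work here ...
    parent_comm.Disconnect()


# The role is picked from the command line; nothing runs without an argument.
if len(sys.argv) > 1 and sys.argv[1] == "parent":
    run_parent()
elif len(sys.argv) > 1 and sys.argv[1] == "child":
    run_child()
```

Assuming mpi4py and an MPI launcher are installed, saving this as a file and starting it with `mpiexec -n 1 python <file> parent` would run the spawner, which in turn launches the children with the `child` role.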