We expect to have much better support for the entire comm_spawn process
in the next incarnation of the RTE. I don't expect that to be included
in a release, however, until 1.1 (Jeff may be able to give you an
estimate for when that will happen).

Jeff et al. may be able to give you access to an early non-release
version sooner, if better comm_spawn support is a critical issue and you
don't mind being patient with the inevitable bugs in such versions.

Ralph

Edgar Gabriel wrote:

Open MPI currently does not fully support a proper disconnection of
parent and child processes. Thus, if a child dies or aborts, the parents
will abort as well, despite calling MPI_Comm_disconnect. (The new RTE
will have better support for these operations; Ralph or Jeff can probably
give a better estimate of when this will be available.)

However, what should not happen is that the parent goes down when the
child calls MPI_Finalize (so not a violent death but a proper shutdown).
Let me check that as well...

Brignone, Sergio wrote:

Hi everybody,

I am trying to run a master/slave set. Because of the nature of the
problem I need to start and stop (kill) some slaves. The problem is that
as soon as one of the slaves dies, the master dies also.

This is what I am doing:

MASTER:

    MPI_Init(...)
    MPI_Comm_spawn(slave1,...,nslave1,...,intercomm1);
    MPI_Barrier(intercomm1);
    MPI_Comm_disconnect(&intercomm1);
    MPI_Comm_spawn(slave2,...,nslave2,...,intercomm2);
    MPI_Barrier(intercomm2);
    MPI_Comm_disconnect(&intercomm2);
    MPI_Finalize();

SLAVE:

    MPI_Init(...)
    MPI_Comm_get_parent(&intercomm);
    (does something)
    MPI_Barrier(intercomm);
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();

The issue is that as soon as the first set of slaves calls MPI_Finalize,
the master dies also (it dies right after
MPI_Comm_disconnect(&intercomm1)).

What am I doing wrong?

Thanks,
Sergio

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users