Open MPI currently does not fully support a proper disconnection of parent and child processes. Thus, if a child dies or aborts, the parents will abort as well, despite calling MPI_Comm_disconnect. (The new RTE will have better support for these operations; Ralph or Jeff can probably give a better estimate of when this will be available.)

What should not happen, however, is that the parent goes down when the child calls MPI_Finalize (that is, a proper shutdown rather than a violent death). Let me check that as well...

Brignone, Sergio wrote:

Hi everybody,

I am trying to run a master/slave set.

Because of the nature of the problem, I need to start and stop (kill) some slaves.

The problem is that as soon as one of the slaves dies, the master dies too.

This is what I am doing:

MASTER:

    MPI_Init(...);
    MPI_Comm_spawn(slave1,...,nslave1,...,intercomm1);
    MPI_Barrier(intercomm1);
    MPI_Comm_disconnect(&intercomm1);
    MPI_Comm_spawn(slave2,...,nslave2,...,intercomm2);
    MPI_Barrier(intercomm2);
    MPI_Comm_disconnect(&intercomm2);
    MPI_Finalize();

SLAVE:

    MPI_Init(...);
    MPI_Comm_get_parent(&intercomm);
    (does something)
    MPI_Barrier(intercomm);
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();

The issue is that as soon as the first set of slaves calls MPI_Finalize, the master dies as well (it dies right after MPI_Comm_disconnect(&intercomm1)).

What am I doing wrong?

Thanks

Sergio

