On May 28, 2014, at 4:45 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> Jeff,
>
> On Wed, May 28, 2014 at 8:31 PM, Jeff Squyres (jsquyres)
>
> > To be totally clear: MPI says it is erroneous for only some (not all)
> > processes in a communicator to call MPI_COMM_FREE.  So if that's the real
> > problem, then the discussion about why the parent(s) is(are) trying to
> > contact the children is moot -- the test is erroneous, and erroneous
> > application behavior is undefined.
>
> This is definitely what happens: only some tasks call MPI_Comm_free()

Really? I don't see how that can happen in loop_spawn - every process is clearly calling comm_free. Or are you referring to the intercomm_create test?

> I will commit my changes and the initially reported issue is solved :-)
>
> about the "bonus points":
>
> v1.8 does not have this issue
>
> I dug into it and, bottom line, the parent (who did not call MPI_Comm_free
> unlike the children)

I see the parent doing it in every loop:

    MPI_Init(&argc, &argv);

    for (iter = 0; iter < 1000; ++iter) {
        MPI_Comm_spawn(EXE_TEST, NULL, 1, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &comm, &err);
        printf("parent: MPI_Comm_spawn #%d return : %d\n", iter, err);
        MPI_Intercomm_merge(comm, 0, &merged);
        MPI_Comm_rank(merged, &rank);
        MPI_Comm_size(merged, &size);
        printf("parent: MPI_Comm_spawn #%d rank %d, size %d\n",
               iter, rank, size);
        MPI_Comm_free(&merged);
    }

    MPI_Finalize();

I suspect that you are talking about intercomm_create, hence my confusion.

> calls ompi_dpm_base_dyn_finalize, which tries to isend to the already exited
> tasks.
>
> bottom line, in pml_ob1_sendreq.h line 450
>
> with v1.8
> mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 0
> nothing is sent but isend is reported successful
>
> with trunk
> mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 1
> and then try to send the message => BOUM
>
> I found various things that seem counterintuitive to me and will summarize
> all this tomorrow.
>
> Cheers,
>
> Gilles
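To check that I am reading the v1.8 vs. trunk difference correctly, here is a toy model in plain C. It is only a sketch and not Open MPI code: `toy_endpoint`, `toy_isend`, and the `TOY_*` status values are hypothetical names made up for illustration, and `eager_btl_count` merely stands in for what `mca_bml_base_btl_array_get_size(&endpoint->btl_eager)` returns. The real trunk failure is a crash, which the model represents as an error return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Open MPI. */
enum toy_status { TOY_SUCCESS = 0, TOY_ERR_UNREACH = -1 };

struct toy_endpoint {
    size_t eager_btl_count; /* stand-in for the size of endpoint->btl_eager */
    bool   peer_alive;      /* false once the spawned task has exited */
};

/*
 * Models the two behaviours described in the quoted text:
 *  - no eager BTL (v1.8 observation): nothing is sent, success is reported
 *  - at least one eager BTL (trunk observation): the send is attempted,
 *    and fails if the peer has already exited
 * On return, *sent_bytes holds how many bytes were actually handed off.
 */
enum toy_status toy_isend(const struct toy_endpoint *ep,
                          size_t len, size_t *sent_bytes)
{
    assert(ep != NULL && sent_bytes != NULL);
    *sent_bytes = 0;

    if (ep->eager_btl_count == 0) {
        return TOY_SUCCESS;      /* silently skipped, still "successful" */
    }
    if (!ep->peer_alive) {
        return TOY_ERR_UNREACH;  /* the real code blows up at this point */
    }
    *sent_bytes = len;
    return TOY_SUCCESS;
}
```

With an exited peer, an endpoint whose `eager_btl_count` is 0 reports success with zero bytes sent, while one whose `eager_btl_count` is 1 reports the failure. If that matches what you see, the v1.8 behaviour would be hiding the send to the exited tasks rather than avoiding it.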