On May 28, 2014, at 7:50 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> Ralph,
>
> thanks for the info
>
>> can you detail your full mpirun command line, the number of servers you are
>> using, the btl involved and the ompi release that can be used to reproduce
>> the issue ?
>
> Running on only one server, using the current head of the svn repo. My
> cluster only has Ethernet, and I let it freely choose the BTLs (so I imagine
> the candidates are sm,self,tcp,vader). The cmd line is really trivial:
>
> is MPSS installed and loaded ?
> if yes, scif is also a candidate

Nope - not on this machine

>
> mpirun -n 1 ./loop_spawn
>
> I modified loop_spawn to only run 100 iterations as I am not patient enough
> to wait for 1000, and the number of iters isn't a factor so long as it is
> greater than 1. When the parent calls finalize, I get one of the following
> emitted for every iteration that was done:
>
> dpm_base_disconnect_init: error -12 in isend to process 0
>
> so we do the same thing but have different behaviour ...
>
> just to be sure :
> are we talking about the loop_spawn test from the ibm test suite available at
> http://svn.open-mpi.org/svn/ompi-tests/trunk/ibm/dynamic/loop_spawn.c
> and
> http://svn.open-mpi.org/svn/ompi-tests/trunk/ibm/dynamic/loop_child.c
>
> number of iterations is 2000 (and not 1000)
> MPI_Comm_disconnect is invoked by both parent in loop_spawn.c :
> MPI_Comm_free(&comm_merged);
> MPI_Comm_disconnect(&comm_spawned);
>
> and children in loop_child.c :
> MPI_Comm_free(&merged);
> MPI_Comm_disconnect(&parent);
>
> is there any possibility you are running a different test called loop_spawn
> or an older version of the dynamic/loop_spawn test from the ibm test suite ?

Yeah, I'm running a version that was the parent of that one. Looks like it has diverged, so perhaps that is the issue. Let me refresh it and try again.
>
> Cheers,
>
> Gilles