Hangs with any np > 1.

However, I'm not sure if that's an issue with the test vs. the underlying implementation.

On Sep 18, 2013, at 7:40 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:

> Does it hang when you run with -np 4?
>
> Sent from my phone. No type good.
>
> On Sep 18, 2013, at 4:10 PM, "Ralph Castain" <r...@open-mpi.org> wrote:
>
>> Strange - it works fine for me on my Mac. However, I see one difference - I
>> only run it with np=1
>>
>> On Sep 18, 2013, at 2:22 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
>> wrote:
>>
>>> On Sep 18, 2013, at 9:33 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>
>>>> 1. sm doesn't work between spawned processes. So you must have another
>>>> network enabled.
>>>
>>> I know :-). I have tcp available as well (OMPI will abort if you only run
>>> with sm,self because the comm_spawn will fail with unreachable errors -- I
>>> just tested/proved this to myself).
>>>
>>>> 2. Don't use the test case attached to my email, I left an xterm based
>>>> spawn and the debugging. It can't work without xterm support. Instead try
>>>> using the test case from the trunk, the one committed by Ralph.
>>>
>>> I didn't see any "xterm" strings in there, but ok. :-) I ran with
>>> orte/test/mpi/intercomm_create.c, and that hangs for me as well:
>>>
>>> -----
>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>> ❯❯❯ mpirun -np 4 intercomm_create
>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 6]
>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 7]
>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 6]
>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 7]
>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>> [hang]
>>> -----
>>>
>>> Similarly, on my Mac, it hangs with no output:
>>>
>>> -----
>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>> ❯❯❯ mpirun -np 4 intercomm_create
>>> [hang]
>>> -----
>>>
>>>> George.
>>>>
>>>> On Sep 18, 2013, at 07:53 , "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
>>>> wrote:
>>>>
>>>>> George --
>>>>>
>>>>> When I build the SVN trunk (r29201) on 64 bit linux, your attached test
>>>>> case hangs:
>>>>>
>>>>> -----
>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>>>> ❯❯❯ mpirun -np 4 intercomm_create
>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 6]
>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 7]
>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, &inter) (0)
>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 6]
>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 7]
>>>>> [hang]
>>>>> -----
>>>>>
>>>>> On my Mac, it hangs without printing anything:
>>>>>
>>>>> -----
>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create
>>>>> ❯❯❯ mpirun -np 4 intercomm_create
>>>>> [hang]
>>>>> -----
>>>>>
>>>>>
>>>>> On Sep 18, 2013, at 1:48 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>>
>>>>>> Here is a quick (and definitively not the cleanest) patch that addresses
>>>>>> the MPI_Intercomm issue at the MPI level. It should be applied after
>>>>>> removal of 29166.
>>>>>>
>>>>>> I also added the corrected test case stressing the corner cases by doing
>>>>>> barriers at every inter-comm creation and doing a clean disconnect.
>>>>>
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquy...@cisco.com
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel