I ran a simple spawn test - you can find it in the OMPI code at orte/test/mpi/simple_spawn.c - and it worked fine:

$ mpirun -n 2 ./simple_spawn
[1858076673:0 pid 19909] starting up on node Ralphs-iMac-2.local!
[1858076673:1 pid 19910] starting up on node Ralphs-iMac-2.local!
1 completed MPI_Init
Parent [pid 19910] about to spawn!
0 completed MPI_Init
Parent [pid 19909] about to spawn!
[1858076674:0 pid 19911] starting up on node Ralphs-iMac-2.local!
[1858076674:1 pid 19912] starting up on node Ralphs-iMac-2.local!
[1858076674:2 pid 19913] starting up on node Ralphs-iMac-2.local!
Parent done with spawn
Parent sending message to child
Parent done with spawn
2 completed MPI_Init
Hello from the child 2 of 3 on host Ralphs-iMac-2.local pid 19913
1 completed MPI_Init
Hello from the child 1 of 3 on host Ralphs-iMac-2.local pid 19912
0 completed MPI_Init
Hello from the child 0 of 3 on host Ralphs-iMac-2.local pid 19911
Child 0 received msg: 38
Parent disconnected
Parent disconnected
Child 0 disconnected
Child 1 disconnected
Child 2 disconnected
19910: exiting
19911: exiting
19912: exiting
19913: exiting
19909: exiting
$
I then ran our spawn_multiple test - again, you can find it at orte/test/mpi/spawn_multiple.c:

$ mpirun -n 2 ./spawn_multiple
Parent [pid 19946] about to spawn!
Parent [pid 19947] about to spawn!
Parent done with spawn
Parent sending message to children
Parent done with spawn
Hello from the child 1 of 2 on host Ralphs-iMac-2.local pid 19949: argv[1] = bar
Hello from the child 0 of 2 on host Ralphs-iMac-2.local pid 19948: argv[1] = foo
Child 0 received msg: 38
Child 1 received msg: 38
Parent disconnected
Child 1 disconnected
Child 0 disconnected
Parent disconnected
$

How did you configure OMPI, and how were you running your example?

> On Nov 28, 2018, at 9:33 AM, Kiker, Kathleen R <kathleen.r.ki...@lmco.com> wrote:
>
> Good Afternoon,
>
> I’m trying to diagnose an issue I’ve been having with MPI_Comm_Spawn. When I
> run the simple example program:
>
> #include "mpi.h"
> #include <stdio.h>
> #include <stdlib.h>
>
> int main( int argc, char *argv[] )
> {
>     int np[2] = { 1, 1 };
>     int errcodes[2];
>     MPI_Comm parentcomm, intercomm;
>     char *cmds[2] = { "spawn_example", "spawn_example" };
>     MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };
>
>     MPI_Init( &argc, &argv );
>     MPI_Comm_get_parent( &parentcomm );
>     if (parentcomm == MPI_COMM_NULL)
>     {
>         /* Create 2 more processes - this example must be called
>            spawn_example.exe for this to work. */
>         MPI_Comm_spawn_multiple( 2, cmds, MPI_ARGVS_NULL, np, infos, 0,
>                                  MPI_COMM_WORLD, &intercomm, errcodes );
>         printf("I'm the parent.\n");
>     }
>     else
>     {
>         printf("I'm the spawned.\n");
>     }
>     fflush(stdout);
>     MPI_Finalize();
>     return 0;
> }
>
> I get the output:
>
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   ompi_dpm_dyn_init() failed
>   --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
>
> I’m using OpenMPI 3.1.1. I know past versions (like 2.x) had a similar issue,
> but I believe those were fixed by this version. Is there something else that
> can cause this?
>
> Thank you,
> Kathleen
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users