Re: [OMPI users] Spawn problem
Hi again, when I call MPI_Init_thread in the same program the error is:

spawning ...
opal_mutex_lock(): Resource deadlock avoided
[localhost:07566] *** Process received signal ***
[localhost:07566] Signal: Aborted (6)
[localhost:07566] Signal code: (-6)
[localhost:07566] [ 0] /lib/libpthread.so.0 [0x2abe5630ded0]
[localhost:07566] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2abe5654c3c5]
[localhost:07566] [ 2] /lib/libc.so.6(abort+0x10e) [0x2abe5654d73e]
[localhost:07566] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe5528063b]
[localhost:07566] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280559]
[localhost:07566] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe552805e8]
[localhost:07566] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280fff]
[localhost:07566] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280f3d]
[localhost:07566] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55281f59]
[localhost:07566] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2abe552823cd]
[localhost:07566] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2abe58efb5f7]
[localhost:07566] [11] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(MPI_Comm_spawn+0x465) [0x2abe552b55cd]
[localhost:07566] [12] ./spawn1(main+0x9d) [0x400b05]
[localhost:07566] [13] /lib/libc.so.6(__libc_start_main+0xf4) [0x2abe56539b74]
[localhost:07566] [14] ./spawn1 [0x4009d9]
[localhost:07566] *** End of error message ***
opal_mutex_lock(): Resource deadlock avoided
[localhost:07567] *** Process received signal ***
[localhost:07567] Signal: Aborted (6)
[localhost:07567] Signal code: (-6)
[localhost:07567] [ 0] /lib/libpthread.so.0 [0x2b48610f9ed0]
[localhost:07567] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2b48613383c5]
[localhost:07567] [ 2] /lib/libc.so.6(abort+0x10e) [0x2b486133973e]
[localhost:07567] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c63b]
[localhost:07567] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c559]
[localhost:07567] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c5e8]
[localhost:07567] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cfff]
[localhost:07567] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cf3d]
[localhost:07567] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006df59]
[localhost:07567] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2b486006e3cd]
[localhost:07567] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce75f7]
[localhost:07567] [11] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce9c2b]
[localhost:07567] [12] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b48600720d7]
[localhost:07567] [13] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Init_thread+0x166) [0x2b48600ae4f2]
[localhost:07567] [14] ./spawn1(main+0x2c) [0x400a94]
[localhost:07567] [15] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b4861325b74]
[localhost:07567] [16] ./spawn1 [0x4009d9]
[localhost:07567] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 7566 on node localhost exited on signal 6 (Aborted).
--

thanks for checking,
Joao.

On Mon, Mar 31, 2008 at 11:49 AM, Joao Vicente Lima wrote:
> Really, MPI_Finalize is crashing, and calling MPI_Comm_{free,disconnect} works!
> I don't know whether the free/disconnect must come before MPI_Finalize
> in this case (spawned processes). Any suggestions?
>
> I use loops in spawn:
> - first, for testing :)
> - and second, because certain MPI applications don't know in advance
>   the number of children needed to complete their work.
>
> Spawn works, that's great ... I will run other tests.
>
> thanks,
> Joao
>
> On Mon, Mar 31, 2008 at 3:03 AM, Matt Hughes wrote:
> > On 30/03/2008, Joao Vicente Lima wrote:
> > > Hi,
> > > sorry to bring this up again ... but I hope to use spawn in OMPI someday :-D
> >
> > I believe it's crashing in MPI_Finalize because you have not closed
> > all communication paths between the parent and the child processes.
> > For the parent process, try calling MPI_Comm_free or
> > MPI_Comm_disconnect on each intercomm in your intercomm array before
> > calling finalize. On the child, call free or disconnect on the parent
> > intercomm before calling finalize.
> >
> > Out of curiosity, why a loop of spawns? Why not increase the value of
> > the maxprocs argument? Or, if you need to spawn different executables
> > or use different arguments for each instance, why not
> > MPI_Comm_spawn_multiple?
> >
> > mch
> >
> > > The execution of spawn in this way works fine
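[Editor's note, not part of the original mail: since the aborts above happen once MPI_Init_thread is used, one thing worth ruling out is whether the requested thread level is actually granted. The ompi_info later in this thread reports "Thread support: posix (mpi: yes, progress: no)", but the level can still be downgraded at runtime. A minimal hedged check, assuming nothing beyond standard MPI calls:]

/* Sketch: verify the thread level actually granted by MPI_Init_thread
 * before relying on MPI_THREAD_MULTIPLE together with MPI_Comm_spawn. */
#include <stdio.h>
#include "mpi.h"

int main (int argc, char **argv)
{
  int provided;

  MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  /* The MPI standard orders the thread-level constants, so < is a valid test. */
  if (provided < MPI_THREAD_MULTIPLE)
    printf ("warning: asked for MPI_THREAD_MULTIPLE, got level %d\n", provided);

  /* ... MPI_Comm_spawn calls would go here ... */

  MPI_Finalize ();
  return 0;
}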
Re: [OMPI users] Spawn problem
Really, MPI_Finalize is crashing, and calling MPI_Comm_{free,disconnect} works!
I don't know whether the free/disconnect must come before MPI_Finalize in this
case (spawned processes). Any suggestions?

I use loops in spawn:
- first, for testing :)
- and second, because certain MPI applications don't know in advance the number
  of children needed to complete their work.

Spawn works, that's great ... I will run other tests.

thanks,
Joao

On Mon, Mar 31, 2008 at 3:03 AM, Matt Hughes wrote:
> On 30/03/2008, Joao Vicente Lima wrote:
> > Hi,
> > sorry to bring this up again ... but I hope to use spawn in OMPI someday :-D
>
> I believe it's crashing in MPI_Finalize because you have not closed
> all communication paths between the parent and the child processes.
> For the parent process, try calling MPI_Comm_free or
> MPI_Comm_disconnect on each intercomm in your intercomm array before
> calling finalize. On the child, call free or disconnect on the parent
> intercomm before calling finalize.
>
> Out of curiosity, why a loop of spawns? Why not increase the value of
> the maxprocs argument? Or, if you need to spawn different executables
> or use different arguments for each instance, why not
> MPI_Comm_spawn_multiple?
>
> mch
>
> > The execution of spawn in this way works fine:
> >
> > MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
> >                 MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
> >
> > but if this code goes into a for loop, I get a problem:
> >
> > for (i = 0; i < 2; i++)
> > {
> >   MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1,
> >                   MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
> > }
> >
> > and the error is:
> >
> > spawning ...
> > child!
> > child!
> > [localhost:03892] *** Process received signal ***
> > [localhost:03892] Signal: Segmentation fault (11)
> > [localhost:03892] Signal code: Address not mapped (1)
> > [localhost:03892] Failing at address: 0xc8
> > [localhost:03892] [ 0] /lib/libpthread.so.0 [0x2ac71ca8bed0]
> > [localhost:03892] [ 1] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_dpm_base_dyn_finalize+0xa3) [0x2ac71ba7448c]
> > [localhost:03892] [ 2] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71b9decdf]
> > [localhost:03892] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71ba04765]
> > [localhost:03892] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Finalize+0x71) [0x2ac71ba365c9]
> > [localhost:03892] [ 5] ./spawn1(main+0xaa) [0x400ac2]
> > [localhost:03892] [ 6] /lib/libc.so.6(__libc_start_main+0xf4) [0x2ac71ccb7b74]
> > [localhost:03892] [ 7] ./spawn1 [0x400989]
> > [localhost:03892] *** End of error message ***
> > --
> > mpirun noticed that process rank 0 with PID 3892 on node localhost
> > exited on signal 11 (Segmentation fault).
> > --
> >
> > the attachments contain the ompi_info, config.log and program.
> >
> > thanks for checking,
> > Joao.
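[Editor's note: to make the fix discussed above concrete, here is a minimal sketch, not taken from the original attachments, of the pattern Matt describes. The parent disconnects every intercomm returned by MPI_Comm_spawn before MPI_Finalize, and each child disconnects from its parent intercomm; NSPAWNS and the self-spawning "./spawn1" layout are assumptions for illustration.]

/* Sketch only: close all parent<->child paths before MPI_Finalize. */
#include <stdio.h>
#include "mpi.h"

#define NSPAWNS 2   /* illustrative count, not from the original code */

int main (int argc, char **argv)
{
  MPI_Comm parent;
  MPI_Init (&argc, &argv);
  MPI_Comm_get_parent (&parent);

  if (parent == MPI_COMM_NULL) {
    /* Parent: spawn one child at a time, keeping every intercomm. */
    MPI_Comm intercomm[NSPAWNS];
    int i;
    for (i = 0; i < NSPAWNS; i++)
      MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                      MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
    /* Disconnect every intercomm before finalizing. */
    for (i = 0; i < NSPAWNS; i++)
      MPI_Comm_disconnect (&intercomm[i]);
  } else {
    /* Child: drop the link to the parent before finalizing. */
    printf ("child!\n");
    MPI_Comm_disconnect (&parent);
  }

  MPI_Finalize ();
  return 0;
}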
[OMPI users] Spawn problem
Hi,
sorry to bring this up again ... but I hope to use spawn in OMPI someday :-D

The execution of spawn in this way works fine:

MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
                MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

but if this code goes into a for loop, I get a problem:

for (i = 0; i < 2; i++)
{
  MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1,
                  MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
}

and the error is:

spawning ...
child!
child!
[localhost:03892] *** Process received signal ***
[localhost:03892] Signal: Segmentation fault (11)
[localhost:03892] Signal code: Address not mapped (1)
[localhost:03892] Failing at address: 0xc8
[localhost:03892] [ 0] /lib/libpthread.so.0 [0x2ac71ca8bed0]
[localhost:03892] [ 1] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_dpm_base_dyn_finalize+0xa3) [0x2ac71ba7448c]
[localhost:03892] [ 2] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71b9decdf]
[localhost:03892] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71ba04765]
[localhost:03892] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Finalize+0x71) [0x2ac71ba365c9]
[localhost:03892] [ 5] ./spawn1(main+0xaa) [0x400ac2]
[localhost:03892] [ 6] /lib/libc.so.6(__libc_start_main+0xf4) [0x2ac71ccb7b74]
[localhost:03892] [ 7] ./spawn1 [0x400989]
[localhost:03892] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 3892 on node localhost
exited on signal 11 (Segmentation fault).
--

the attachments contain the ompi_info, config.log and program.

thanks for checking,

Joao.

Attachments: config.log.gz, ompi_info.txt.gz, spawn1.c.gz (GNU Zip compressed data)
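[Editor's note: for comparison with the loop above, here is a hedged sketch of the two alternatives Matt Hughes suggests elsewhere in this thread: raising maxprocs in a single MPI_Comm_spawn call, and MPI_Comm_spawn_multiple when the children differ. The child binaries "./spawn1" and "./spawn2" and the counts are illustrative, not taken from the attached spawn1.c.]

/* Sketch: alternatives to calling MPI_Comm_spawn in a loop.
 * "./spawn1" and "./spawn2" are assumed, pre-built child programs. */
#include "mpi.h"

int main (int argc, char **argv)
{
  MPI_Comm inter_a, inter_b;
  char *cmds[2]     = {"./spawn1", "./spawn2"};   /* hypothetical children */
  int maxprocs[2]   = {1, 1};
  MPI_Info infos[2] = {MPI_INFO_NULL, MPI_INFO_NULL};

  MPI_Init (&argc, &argv);

  /* (a) One spawn call with maxprocs = 2: both children share one intercomm,
   *     so there is only one communicator to disconnect later. */
  MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
                  MPI_COMM_SELF, &inter_a, MPI_ERRCODES_IGNORE);

  /* (b) Different executables (or argv) per instance, still a single call
   *     and a single intercomm: MPI_Comm_spawn_multiple. */
  MPI_Comm_spawn_multiple (2, cmds, MPI_ARGVS_NULL, maxprocs, infos, 0,
                           MPI_COMM_SELF, &inter_b, MPI_ERRCODES_IGNORE);

  /* Close the parent<->child paths before finalizing, as discussed above. */
  MPI_Comm_disconnect (&inter_a);
  MPI_Comm_disconnect (&inter_b);
  MPI_Finalize ();
  return 0;
}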
[OMPI users] MPI_Comm_spawn errors
Hi all,
I'm getting errors with spawn in the following situations:

1) spawn1.c - spawning 2 processes on localhost, one by one; the error is:

spawning ...
[localhost:31390] *** Process received signal ***
[localhost:31390] Signal: Segmentation fault (11)
[localhost:31390] Signal code: Address not mapped (1)
[localhost:31390] Failing at address: 0x98
[localhost:31390] [ 0] /lib/libpthread.so.0 [0x2b1d38a17ed0]
[localhost:31390] [ 1] /usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_dyn_finalize+0xd2) [0x2b1d37667cb2]
[localhost:31390] [ 2] /usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_finalize+0x3b) [0x2b1d3766358b]
[localhost:31390] [ 3] /usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_mpi_finalize+0x248) [0x2b1d37679598]
[localhost:31390] [ 4] ./spawn1(main+0xac) [0x400ac4]
[localhost:31390] [ 5] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b1d38c43b74]
[localhost:31390] [ 6] ./spawn1 [0x400989]
[localhost:31390] *** End of error message ***
--
mpirun has exited due to process rank 0 with PID 31390 on node localhost
calling "abort". This will have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

With 1 process spawned, or with 2 processes spawned in one call, there is no output from the child.

2) spawn2.c - no response; here the init is
MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &required)

The attachments contain the programs, ompi_info and config.log.

Any suggestions? Thanks a lot.

Joao.

Attachments: spawn1.c.gz, spawn2.c.gz, ompi_info.txt.gz, config.log.gz (GNU Zip compressed data)
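[Editor's note: the attached spawn1.c is not reproduced in the archive, so purely as a hedged illustration of "no output from child", a child side that should produce visible output typically looks like the sketch below. Flushing stdout may help, since child output is forwarded through the runtime and unflushed output can be lost if the job aborts; the rank reporting is an assumption, not the original code.]

/* Hypothetical child side: report rank, flush, and disconnect from the parent. */
#include <stdio.h>
#include "mpi.h"

int main (int argc, char **argv)
{
  int rank;
  MPI_Comm parent;

  MPI_Init (&argc, &argv);
  MPI_Comm_get_parent (&parent);

  if (parent != MPI_COMM_NULL) {
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);
    printf ("child %d!\n", rank);
    fflush (stdout);               /* push output to the forwarding runtime now */
    MPI_Comm_disconnect (&parent); /* close the path back to the parent */
  }

  MPI_Finalize ();
  return 0;
}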
[OMPI users] init_thread + spawn error
Hi all!
I'm getting an error when calling MPI_Init_thread and MPI_Comm_spawn. Am I doing something wrong?
The attachments contain my ompi_info and source ... thanks!
Joao

char *arg[] = {"spawn1", (char *)0};

MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
MPI_Comm_spawn ("./spawn_slave", arg, 1, MPI_INFO_NULL, 0,
                MPI_COMM_SELF, &slave, MPI_ERRCODES_IGNORE);
...

and the error:

opal_mutex_lock(): Resource deadlock avoided
[c8:13335] *** Process received signal ***
[c8:13335] Signal: Aborted (6)
[c8:13335] Signal code: (-6)
[c8:13335] [ 0] [0xb7fbf440]
[c8:13335] [ 1] /lib/libc.so.6(abort+0x101) [0xb7abd5b1]
[c8:13335] [ 2] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e2933c]
[c8:13335] [ 3] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e2923a]
[c8:13335] [ 4] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e292e3]
[c8:13335] [ 5] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e29fa7]
[c8:13335] [ 6] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e29eda]
[c8:13335] [ 7] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e2adec]
[c8:13335] [ 8] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x181) [0xb7e2b142]
[c8:13335] [ 9] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_connect_accept+0x57c) [0xb7e0fb70]
[c8:13335] [10] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0(PMPI_Comm_spawn+0x395) [0xb7e5e285]
[c8:13335] [11] ./spawn(main+0x7f) [0x80486ef]
[c8:13335] [12] /lib/libc.so.6(__libc_start_main+0xdc) [0xb7aa7ebc]
[c8:13335] [13] ./spawn [0x80485e1]
[c8:13335] *** End of error message ***
--
mpirun has exited due to process rank 0 with PID 13335 on node c8
calling "abort". This will have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

#include "mpi.h"
#include <stdio.h>

int main (int argc, char **argv)
{
  int provided;
  MPI_Comm slave;
  char *arg[] = {"spawn1", (char *)0};

  MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  MPI_Comm_spawn ("./spawn_slave", arg, 1, MPI_INFO_NULL, 0,
                  MPI_COMM_SELF, &slave, MPI_ERRCODES_IGNORE);
  MPI_Finalize ();
  return 0;
}

Open MPI: 1.3a1r16236
Open MPI SVN revision: r16236
Open RTE: 1.3a1r16236
Open RTE SVN revision: r16236
OPAL: 1.3a1r16236
OPAL SVN revision: r16236
Prefix: /usr/local/openmpi/openmpi-svn
Configured architecture: i686-pc-linux-gnu
Configure host: corisco
Configured by: lima
Configured on: Wed Sep 26 11:37:04 BRT 2007
Configure host: corisco
Built by: lima
Built on: Wed Sep 26 12:07:13 BRT 2007
Built host: corisco
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: no
Fortran90 bindings size: na
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: g77
Fortran77 compiler abs: /usr/bin/g77
Fortran90 compiler: none
Fortran90 compiler abs: none
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: no
C++ exceptions: no
Thread support: posix (mpi: yes, progress: no)
Sparse Groups: no
Internal debug support: yes
MPI parameter check: runtime
Memory profiling support: yes
Memory debugging support: yes
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
MPI I/O support: yes
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.3)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.3)
MCA paffinity: linux (MCA v1.0, API v1.1, Component v1.3)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.3)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.3)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.3)
MCA installdirs: config (MCA v1.0, API v1.0, Component v1.3)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.3)
MCA coll: inter (MCA v1.0, API v1.0, Component v1.3)
MCA coll: self (MCA v1.0, API v1.0, Component v1.3)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.3)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.3)
MCA io: romio (MCA v1.0, API v1.0, Component v1.3)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.3)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.3)