Re: [OMPI users] Spawn problem

2008-03-31 Thread Joao Vicente Lima
Hi again,
when I call MPI_Init_thread in the same program, the error is:

spawning ...
opal_mutex_lock(): Resource deadlock avoided
[localhost:07566] *** Process received signal ***
[localhost:07566] Signal: Aborted (6)
[localhost:07566] Signal code:  (-6)
[localhost:07566] [ 0] /lib/libpthread.so.0 [0x2abe5630ded0]
[localhost:07566] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2abe5654c3c5]
[localhost:07566] [ 2] /lib/libc.so.6(abort+0x10e) [0x2abe5654d73e]
[localhost:07566] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe5528063b]
[localhost:07566] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280559]
[localhost:07566] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe552805e8]
[localhost:07566] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280fff]
[localhost:07566] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280f3d]
[localhost:07566] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55281f59]
[localhost:07566] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2abe552823cd]
[localhost:07566] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2abe58efb5f7]
[localhost:07566] [11] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(MPI_Comm_spawn+0x465) [0x2abe552b55cd]
[localhost:07566] [12] ./spawn1(main+0x9d) [0x400b05]
[localhost:07566] [13] /lib/libc.so.6(__libc_start_main+0xf4) [0x2abe56539b74]
[localhost:07566] [14] ./spawn1 [0x4009d9]
[localhost:07566] *** End of error message ***
opal_mutex_lock(): Resource deadlock avoided
[localhost:07567] *** Process received signal ***
[localhost:07567] Signal: Aborted (6)
[localhost:07567] Signal code:  (-6)
[localhost:07567] [ 0] /lib/libpthread.so.0 [0x2b48610f9ed0]
[localhost:07567] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2b48613383c5]
[localhost:07567] [ 2] /lib/libc.so.6(abort+0x10e) [0x2b486133973e]
[localhost:07567] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c63b]
[localhost:07567] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c559]
[localhost:07567] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c5e8]
[localhost:07567] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cfff]
[localhost:07567] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cf3d]
[localhost:07567] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006df59]
[localhost:07567] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2b486006e3cd]
[localhost:07567] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce75f7]
[localhost:07567] [11] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce9c2b]
[localhost:07567] [12] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b48600720d7]
[localhost:07567] [13] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Init_thread+0x166) [0x2b48600ae4f2]
[localhost:07567] [14] ./spawn1(main+0x2c) [0x400a94]
[localhost:07567] [15] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b4861325b74]
[localhost:07567] [16] ./spawn1 [0x4009d9]
[localhost:07567] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 7566 on node localhost exited on signal 6 (Aborted).
--
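
For reference, the only change relative to the earlier spawn1.c is the
initialization call (a sketch; the actual file is in the earlier attachment):

  int provided;

  /* request MPI_THREAD_MULTIPLE instead of plain MPI_Init;
     the spawn loop that follows is unchanged */
  MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);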

thanks for taking a look,
Joao.

On Mon, Mar 31, 2008 at 11:49 AM, Joao Vicente Lima
 wrote:
> Indeed, MPI_Finalize is crashing, and calling MPI_Comm_{free,disconnect} works!
>  I don't know whether the free/disconnect must appear before MPI_Finalize
>  in this case (spawned processes). Any suggestions?
>
>  I use loops around spawn:
>  - first for testing :)
>  - and second because certain MPI applications don't know in advance
>  the number of children needed to complete their work.
>
>  The spawn support works great ... I will run some other tests.
>
>  thanks,
>  Joao
>
>
>
>  On Mon, Mar 31, 2008 at 3:03 AM, Matt Hughes
>   wrote:
>  > On 30/03/2008, Joao Vicente Lima  wrote:
>  >  > Hi,
>  >  sorry to bring this up again ... but I hope to use spawn in ompi someday :-D
>  >
>  >  I believe it's crashing in MPI_Finalize because you have not closed
>  >  all communication paths between the parent and the child processes.
>  >  For the parent process, try calling MPI_Comm_free or
>  >  MPI_Comm_disconnect on each intercomm in your intercomm array before
>  >  calling finalize.  On the child, call free or disconnect on the parent
>  >  intercomm before calling finalize.
>  >
>  >  Out of curiosity, why a loop of spawns?  Why not increase the value of
>  >  the maxprocs argument, or, if you need to spawn different executables
>  >  or use different arguments for each instance, why not
>  >  MPI_Comm_spawn_multiple?
>  >
>  >  mch
>  >
>  >
>  >
>  >
>  >
>  >  >
>  >  >  The execution of spawn in this way works fine

Re: [OMPI users] Spawn problem

2008-03-31 Thread Joao Vicente Lima
Indeed, MPI_Finalize is crashing, and calling MPI_Comm_{free,disconnect} works!
I don't know whether the free/disconnect must appear before MPI_Finalize
in this case (spawned processes). Any suggestions?
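
A minimal sketch of that ordering (just the pattern, not the actual spawn1.c
attachment: one executable acting as both parent and child):

#include "mpi.h"

int main (int argc, char **argv)
{
  MPI_Comm parent, intercomm[2];
  int i;

  MPI_Init (&argc, &argv);
  MPI_Comm_get_parent (&parent);

  if (parent == MPI_COMM_NULL) {
    /* parent: spawn the children one by one */
    for (i = 0; i < 2; i++)
      MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                      0, MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);

    /* close every connection to the children before finalizing */
    for (i = 0; i < 2; i++)
      MPI_Comm_disconnect (&intercomm[i]);   /* or MPI_Comm_free */
  } else {
    /* child: close the connection to the parent before finalizing */
    MPI_Comm_disconnect (&parent);
  }

  MPI_Finalize ();
  return 0;
}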

I use loops around spawn:
- first for testing :)
- and second because certain MPI applications don't know in advance
the number of children needed to complete their work.

The spawn support works great ... I will run some other tests.

thanks,
Joao

On Mon, Mar 31, 2008 at 3:03 AM, Matt Hughes
 wrote:
> On 30/03/2008, Joao Vicente Lima  wrote:
>  > Hi,
>  >  sorry to bring this up again ... but I hope to use spawn in ompi someday :-D
>
>  I believe it's crashing in MPI_Finalize because you have not closed
>  all communication paths between the parent and the child processes.
>  For the parent process, try calling MPI_Comm_free or
>  MPI_Comm_disconnect on each intercomm in your intercomm array before
>  calling finalize.  On the child, call free or disconnect on the parent
>  intercomm before calling finalize.
>
>  Out of curiosity, why a loop of spawns?  Why not increase the value of
>  the maxprocs argument, or, if you need to spawn different executables
>  or use different arguments for each instance, why not
>  MPI_Comm_spawn_multiple?
>
>  mch
>
>
>
>
>
>  >
>  >  The execution of spawn in this way works fine:
>  >  MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
>  >  MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
>  >
>  >  but if this code goes into a for loop, I get a problem:
>  >  for (i= 0; i < 2; i++)
>  >  {
>  >   MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1,
>  >   MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
>  >  }
>  >
>  >  and the error is:
>  >  spawning ...
>  >  child!
>  >  child!
>  >  [localhost:03892] *** Process received signal ***
>  >  [localhost:03892] Signal: Segmentation fault (11)
>  >  [localhost:03892] Signal code: Address not mapped (1)
>  >  [localhost:03892] Failing at address: 0xc8
>  >  [localhost:03892] [ 0] /lib/libpthread.so.0 [0x2ac71ca8bed0]
>  >  [localhost:03892] [ 1]
>  >  /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_dpm_base_dyn_finalize+0xa3)
>  >  [0x2ac71ba7448c]
>  >  [localhost:03892] [ 2] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71b9decdf]
>  >  [localhost:03892] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71ba04765]
>  >  [localhost:03892] [ 4]
>  >  /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Finalize+0x71)
>  >  [0x2ac71ba365c9]
>  >  [localhost:03892] [ 5] ./spawn1(main+0xaa) [0x400ac2]
>  >  [localhost:03892] [ 6] /lib/libc.so.6(__libc_start_main+0xf4) [0x2ac71ccb7b74]
>  >  [localhost:03892] [ 7] ./spawn1 [0x400989]
>  >  [localhost:03892] *** End of error message ***
>  >  --
>  >  mpirun noticed that process rank 0 with PID 3892 on node localhost
>  >  exited on signal 11 (Segmentation fault).
>  >  --
>  >
>  >  the attachments contain the ompi_info, config.log and program.
>  >
>  >  thanks for taking a look,
>  >
>  > Joao.
>  >
>
>
>
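
Regarding Matt's MPI_Comm_spawn_multiple suggestion quoted above, here is a
minimal sketch for the case where the executables or their arguments differ
per instance (the worker names are made up for illustration):

#include "mpi.h"

int main (int argc, char **argv)
{
  /* hypothetical worker names; the point is one collective call
     instead of a loop of MPI_Comm_spawn */
  char *cmds[2]     = { "./worker_a", "./worker_b" };
  int maxprocs[2]   = { 1, 1 };
  MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };
  MPI_Comm intercomm;

  MPI_Init (&argc, &argv);

  MPI_Comm_spawn_multiple (2, cmds, MPI_ARGVS_NULL, maxprocs, infos,
                           0, MPI_COMM_SELF, &intercomm,
                           MPI_ERRCODES_IGNORE);

  /* one intercomm covers all spawned processes; disconnect it before finalizing */
  MPI_Comm_disconnect (&intercomm);
  MPI_Finalize ();
  return 0;
}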


[OMPI users] Spawn problem

2008-03-31 Thread Joao Vicente Lima
Hi,
sorry to bring this up again ... but I hope to use spawn in ompi someday :-D

The execution of spawn in this way works fine:
MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

but if this code goes into a for loop, I get a problem:
for (i= 0; i < 2; i++)
{
  MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1,
  MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
}

and the error is:
spawning ...
child!
child!
[localhost:03892] *** Process received signal ***
[localhost:03892] Signal: Segmentation fault (11)
[localhost:03892] Signal code: Address not mapped (1)
[localhost:03892] Failing at address: 0xc8
[localhost:03892] [ 0] /lib/libpthread.so.0 [0x2ac71ca8bed0]
[localhost:03892] [ 1]
/usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_dpm_base_dyn_finalize+0xa3)
[0x2ac71ba7448c]
[localhost:03892] [ 2] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71b9decdf]
[localhost:03892] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71ba04765]
[localhost:03892] [ 4]
/usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Finalize+0x71)
[0x2ac71ba365c9]
[localhost:03892] [ 5] ./spawn1(main+0xaa) [0x400ac2]
[localhost:03892] [ 6] /lib/libc.so.6(__libc_start_main+0xf4) [0x2ac71ccb7b74]
[localhost:03892] [ 7] ./spawn1 [0x400989]
[localhost:03892] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 3892 on node localhost
exited on signal 11 (Segmentation fault).
--

the attachments contain the ompi_info output, config.log, and the program.

thanks for taking a look,
Joao.


config.log.gz
Description: GNU Zip compressed data


ompi_info.txt.gz
Description: GNU Zip compressed data


spawn1.c.gz
Description: GNU Zip compressed data


[OMPI users] MPI_Comm_spawn errors

2008-02-18 Thread Joao Vicente Lima
Hi all,
I'm getting errors with spawn in the following situations:

1) spawn1.c - spawning 2 processes on localhost, one by one; the error is:

spawning ...
[localhost:31390] *** Process received signal ***
[localhost:31390] Signal: Segmentation fault (11)
[localhost:31390] Signal code: Address not mapped (1)
[localhost:31390] Failing at address: 0x98
[localhost:31390] [ 0] /lib/libpthread.so.0 [0x2b1d38a17ed0]
[localhost:31390] [ 1]
/usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_dyn_finalize+0xd2)
[0x2b1d37667cb2]
[localhost:31390] [ 2]
/usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_finalize+0x3b)
[0x2b1d3766358b]
[localhost:31390] [ 3]
/usr/local/mpi/openmpi-svn/lib/libmpi.so.0(ompi_mpi_finalize+0x248)
[0x2b1d37679598]
[localhost:31390] [ 4] ./spawn1(main+0xac) [0x400ac4]
[localhost:31390] [ 5] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b1d38c43b74]
[localhost:31390] [ 6] ./spawn1 [0x400989]
[localhost:31390] *** End of error message ***
--
mpirun has exited due to process rank 0 with PID 31390 on
node localhost calling "abort". This will have caused other processes
in the application to be terminated by signals sent by mpirun
(as reported here).
--

With 1 process spawned, or with 2 processes spawned in one call, there is
no output from the child.

2) spawn2.c - no response; the init call is
 MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &required)

the attachments contain the programs, the ompi_info output, and config.log.

Any suggestions?

thanks a lot.
Joao.


spawn1.c.gz
Description: GNU Zip compressed data


spawn2.c.gz
Description: GNU Zip compressed data


ompi_info.txt.gz
Description: GNU Zip compressed data


config.log.gz
Description: GNU Zip compressed data


[OMPI users] init_thread + spawn error

2007-10-01 Thread Joao Vicente Lima
Hi all!
I'm getting an error when calling MPI_Init_thread and MPI_Comm_spawn.
Am I doing something wrong?
the attachments contain my ompi_info output and source ...

thanks!
Joao


  char *arg[]= {"spawn1", (char *)0};

  MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  MPI_Comm_spawn ("./spawn_slave", arg, 1,
  MPI_INFO_NULL, 0, MPI_COMM_SELF, &slave,
  MPI_ERRCODES_IGNORE);
.

and the error:

opal_mutex_lock(): Resource deadlock avoided
[c8:13335] *** Process received signal ***
[c8:13335] Signal: Aborted (6)
[c8:13335] Signal code:  (-6)
[c8:13335] [ 0] [0xb7fbf440]
[c8:13335] [ 1] /lib/libc.so.6(abort+0x101) [0xb7abd5b1]
[c8:13335] [ 2] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e2933c]
[c8:13335] [ 3] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e2923a]
[c8:13335] [ 4] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e292e3]
[c8:13335] [ 5] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e29fa7]
[c8:13335] [ 6] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e29eda]
[c8:13335] [ 7] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0 [0xb7e2adec]
[c8:13335] [ 8] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x181) [0xb7e2b142]
[c8:13335] [ 9] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0(ompi_comm_connect_accept+0x57c) [0xb7e0fb70]
[c8:13335] [10] /usr/local/openmpi/openmpi-svn/lib/libmpi.so.0(PMPI_Comm_spawn+0x395) [0xb7e5e285]
[c8:13335] [11] ./spawn(main+0x7f) [0x80486ef]
[c8:13335] [12] /lib/libc.so.6(__libc_start_main+0xdc) [0xb7aa7ebc]
[c8:13335] [13] ./spawn [0x80485e1]
[c8:13335] *** End of error message ***
--
mpirun has exited due to process rank 0 with PID 13335 on
node c8 calling "abort". This will have caused other processes
in the application to be terminated by signals sent by mpirun
(as reported here).
--

#include "mpi.h"
#include 

int main (int argc, char **argv)
{
  int provided;
  MPI_Comm slave;
  char *arg[]= {"spawn1", (char *)0};

  MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  MPI_Comm_spawn ("./spawn_slave", arg, 1, 
  MPI_INFO_NULL, 0, MPI_COMM_SELF, &slave,
  MPI_ERRCODES_IGNORE);

  MPI_Finalize ();
  return 0;
}
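
The slave executable is not shown above; a minimal ./spawn_slave consistent
with this master (a sketch, not the actual file) would be:

#include "mpi.h"

int main (int argc, char **argv)
{
  MPI_Comm parent;

  MPI_Init (&argc, &argv);

  /* a spawned process reaches its parent through the parent intercomm */
  MPI_Comm_get_parent (&parent);
  if (parent != MPI_COMM_NULL)
    MPI_Comm_disconnect (&parent);   /* per the 2008-03-31 discussion above */

  MPI_Finalize ();
  return 0;
}
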
Open MPI: 1.3a1r16236
   Open MPI SVN revision: r16236
Open RTE: 1.3a1r16236
   Open RTE SVN revision: r16236
OPAL: 1.3a1r16236
   OPAL SVN revision: r16236
  Prefix: /usr/local/openmpi/openmpi-svn
 Configured architecture: i686-pc-linux-gnu
  Configure host: corisco
   Configured by: lima
   Configured on: Wed Sep 26 11:37:04 BRT 2007
  Configure host: corisco
Built by: lima
Built on: Wed Sep 26 12:07:13 BRT 2007
  Built host: corisco
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: yes (all)
  Fortran90 bindings: no
 Fortran90 bindings size: na
  C compiler: gcc
 C compiler absolute: /usr/bin/gcc
C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
  Fortran77 compiler: g77
  Fortran77 compiler abs: /usr/bin/g77
  Fortran90 compiler: none
  Fortran90 compiler abs: none
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: yes
 Fortran90 profiling: no
  C++ exceptions: no
  Thread support: posix (mpi: yes, progress: no)
   Sparse Groups: no
  Internal debug support: yes
 MPI parameter check: runtime
Memory profiling support: yes
Memory debugging support: yes
 libltdl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
 MPI I/O support: yes
   MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.3)
  MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.3)
   MCA paffinity: linux (MCA v1.0, API v1.1, Component v1.3)
   MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.3)
   MCA timer: linux (MCA v1.0, API v1.0, Component v1.3)
 MCA installdirs: env (MCA v1.0, API v1.0, Component v1.3)
 MCA installdirs: config (MCA v1.0, API v1.0, Component v1.3)
   MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.3)
MCA coll: inter (MCA v1.0, API v1.0, Component v1.3)
MCA coll: self (MCA v1.0, API v1.0, Component v1.3)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.3)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.3)
  MCA io: romio (MCA v1.0, API v1.0, Component v1.3)
   MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.3)
   MCA mpool: sm (MCA v1.0, API v1.0, Component v1.3)