Re: [OMPI users] collective communications broken on more than 4 cores

2009-10-29 Thread John R. Cary
This also appears to fix a bug I had reported that did not involve 
collective calls.

The code is appended.  When run on a 64-bit architecture with

iter.cary$ gcc --version
gcc (GCC) 4.4.0 20090506 (Red Hat 4.4.0-4)
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

iter.cary$ uname -a
Linux iter.txcorp.com 2.6.29.4-167.fc11.x86_64 #1 SMP Wed May 27 
17:27:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

iter.cary$ mpicc -show
gcc -I/usr/local/openmpi-1.3.2-nodlopen/include -pthread 
-L/usr/local/torque-2.4.0b1/lib -Wl,--rpath 
-Wl,/usr/local/torque-2.4.0b1/lib 
-Wl,-rpath,/usr/local/openmpi-1.3.2-nodlopen/lib 
-L/usr/local/openmpi-1.3.2-nodlopen/lib -lmpi -lopen-rte -lopen-pal 
-ltorque -ldl -lnsl -lutil -lm


as

 mpirun -n 3 ompi1.3.3-bug

it hangs after some 100-500 iterations.  When run as

 mpirun -n 3 -mca btl ^sm ./ompi1.3.3-bug

or

 mpirun -n 3 -mca btl_sm_num_fifos 5 ./ompi1.3.3-bug

it seems to work fine.
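
As an aside, the same MCA settings can also be supplied through the
environment rather than on the mpirun command line (assuming the usual
OMPI_MCA_ prefix that Open MPI uses for MCA parameters), for example:

 export OMPI_MCA_btl_sm_num_fifos=5
 mpirun -n 3 ./ompi1.3.3-bug

or, to disable the shared-memory BTL entirely,

 export OMPI_MCA_btl=^sm
 mpirun -n 3 ./ompi1.3.3-bug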

Valgrind points to some issues:

==29641== Syscall param sched_setaffinity(mask) points to unaddressable 
byte(s)

==29641==at 0x30B5EDAA79: syscall (in /lib64/libc-2.10.1.so)
==29641==by 0x54B5098: opal_paffinity_linux_plpa_api_probe_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x54B7394: opal_paffinity_linux_plpa_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x54B5D39: 
opal_paffinity_linux_plpa_have_topology_information (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x54B4F3F: linux_module_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x54B2D03: opal_paffinity_base_select (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x548C3D3: opal_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x520F09C: orte_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x4E67D26: ompi_mpi_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641==by 0x4E87195: PMPI_Init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)

==29641==by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)
==29641==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

==29641== Warning: client syscall munmap tried to modify addresses 
0x-0xffe
==29640== Warning: client syscall munmap tried to modify addresses 
0x-0xffe
==29639== Warning: client syscall munmap tried to modify addresses 
0x-0xffe

==29641==
==29641== Syscall param writev(vector[...]) points to uninitialised byte(s)
==29641==at 0x30B5ED67AB: writev (in /lib64/libc-2.10.1.so)
==29641==by 0x5241686: mca_oob_tcp_msg_send_handler (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x52426BC: mca_oob_tcp_peer_send (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x52450EC: mca_oob_tcp_send_nb (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x5255B33: orte_rml_oob_send_buffer (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x5230682: allgather (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x5230179: modex (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x4E68199: ompi_mpi_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641==by 0x4E87195: PMPI_Init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)

==29641==by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)
==29641==  Address 0x5c89aef is 87 bytes inside a block of size 128 alloc'd
==29641==at 0x4A0763E: malloc (vg_replace_malloc.c:207)
==29641==by 0x548D76A: opal_dss_buffer_extend (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x548E780: opal_dss_pack (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641==by 0x5230620: allgather (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x5230179: modex (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641==by 0x4E68199: ompi_mpi_init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641==by 0x4E87195: PMPI_Init (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)

==29641==by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)


==29640== Conditional jump or move depends on uninitialised value(s)
==29640==at 0x4EF26A4: mca_mpool_sm_alloc (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640==by 0x4E4BEEF: ompi_free_list_grow (in 
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640==by 0x4EA8793: mca_btl_sm_add_procs (in 

Re: [OMPI users] collective communications broken on more than 4 cores

2009-10-29 Thread Vincent Loechner

> >>> It seems that the calls to collective communication are not
> >>> returning for some MPI processes, when the number of processes is
> >>> greater than or equal to 5. It's reproducible, on two different
> >>> architectures, with two different versions of OpenMPI (1.3.2 and
> >>> 1.3.3). It was working correctly with OpenMPI version 1.2.7.
> >>
> >> Does it work if you turn off the shared memory transport layer;  
> >> that is,
> >>
> >> mpirun -n 6 -mca btl ^sm ./testmpi
> >
> > Yes it does, on both my configurations (AMD and Intel processor).
> > So it seems that the shared memory synchronization process is
> > broken.
> 
> Presumably that is this bug:
> https://svn.open-mpi.org/trac/ompi/ticket/2043

Yes it is.

> I also found by trial and error that increasing the number of fifos, e.g.
> -mca btl_sm_num_fifos 5
> on a 6-processor job, apparently worked around the problem.
> But yes, something seems broken in Open MPI shared memory transport with
> gcc 4.4.x.

Yes, same for me: -mca btl_sm_num_fifos 5 worked.
Thanks for your answer, Jonathan.

If I can help the developers in any way to track down this bug, please
get in contact with me.

--Vincent


Re: [OMPI users] collective communications broken on more than 4 cores

2009-10-29 Thread Jonathan Dursi

On 2009-10-29, at 10:21AM, Vincent Loechner wrote:

>>> It seems that the calls to collective communication are not
>>> returning for some MPI processes, when the number of processes is
>>> greater than or equal to 5. It's reproducible, on two different
>>> architectures, with two different versions of OpenMPI (1.3.2 and
>>> 1.3.3). It was working correctly with OpenMPI version 1.2.7.

>> Does it work if you turn off the shared memory transport layer;
>> that is,
>>
>> mpirun -n 6 -mca btl ^sm ./testmpi

> Yes it does, on both my configurations (AMD and Intel processor).
> So it seems that the shared memory synchronization process is
> broken.

Presumably that is this bug:
https://svn.open-mpi.org/trac/ompi/ticket/2043

I also found by trial and error that increasing the number of fifos, e.g.
-mca btl_sm_num_fifos 5
on a 6-processor job, apparently worked around the problem.
But yes, something seems broken in Open MPI shared memory transport with
gcc 4.4.x.


   Jonathan
--
Jonathan Dursi 






Re: [OMPI users] collective communications broken on more than 4 cores

2009-10-29 Thread Vincent Loechner

> > It seems that the calls to collective communication are not
> > returning for some MPI processes, when the number of processes is
> > greater than or equal to 5. It's reproducible, on two different
> > architectures, with two different versions of OpenMPI (1.3.2 and
> > 1.3.3). It was working correctly with OpenMPI version 1.2.7.
> 
> Does it work if you turn off the shared memory transport layer; that is,
> 
> mpirun -n 6 -mca btl ^sm ./testmpi

Yes it does, on both my configurations (AMD and Intel processor).
So it seems that the shared memory synchronization process is
broken.

It could be a system bug; I don't know what library OpenMPI uses
(is it IPC?). Both my systems run Linux 2.6.31: the AMD machine is
Ubuntu, and the Intel one is Arch Linux.

--Vincent


Re: [OMPI users] collective communications broken on more than 4 cores

2009-10-29 Thread Jonathan Dursi


On 2009-10-29, at 9:57AM, Vincent Loechner wrote:

> [...]
> It seems that the calls to collective communication are not
> returning for some MPI processes, when the number of processes is
> greater than or equal to 5. It's reproducible, on two different
> architectures, with two different versions of OpenMPI (1.3.2 and
> 1.3.3). It was working correctly with OpenMPI version 1.2.7.
>
> [...]
> GCC version :
> $ mpicc --version
> gcc (Ubuntu 4.4.1-4ubuntu7) 4.4.1



Does it work if you turn off the shared memory transport layer; that is,

mpirun -n 6 -mca btl ^sm ./testmpi

?

   - Jonathan
--
Jonathan Dursi 






[OMPI users] collective communications broken on more than 4 cores

2009-10-29 Thread Vincent Loechner

Hello to the list,

I ran into a problem running a simple program with collective
communications on a 6-core processor (6 local MPI processes).
It seems that the calls to collective communication are not
returning for some MPI processes, when the number of processes is
greater than or equal to 5. It's reproducible, on two different
architectures, with two different versions of OpenMPI (1.3.2 and
1.3.3). It was working correctly with OpenMPI version 1.2.7.


I just wrote a very simple test, making 1000 calls to MPI_Barrier().
Running on an Istanbul processor (6-core AMD Opteron):
$ uname -a
Linux istanbool 2.6.31-14-generic #46-Ubuntu SMP Tue Oct 13 16:47:28 UTC 2009 
x86_64 GNU/Linux
with an Ubuntu OpenMPI package, version 1.3.2.
Running with 5 or 6 MPI processes, it just hangs after a random
number of iterations, ranging from 3 to 600; sometimes it
finishes correctly (about 1 time out of 8). I just ran:
'mpirun -n 6 ./testmpi'
The behavior is the same with more MPI processes.

I tried the '--mca coll_basic_priority 50' option; the program then has
a better chance of finishing (about one time out of two), but it still
deadlocks the rest of the time after a random number of iterations.

Without setting the coll_basic_priority option, I ran a debugger, and
found out that some processes are blocked in:
#0  0x7f858f272f7a in opal_progress () from /usr/lib/libopen-pal.so.0
#1  0x7f858f7524f5 in ?? () from /usr/lib/libmpi.so.0
#2  0x7f8589e74c5a in ?? ()
   from /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
#3  0x7f8589e7cefa in ?? ()
   from /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
#4  0x7f858f767b32 in PMPI_Barrier () from /usr/lib/libmpi.so.0
#5  0x00400c10 in main (argc=1, argv=0x7fff9d59acf8) at testmpi.c:24

and the others in:
#0  0x7f05799e933a in ?? () from /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so
#1  0x7f057dd22fba in opal_progress () from /usr/lib/libopen-pal.so.0
#2  0x7f057e2024f5 in ?? () from /usr/lib/libmpi.so.0
#3  0x7f0578924c5a in ?? ()
   from /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
#4  0x7f057892cefa in ?? ()
   from /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
#5  0x7f057e217b32 in PMPI_Barrier () from /usr/lib/libmpi.so.0
#6  0x00400c10 in main (argc=1, argv=0x7fff1b55b4a8) at testmpi.c:24


It seems that other collective communications are broken as well; my
original program blocked after a call to MPI_Allreduce.
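
For illustration, a minimal MPI_Allreduce variant of the barrier test
appended below would look something like this (a hypothetical sketch,
not the original program that hung):

// allreduce_test.c --- (hypothetical sketch)
#include <stdio.h>
#include <mpi.h>
#define MCW MPI_COMM_WORLD

int main( int argc, char **argv )
{
    int n, r;   /* number of processes, process rank */
    int i;
    int local, sum;

    MPI_Init( &argc, &argv );
    MPI_Comm_size( MCW, &n );
    MPI_Comm_rank( MCW, &r );

    for( i=0 ; i<1000 ; i++ )
    {
        local = r + i;
        /* every rank must enter this call; a hang here would mirror
           the MPI_Barrier behavior described above */
        MPI_Allreduce( &local, &sum, 1, MPI_INT, MPI_SUM, MCW );
        printf( "(%d) %d: sum=%d\n", r, i, sum ); fflush(stdout);
    }

    MPI_Finalize();
    return( 0 );
}
// allreduce_test.c ---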

I also ran tests on a 4-core Intel Core i7 with OpenMPI version 1.3.3,
with exactly the same problem: calls to collective communication do not
return for some MPI processes when the number of processes is
greater than or equal to 5.

Below are some technical details on my configuration, the input file,
and example outputs. The output of ompi_info --all is attached to this
mail.

Best regards,
-- 
Vincent LOECHNER |0---0  |  ICPS, LSIIT (UMR 7005),
 PhD |   /|  /|  |  Equipe INRIA CAMUS,
 Phone: +33 (0)368 85 45 37  |  0---0 |  |  Université de Strasbourg
 Fax  : +33 (0)368 85 45 47  |  | 0-|-0  |  Pôle API, Bd. Sébastien Brant
 |  |/  |/   |  F-67412 ILLKIRCH Cedex
 loech...@unistra.fr |  0---0|  http://icps.u-strasbg.fr
--


Input program:
// testmpi.c ---
#include <stdio.h>
#include <mpi.h>
#define MCW MPI_COMM_WORLD

int main( int argc, char **argv )
{
    int n, r;   /* number of processes, process rank */
    int i;

    MPI_Init( &argc, &argv );
    MPI_Comm_size( MCW, &n );
    MPI_Comm_rank( MCW, &r );

    for( i=0 ; i<1000 ; i++ )
    {
        printf( "(%d) %d\n", r, i ); fflush(stdout);
        MPI_Barrier( MCW );
    }

    MPI_Finalize();
    return( 0 );
}
// testmpi.c ---

Compilation line:
$ mpicc -O2 -Wall -g testmpi.c -o testmpi

GCC version :
$ mpicc --version
gcc (Ubuntu 4.4.1-4ubuntu7) 4.4.1

OpenMPI version : 1.3.2
$ ompi_info -v ompi full
 Package: Open MPI buildd@crested Distribution
Open MPI: 1.3.2
   Open MPI SVN revision: r21054
   Open MPI release date: Apr 21, 2009
Open RTE: 1.3.2
   Open RTE SVN revision: r21054
   Open RTE release date: Apr 21, 2009
OPAL: 1.3.2
   OPAL SVN revision: r21054
   OPAL release date: Apr 21, 2009
Ident string: 1.3.2

--- example run (I hit ^C after a while)
$ mpirun  -n 6 ./testmpi
(0) 0
(0) 1
(0) 2
(0) 3
(1) 0
(1) 1
(1) 2
(2) 0
(2) 1
(2) 2
(2) 3
(3) 0
(3) 1
(3) 2
(4) 0
(4) 1
(4) 2
(4) 3
(5) 0
(5) 1
(5) 2
(5) 3
^Cmpirun: killing job...

--
mpirun noticed that process rank 0 with PID 10466 on node istanbool exited on 
signal 0 (Unknown signal 0).