This also appears to fix a bug I had reported that did not involve
collective calls.
The code is appended. When run on a 64-bit architecture with
iter.cary$ gcc --version
gcc (GCC) 4.4.0 20090506 (Red Hat 4.4.0-4)
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
iter.cary$ uname -a
Linux iter.txcorp.com 2.6.29.4-167.fc11.x86_64 #1 SMP Wed May 27
17:27:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
iter.cary$ mpicc -show
gcc -I/usr/local/openmpi-1.3.2-nodlopen/include -pthread
-L/usr/local/torque-2.4.0b1/lib -Wl,--rpath
-Wl,/usr/local/torque-2.4.0b1/lib
-Wl,-rpath,/usr/local/openmpi-1.3.2-nodlopen/lib
-L/usr/local/openmpi-1.3.2-nodlopen/lib -lmpi -lopen-rte -lopen-pal
-ltorque -ldl -lnsl -lutil -lm
as
mpirun -n 3 ompi1.3.3-bug
it hangs after some 100-500 iterations. When run as
mpirun -n 3 -mca btl ^sm ./ompi1.3.3-bug
or
mpirun -n 3 -mca btl_sm_num_fifos 5 ./ompi1.3.3-bug
it seems to work fine.
Valgrind points to some issues:
==29641== Syscall param sched_setaffinity(mask) points to unaddressable
byte(s)
==29641== at 0x30B5EDAA79: syscall (in /lib64/libc-2.10.1.so)
==29641== by 0x54B5098: opal_paffinity_linux_plpa_api_probe_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x54B7394: opal_paffinity_linux_plpa_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x54B5D39:
opal_paffinity_linux_plpa_have_topology_information (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x54B4F3F: linux_module_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x54B2D03: opal_paffinity_base_select (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x548C3D3: opal_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x520F09C: orte_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x4E67D26: ompi_mpi_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641== by 0x4E87195: PMPI_Init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641== by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)
==29641== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==29641== Warning: client syscall munmap tried to modify addresses
0xffffffffffffffff-0xffe
==29640== Warning: client syscall munmap tried to modify addresses
0xffffffffffffffff-0xffe
==29639== Warning: client syscall munmap tried to modify addresses
0xffffffffffffffff-0xffe
==29641==
==29641== Syscall param writev(vector[...]) points to uninitialised byte(s)
==29641== at 0x30B5ED67AB: writev (in /lib64/libc-2.10.1.so)
==29641== by 0x5241686: mca_oob_tcp_msg_send_handler (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x52426BC: mca_oob_tcp_peer_send (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x52450EC: mca_oob_tcp_send_nb (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x5255B33: orte_rml_oob_send_buffer (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x5230682: allgather (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x5230179: modex (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x4E68199: ompi_mpi_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641== by 0x4E87195: PMPI_Init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641== by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)
==29641== Address 0x5c89aef is 87 bytes inside a block of size 128 alloc'd
==29641== at 0x4A0763E: malloc (vg_replace_malloc.c:207)
==29641== by 0x548D76A: opal_dss_buffer_extend (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x548E780: opal_dss_pack (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-pal.so.0.0.0)
==29641== by 0x5230620: allgather (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x5230179: modex (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libopen-rte.so.0.0.0)
==29641== by 0x4E68199: ompi_mpi_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641== by 0x4E87195: PMPI_Init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29641== by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)
==29640== Conditional jump or move depends on uninitialised value(s)
==29640== at 0x4EF26A4: mca_mpool_sm_alloc (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x4E4BEEF: ompi_free_list_grow (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x4EA8793: mca_btl_sm_add_procs (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x4E9E6E9: mca_bml_r2_add_procs (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x4F0B564: mca_pml_ob1_add_procs (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x4E68288: ompi_mpi_init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x4E87195: PMPI_Init (in
/usr/local/openmpi-1.3.2-nodlopen/lib/libmpi.so.0.0.0)
==29640== by 0x408011: main (in /home/research/cary/ompi1.3.3-bug)
....John Cary
Vincent Loechner wrote:
It seems that the calls to collective communication are not
returning for some MPI processes when the number of processes is
greater than or equal to 5. It is reproducible, on two different
architectures, with two different versions of OpenMPI (1.3.2 and
1.3.3). It was working correctly with OpenMPI version 1.2.7.
Does it work if you turn off the shared memory transport layer;
that is,
mpirun -n 6 -mca btl ^sm ./testmpi
Yes it does, on both my configurations (AMD and Intel processor).
So it seems that the shared memory synchronization process is
broken.
Presumably that is this bug:
https://svn.open-mpi.org/trac/ompi/ticket/2043
Yes it is.
I also found by trial and error that increasing the number of fifos, e.g.,
-mca btl_sm_num_fifos 5
on a 6-processor job, apparently worked around the problem.
But yes, something seems broken in OpenMPI's shared memory transport with
gcc 4.4.x.
Yes, same for me: -mca btl_sm_num_fifos 5 worked.
Thanks for your answer, Jonathan.
If I can help the developers track down this bug in any way, please get
in contact with me.
iter.cary$ cat ompi1.3.3-bug.cxx
/**
* A simple test program to demonstrate a problem in OpenMPI 1.3
*/
// mpi includes
#include <mpi.h>
// std includes
#include <iostream>
#include <vector>
// useful hashdefine
#define ARRAY_SIZE 250
/**
* Main driver
*/
int main(int argc, char** argv) {
  // Initialize MPI
  MPI_Init(&argc, &argv);
  int rk, sz;
  MPI_Comm_rank(MPI_COMM_WORLD, &rk);
  MPI_Comm_size(MPI_COMM_WORLD, &sz);
  // Create some data to pass around
  std::vector<double> d(ARRAY_SIZE);
  // Initialize to some values if we aren't rank 0
  if ( rk )
    for ( unsigned i = 0; i < ARRAY_SIZE; ++i )
      d[i] = 2*i + 1;
  // Loop until this breaks
  unsigned t = 0;
  while ( 1 ) {
    MPI_Status s;
    if ( rk )
      MPI_Send( &d[0], d.size(), MPI_DOUBLE, 0, 3, MPI_COMM_WORLD );
    else
      for ( int i = 1; i < sz; ++i )
        MPI_Recv( &d[0], d.size(), MPI_DOUBLE, i, 3, MPI_COMM_WORLD, &s );
    MPI_Barrier(MPI_COMM_WORLD);
    std::cout << "Transmission " << ++t << " completed." << std::endl;
  }
  // Finalize MPI (never reached; the loop above runs until it hangs)
  MPI_Finalize();
}
--
Tech-X Corp., 5621 Arapahoe Ave, Suite A, Boulder CO 80303
c...@txcorp.com, p 303-448-0727, f 303-448-7756, NEW CELL 303-881-8572