> > It seems that the calls to collective communication are not
> > returning for some MPI processes, when the number of processes is
> > greater than or equal to 5. It's reproducible, on two different
> > architectures, with two different versions of OpenMPI (1.3.2 and
> > 1.3.3). It was working correctly with OpenMPI version 1.2.7.
> 
> Does it work if you turn off the shared memory transport layer; that is,
> 
> mpirun -n 6 -mca btl ^sm ./testmpi

Yes, it does, on both of my configurations (AMD and Intel processors).
So it seems that the shared memory synchronization process is
broken.

It could be a system bug; I don't know which mechanism OpenMPI uses
for shared memory (is it IPC?). Both my systems run Linux 2.6.31: the
AMD machine is Ubuntu, and the Intel machine is Arch Linux.
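
To compare the two cases without a hung run blocking the terminal, a small
wrapper along these lines might help. This is only a sketch: the helper names
and the 30-second timeout are hypothetical choices, and it assumes that
`mpirun` and the `./testmpi` binary from the command above are available.

```python
import subprocess


def finishes_in_time(cmd, timeout_s=30.0):
    """Return True if cmd exits within timeout_s seconds, False on timeout.

    The exit code is ignored; only "returned" versus "hung" is reported.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child (here: mpirun) on timeout.
        # MPI ranks it spawned may need to be cleaned up separately.
        return False


def compare_transports(nprocs=6, binary="./testmpi", timeout_s=30.0):
    """Run the test with the default BTLs and with sm excluded.

    "./testmpi" and nprocs=6 mirror the command quoted in this thread;
    adjust both for another setup.
    """
    base = ["mpirun", "-n", str(nprocs)]
    return {
        "default": finishes_in_time(base + [binary], timeout_s),
        "no_sm": finishes_in_time(base + ["-mca", "btl", "^sm", binary], timeout_s),
    }
```

Calling `compare_transports()` returns a dictionary with one boolean per
configuration. A result where "default" is False and "no_sm" is True would
match what I see here.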

--Vincent
