Re: [OMPI users] MPI hangs on multiple nodes

2011-09-25 Thread Ole Nielsen
K. Gutierrez) Message: 1 Date: Mon, 19 Sep 2011 13:13:08 -0400 From: Gus Correa <g...@ldeo.columbia.edu> Subject: Re: [OMPI users] RE : MPI hangs on multiple nodes To: Open

Re: [OMPI users] MPI hangs on multiple nodes

2011-09-20 Thread Gus Correa
But that's not what hangs your program. Gus Correa Message: 11 Date: Mon, 19 Sep 2011 10:37:02 -0400 From: Gus Correa <g...@ldeo.columbia.edu> Subject: Re: [OMPI users] RE : MPI hangs on multiple nodes To: Open MPI Users <us...@open-mpi.org>

Re: [OMPI users] MPI hangs on multiple nodes

2011-09-20 Thread Rolf vandeVaart
>> 1: After a reboot of two nodes I ran again, and the inter-node freeze didn't happen until the third iteration. I take that to mean that the basic communication works, but that something is saturating. Is there some notion of buffer size somewhere in the MPI system that could explain this?

Re: [OMPI users] MPI hangs on multiple nodes

2011-09-20 Thread Jeff Squyres
On Sep 19, 2011, at 10:23 PM, Ole Nielsen wrote: > Hi all - and sorry for the multiple postings, but I have more information. +1 on Eugene's comments. The test program looks fine to me. FWIW, you don't need -lmpi to compile your program; OMPI's wrapper compiler allows you to just: mpicc

[OMPI users] MPI hangs on multiple nodes

2011-09-19 Thread Ole Nielsen
Hi all - and sorry for the multiple postings, but I have more information. 1: After a reboot of two nodes I ran again, and the inter-node freeze didn't happen until the third iteration. I take that to mean that the basic communication works, but that something is saturating. Is there some notion

[OMPI users] MPI hangs on multiple nodes

2011-09-19 Thread Ole Nielsen
works fine on other installations and indeed when run on the cores of one node. Message: 11 List-Post: users@lists.open-mpi.org Date: Mon, 19 Sep 2011 10:37:02 -0400 From: Gus Correa <g...@ldeo.columbia.edu> Subject: Re: [OMPI users] RE : MPI hangs on multiple nodes To: Open MPI Use

Re: [OMPI users] MPI hangs on multiple nodes

2011-09-19 Thread devendra rai
./mpi_test So, maybe this helps you. Best, Devendra Rai From: Ole Nielsen <ole.moller.niel...@gmail.com> To: us...@open-mpi.org Sent: Monday, 19 September 2011, 10:59 Subject: [OMPI users] MPI hangs on multiple nodes The test program is available here

[OMPI users] MPI hangs on multiple nodes

2011-09-19 Thread Ole Nielsen
The test program is available here: http://code.google.com/p/pypar/source/browse/source/mpi_test.c Hopefully, someone can help us troubleshoot why communications stop when multiple nodes are involved and CPU usage goes to 100% for as long as we leave the program running. Many thanks Ole Nielsen

Re: [OMPI users] MPI hangs on multiple nodes

2011-09-19 Thread Ole Nielsen
Further to the posting below, I can report that the test program (attached - this time correctly) is chewing up CPU time on both compute nodes for as long as I care to let it continue. It would appear that it hangs in MPI_Recv, which is the next command after the print statements in the test program. Has

[OMPI users] MPI hangs on multiple nodes

2011-09-19 Thread Ole Nielsen
Hi all, We have been using Open MPI for many years with Ubuntu on our 20-node cluster. Each node has 2 quad cores, so we usually run up to 8 processes on each node up to a maximum of 160 processes. However, we just upgraded the cluster to Ubuntu 11.04 with Open MPI 1.4.3 and have come across a