Re: [OMPI users] problem compiling openmpi

2013-06-27 Thread marco atzeri
On 6/27/2013 9:23 PM, rmjuberias wrote:
> Hi, I am trying to compile Open MPI, and when I run "make all install" I get an error that I can't figure out. Any feedback would be appreciated. Thanks!
openmpi-1.2.6? Why not at least a release from the 1.6.x series?

Re: [OMPI users] EXTERNAL: Re: Application hangs on mpi_waitall

2013-06-27 Thread George Bosilca
At this point I'm running out of ideas … Can I have a simple reproducer of this issue? If possible, send me the code and I'll try to dig a little more to see what the problem is. George. On Jun 27, 2013, at 23:02, "Blosch, Edwin L" wrote: > I tried excluding

Re: [OMPI users] EXTERNAL: Re: Application hangs on mpi_waitall

2013-06-27 Thread Blosch, Edwin L
I tried excluding openib, but the run still did not succeed. It made about the same progress as it did previously with the openib interface before hanging (that is, before my 30-second timeout expired). I'm more than happy to try out any other suggestions... From: users-boun...@open-mpi.org

[OMPI users] openmpi install problem ...

2013-06-27 Thread rmjuberias
I figured out a way around the problem I reported in my previous email. So never mind! :o)

Re: [OMPI users] EXTERNAL: Re: Application hangs on mpi_waitall

2013-06-27 Thread George Bosilca
This seems to highlight a possible bug in the MPI implementation. As I suggested earlier, the credit management in the OpenIB support might be unsafe. To confirm this, there is one last test to run. Let's prevent the OpenIB support from being used during the run (so that Open MPI falls back to TCP). I suppose

[OMPI users] problem compiling openmpi

2013-06-27 Thread rmjuberias
Hi, I am trying to compile Open MPI, and when I run "make all install" I get an error that I can't figure out. Any feedback would be appreciated. Thanks! Attachment: ompi-output.tar.bz2 (BZip2 compressed data)

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Blosch, Edwin L
Also, just to be clear, the attached listing is ordered by the data in the first column and doesn’t reflect the call sequence. In the actual implementation, all the messages labeled “mpi-recv” are mpi_irecv calls, and they are all posted before any of the mpi_isends are posted. From: users-boun...@open-mpi.org

Re: [OMPI users] EXTERNAL: Re: Application hangs on mpi_waitall

2013-06-27 Thread Blosch, Edwin L
The debug version also hung, after roughly the same amount of progress in the computations (although of course it took much longer to make that progress than the optimized version did). On the bright side, the idea of putting an mpi_barrier after the irecvs and before the isends appears to

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Blosch, Edwin L
Attached is the message list for rank 0 for the communication step that is failing. There are about 160 isends and irecvs. The ‘message size’ is actually a number of cells. On some steps only one 8-byte word per cell is communicated; at another step we exchange 7 words, and at another step we
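
Since the listed 'message size' is a cell count, the size in bytes depends on how many 8-byte words are exchanged per cell at a given step. A minimal C sketch of that conversion follows; the name `message_bytes` is hypothetical, and the cell count used in the note below is only an illustrative value, not one taken from the attached listing.

```c
#include <stddef.h>

/* Size of one word in bytes, as stated in the message above. */
#define WORD_BYTES 8

/* Hypothetical helper: bytes on the wire for one message, given the number
 * of cells and the number of 8-byte words exchanged per cell at this step. */
size_t message_bytes(size_t cells, size_t words_per_cell)
{
    return cells * words_per_cell * WORD_BYTES;
}
```

For an assumed message of 1000 cells, this gives 8000 bytes at a one-word step and 56000 bytes at a seven-word step.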

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread George Bosilca
If I understand correctly, the communication pattern is a one-to-all type of communication, isn't it (from your server to your clients)? In this case this might be a credit management issue, where the master is running out of ack buffers and the clients can't acknowledge the retrieval of the

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Rolf vandeVaart
Ed, how large are the messages that you are sending and receiving? Rolf From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ed Blosch Sent: Thursday, June 27, 2013 9:01 AM To: us...@open-mpi.org Subject: Re: [OMPI users] Application hangs on mpi_waitall It ran a

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Ed Blosch
It ran a bit longer but still deadlocked. All matching sends are posted 1:1 with posted recvs, so it is a delivery issue of some kind. I'm running a debug-compiled version tonight to see what that might turn up. I may try to rewrite with blocking sends and see if that works. I can also try

Re: [OMPI users] error: unknown type name 'ompi_jobid_t'

2013-06-27 Thread Jeff Squyres (jsquyres)
Jeff -- Should be fixed in 1.7.2, which was released yesterday. On Jun 26, 2013, at 3:06 AM, Ralph Castain wrote: > Sorry about that - it has been fixed in the upcoming 1.7.2, which should be released in the immediate future. For now, you can grab the 1.7.2 tarball