Hi Ralph,

I managed to get a deadlock after running for a whole night, but not the same one you have: after a quick analysis, process 0 seems to be blocked in the very first send through shared memory. Still possibly a bug, but not the same as yours IMO.

I also figured out that libnuma support was not built into my library, so I rebuilt the lib, and this doesn't seem to change anything: same execution speed, same memory footprint, and of course the bug still does not appear :-(.

So, no luck so far in reproducing your problem. I guess you're the only one able to make progress on this (since you seem to have a real reproducer).

Sylvain

On Wed, 10 Jun 2009, Sylvain Jeaugey wrote:

Hmm, I'm very glad that padb works with Open MPI; I couldn't live without it. In my opinion, it's the best debugging tool for parallel applications and, more importantly, the only one that scales.

About the issue: I couldn't reproduce it on my platform (I tried 2 nodes with 2 to 8 processes each; the nodes are twin 2.93 GHz Nehalems and the IB is Mellanox QDR).

So my feeling is that it may be very hardware related. Especially if you use the hierarch component, some transactions will be done through RDMA on one side and read directly through shared memory on the other side, which can, depending on the hardware, produce very different timings and bugs. Did you try with a different collective component (i.e. not hierarch)? Or with another interconnect? [Yes, of course, if it is a race condition, we might well avoid the bug simply because the timings are different, but that's still information.]
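For example (the process count and ./your_app below are just placeholders for your real reproducer, and I'm quoting the MCA names from memory, so adjust as needed), something like:

  # run without the hierarch collective component
  mpirun -np 16 --mca coll ^hierarch ./your_app

  # or take InfiniBand out of the picture and run over TCP + shared memory
  mpirun -np 16 --mca btl tcp,sm,self ./your_app

If the hang disappears in either case, that already narrows things down.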

Perhaps none of what I'm saying makes sense, or you have already thought about it; anyway, if you want me to try different things, just let me know.

Sylvain

On Wed, 10 Jun 2009, Ralph Castain wrote:

Hi Ashley

Thanks! I would definitely be interested and will look at the tool. Meantime, I have filed a bunch of data on this in ticket #1944, so perhaps you might take a glance at that and offer some thoughts?

https://svn.open-mpi.org/trac/ompi/ticket/1944

Will be back after I look at the tool.

Thanks again
Ralph


On Wed, Jun 10, 2009 at 8:51 AM, Ashley Pittman <ash...@pittman.co.uk> wrote:

Ralph,

If I may say so, this is exactly the type of problem the tool I have been working on recently aims to help with, and I'd be happy to help you through it.

Firstly, of the three collectives you mention (MPI_Allgather, MPI_Reduce and MPI_Bcast), one exhibits a many-to-many communication pattern, one a many-to-one, and the last a one-to-many. The scenario of a root process falling behind and getting swamped in comms is plausible for MPI_Reduce only; it doesn't hold water for the other two. You also don't mention whether the loop is over a single collective or whether you have a loop calling a number of different collectives each iteration.
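To make sure we're talking about the same thing, the second case I have in mind looks roughly like the sketch below. It is purely illustrative (your buffers, counts and iteration count will obviously differ), but it shows one many-to-many, one many-to-one and one one-to-many call per iteration:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = (double)rank, sum = 0.0;
    double *all = malloc(size * sizeof(double));

    for (int i = 0; i < 1000; i++) {
        /* many-to-many: every rank receives a contribution from every rank */
        MPI_Allgather(&local, 1, MPI_DOUBLE, all, 1, MPI_DOUBLE, MPI_COMM_WORLD);
        /* many-to-one: only the root (rank 0) receives the final result */
        MPI_Reduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        /* one-to-many: the root sends, everyone else receives */
        MPI_Bcast(&sum, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    }

    free(all);
    MPI_Finalize();
    return 0;
}

Knowing which of the three lines the ranks are actually stuck in changes which of the above explanations is plausible.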

padb, the tool I've been working on, can look at parallel jobs and report on the state of collective comms, and it should help you narrow down which processes are erroneous and which are simply blocked waiting for comms. I'd recommend using it to look at maybe four or five instances where the application has hung and looking for any common features between them.
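In practice the workflow is just a couple of commands; I'm quoting the option names from memory here, so double-check them against padb --help or the web page:

  padb --show-jobs             # list the parallel jobs padb can see
  padb --full-report=<jobid>   # stack traces, message queues and collective state for one job

Running the full report on each hung instance and comparing the output is usually the quickest way to spot the odd rank out.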

Let me know if you are willing to try this route and I'll talk you through it. The code is downloadable from http://padb.pittman.org.uk, and if you want the full collective functionality you'll need to patch Open MPI with the patch from
http://padb.pittman.org.uk/extensions.html

Ashley,

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk



