[OMPI devel] trac ticket 1944 and pending sends

Eugene Loh Tue, 23 Jun 2009 11:04:15 -0400

The sm BTL used to have two mechanisms for dealing with congested FIFOs. One was to grow the FIFOs. Another was to queue pending sends locally (on the sender's side). I think the grow-FIFO mechanism was typically invoked and the pending-send mechanism used only under extreme circumstances (no more memory).

With the sm makeover of 1.3.2, we dropped the ability to grow FIFOs. The code added complexity and there seemed to be no need to have two mechanisms to deal with congested FIFOs. In ticket 1944, however, we see that repeated collectives can produce hangs, and this seems to be due to the pending-send code not adequately dealing with congested FIFOs.

Today, when a process tries to write to a remote FIFO and fails, it queues the write as a pending send. The only condition under which it retries pending sends is when it gets a fragment back from a remote process.
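To make that behavior concrete, here is a minimal toy model in C. It is not the sm BTL code: every name in it (toy_fifo_t, toy_pending_t, toy_send, toy_on_fragment_returned) is hypothetical, and the FIFO is a plain array rather than shared memory. It only illustrates that a failed write is parked on the sender's side and that nothing retries it until a fragment comes back.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_FIFO_SLOTS  2   /* tiny capacity so congestion is easy to trigger */
#define TOY_MAX_PENDING 8   /* arbitrary bound for this sketch */

/* Hypothetical stand-in for a receiver's bounded FIFO. */
typedef struct {
    int    slots[TOY_FIFO_SLOTS];
    size_t count;
} toy_fifo_t;

/* Hypothetical sender-side ring of writes that did not fit. */
typedef struct {
    int    items[TOY_MAX_PENDING];
    size_t head, count;
} toy_pending_t;

/* Try to append one fragment; fails when the FIFO is full. */
static bool toy_fifo_write(toy_fifo_t *f, int frag)
{
    if (f->count == TOY_FIFO_SLOTS) return false;
    f->slots[f->count++] = frag;
    return true;
}

/* Receiver side: remove the oldest entry, if any. */
static bool toy_fifo_read(toy_fifo_t *f, int *out)
{
    if (f->count == 0) return false;
    *out = f->slots[0];
    for (size_t i = 1; i < f->count; i++) f->slots[i - 1] = f->slots[i];
    f->count--;
    return true;
}

/* Sender side: write if possible, otherwise park the write locally.
 * (If the pending ring is also full, this sketch simply drops the write.) */
static void toy_send(toy_fifo_t *f, toy_pending_t *p, int frag)
{
    if (!toy_fifo_write(f, frag) && p->count < TOY_MAX_PENDING) {
        p->items[(p->head + p->count) % TOY_MAX_PENDING] = frag;
        p->count++;
    }
}

/* The only retry trigger in this model: a fragment came back from the peer. */
static void toy_on_fragment_returned(toy_fifo_t *f, toy_pending_t *p)
{
    while (p->count > 0 && toy_fifo_write(f, p->items[p->head])) {
        p->head = (p->head + 1) % TOY_MAX_PENDING;
        p->count--;
    }
}
```

In this model, if the receiver drains its FIFO but no fragment happens to be returned to the sender, the parked write stays parked even though there is room for it.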

I think the logic must have been that the FIFO got congested because we issued too many sends. Getting a fragment back indicates that the remote process has made progress digesting those sends. In ticket 1944, we see that a FIFO can also get congested from too many returning fragments. Further, with shared FIFOs, a FIFO could become congested due to the activity of a third-party process.

In sum, getting a fragment back from a remote process is a poor indicator that it's time to retry pending sends.

Maybe the real way to know when to retry pending sends is just to check if there's room on the FIFO.

So, I'll try modifying MCA_BTL_SM_FIFO_WRITE. It'll start by checking if there are pending sends. If so, it'll retry them before performing the requested write. This should also help preserve ordering a little better. I'm guessing this will not hurt our message latency in any meaningful way, but I'll check this out.
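Roughly, the retry-first write could look like the sketch below. This is a standalone toy in C, not the real MCA_BTL_SM_FIFO_WRITE macro: the sk_* names, the array-backed FIFO, and the fixed-size pending ring are all hypothetical. One further assumption of mine, not something decided above: if older pending sends are still waiting after the retry, the new write is queued behind them rather than written directly, so that ordering is preserved.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_FIFO_SLOTS  2   /* tiny capacity so congestion is easy to trigger */
#define SK_MAX_PENDING 8   /* arbitrary bound for this sketch */

/* Hypothetical stand-ins for the remote FIFO and the local pending queue. */
typedef struct { int slots[SK_FIFO_SLOTS];  size_t count;       } sk_fifo_t;
typedef struct { int items[SK_MAX_PENDING]; size_t head, count; } sk_pending_t;

/* Try to append one fragment; fails when the FIFO is full. */
static bool sk_fifo_try_write(sk_fifo_t *f, int frag)
{
    if (f->count == SK_FIFO_SLOTS) return false;
    f->slots[f->count++] = frag;
    return true;
}

/* Receiver side: remove the oldest entry, if any. */
static bool sk_fifo_read(sk_fifo_t *f, int *out)
{
    if (f->count == 0) return false;
    *out = f->slots[0];
    for (size_t i = 1; i < f->count; i++) f->slots[i - 1] = f->slots[i];
    f->count--;
    return true;
}

/* Retry-first write. Returns true if the fragment was written or queued,
 * false if the pending ring itself is full and the caller must cope. */
static bool sk_write(sk_fifo_t *f, sk_pending_t *p, int frag)
{
    /* 1. Retry pending sends, oldest first, for as long as there is room. */
    while (p->count > 0 && sk_fifo_try_write(f, p->items[p->head])) {
        p->head = (p->head + 1) % SK_MAX_PENDING;
        p->count--;
    }

    /* 2. Write directly only if nothing older is still waiting. */
    if (p->count == 0 && sk_fifo_try_write(f, frag)) return true;

    /* 3. Otherwise park the new write behind the existing pending sends. */
    if (p->count == SK_MAX_PENDING) return false;
    p->items[(p->head + p->count) % SK_MAX_PENDING] = frag;
    p->count++;
    return true;
}
```

With a two-slot FIFO, writing 1, 2, 3 leaves 3 pending. Once the receiver reads one entry, the next sk_write call first moves 3 into the FIFO and only then handles the new fragment, with no returned fragment needed to trigger the retry. In the common case of an empty pending queue, the extra cost is a single count check, which is why latency should be largely unaffected in this model.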


Meanwhile, I wanted to check in with y'all for any guidance you might have.
