ompi_[circular_buffer_]fifo.h

Eugene Loh Fri, 13 Feb 2009 01:16:18 -0500

Got it, thanks.

Is anyone else looking at that ticket? I'm still a newbie and I suspect someone else could figure this problem out a lot faster than I could. So, I'm curious how much I should be looking at this ticket.

If amateurs are allowed to speculate, however, my guess is that this isn't really a BTL thing. It reminds me of trac ticket 1468 (aka 1516). In that case, there was a lot of one-way traffic. We needed a way to return frags to the sender. I guess that was solved.

So, the present problem is something different. My guess is that senders are overrunning receivers. Could it be that some receiver (like the root in the MPI_Reduce) ends up with too many incoming messages? It has to queue up unexpected messages, which slows it down further, which means it has to deal with even more unexpected messages, etc. Those messages have to be placed somewhere, which means memory is allocated, etc.


Just a theory.  I don't know the PML well enough to judge its soundness.

But if this is the case, it's a PML issue rather than a BTL issue. Maybe there should be some flow control -- particularly in our implementation of collectives!
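
To make the flow-control idea concrete, here is a toy model of the overrun theory. It is only a sketch and not Open MPI code: the function name max_queue_depth, its parameters, and the credit-window scheme are all assumptions made up for illustration, and nothing here reflects how the PML or the collectives actually work. One sender posts a message every tick, the receiver drains one queued message every drain_every ticks, and an optional window of credits limits how many messages may be outstanding.

```c
/* Hypothetical toy model, not Open MPI code.
 * Returns the largest number of messages ever waiting in the
 * receiver's unexpected-message queue.
 *   ticks       - number of simulated time steps
 *   drain_every - receiver handles one message every this many ticks
 *   window      - credits granted to the sender; <= 0 disables flow control
 */
static int max_queue_depth(int ticks, int drain_every, int window)
{
    int queued = 0;        /* messages waiting at the receiver */
    int credits = window;  /* sends the sender may still issue */
    int max_depth = 0;

    for (int t = 1; t <= ticks; t++) {
        /* Sender: one message per tick, if flow control permits it. */
        if (window <= 0 || credits > 0) {
            queued++;
            if (window > 0) {
                credits--;
            }
        }
        if (queued > max_depth) {
            max_depth = queued;
        }

        /* Receiver: drains one message and hands the credit back. */
        if (t % drain_every == 0 && queued > 0) {
            queued--;
            if (window > 0) {
                credits++;
            }
        }
    }
    return max_depth;
}
```

In this model, max_queue_depth(1000, 4, 0) returns 751, since the queue keeps growing with no flow control, while max_queue_depth(1000, 4, 8) returns 8, since the sender stalls once its credits run out. The model leaves out the part of the theory where a longer queue slows the receiver further, which would make the uncontrolled case worse.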


Ralph Castain wrote:

The connection is only that, if you are going to modify the sm BTL as you say, you might at least want to be aware that we have a problem in it so you (a) don't make it worse than it already is, and (b) might keep an eye open for the problem as you are changing things.
On Feb 12, 2009, at 3:58 PM, Eugene Loh wrote:
Sorry, what's the connection? Are we talking about https://svn.open-mpi.org/trac/ompi/ticket/1791 ? Are you simply saying that if I'm doing some sm BTL work, I should also look at 1791? I'm trying to figure out if there's some more specific connection I'm missing.
Ralph Castain wrote:
You might want to look at ticket #1791 while you are doing this - Brad added some valuable data earlier today.

Re: [OMPI devel] RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h
