On Thu, 2008-02-21 at 21:31 +0200, Gleb Natapov wrote:
> On Thu, Feb 21, 2008 at 11:10:24AM -0800, Ralph Campbell wrote:
> > > > To further complicate things, this race condition is never seen _if_ the
> > > > application uses the same QP to advertise (send a credit allowing the
> > > > peer to SEND) the RECV buffer availability. So if the app posts a SEND
> > > > after the RECV is posted and that SEND allows the peer access to the
> > > > RECV buffer, then everything is ok. This is due to the fact that the
> > > > FW/HW will process the SEND only after processing the RECV. If the app
> > > > uses a different QP to post the SEND advertising the RECV, then the race
> > > > condition exists allowing the peer to SEND into that RECV buffer before
> > > > the HW makes it ready.
> >
> > Well, there is no guarantee that the HCA processes the post_recv()
> > before the post_send() even on the same QP. Send and receive are
> > unordered with respect to each other. The fact that it works is
> > an HCA specific implementation artifact.
> So there is no way to implement SW flow control over Infiniband? How
> is that IB spec has SW flow control specification for SDP in it then?
>
> >
> > > > This all assumes a specific design of rdma hw. Maybe nobody else has
> > > > this issue?
> > > >
> > > > Maybe I'm not making sense. :)
> > >
> > > I think your descriptions here match what Ralph found RNR in IPoIB-CM.
> > >
> > > Ralph,
> > >
> > > Does this make sense?
> > >
> > > Thanks
> > > Shirley
> >
> > I think you are making sense. There is an indeterminate race
> > between post_recv() returning to the application and when
> > a packet being received by the HCA might be able to use
> > that buffer. There are no ordering guarantees
> > between messages sent on one QP and another so the application
> > can't easily use a different QP to advertise posted buffers (credits).
> If after post_recv() returns it is guaranteed that receive buffers are
> available to HW we don't need ordering guarantees between QPs to
> successfully implement SW flow control.
Right. I was just pointing out that Steve is correct in his assumption
that there might be races between post_recv() returning and the HCA
being able to use that buffer to receive a packet that was already
in flight before the post_recv().

> > That is why the IB RC protocol does this for you in band if the RC QP
> > is using a dedicated receive queue but not a shared receive queue.
> What do you mean by that? RNR works for both RC and SRQ QPs.

Right. I was referring to the credit returned in the ACK header which
allows the remote RC QP endpoint to send a message after a post_recv().
There is no such message level flow control if the RC QP is using a SRQ.

> --
> Gleb.
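
For anyone following the thread, here is a rough sketch of the credit
accounting being discussed. It is only a model in plain C, not verbs code:
struct credit_link and the link_*() helpers are made-up names, and
link_post_recv() merely stands in for post_recv(). It assumes the very
thing under debate, namely that a buffer is usable by the HCA as soon as
post_recv() returns, so that a credit may be advertised right away.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-connection credit state; not part of any verbs API. */
struct credit_link {
    unsigned posted;       /* receive buffers posted (assumed usable by the HCA) */
    unsigned advertised;   /* posted buffers already granted to the peer */
    unsigned peer_credits; /* credits the sending peer currently holds */
};

/* Receiver: record n newly posted receive buffers (stand-in for post_recv()). */
static void link_post_recv(struct credit_link *l, unsigned n)
{
    l->posted += n;
}

/*
 * Receiver: grant the peer one credit for every posted buffer that has not
 * been advertised yet. Models the credit message arriving at the peer.
 * Returns the number of credits granted.
 */
static unsigned link_advertise(struct credit_link *l)
{
    unsigned n = l->posted - l->advertised;

    l->advertised += n;
    l->peer_credits += n;
    return n;
}

/*
 * Sender: a SEND is allowed only while a credit is held. A successful send
 * consumes one credit and one advertised receive buffer.
 */
static bool link_try_send(struct credit_link *l)
{
    if (l->peer_credits == 0)
        return false;          /* would risk RNR: no buffer was promised */

    l->peer_credits--;
    assert(l->posted > 0 && l->advertised > 0);
    l->posted--;
    l->advertised--;
    return true;
}
```

In this model the sender can never outrun the receiver, because a credit
only exists after link_post_recv() has been called. The race described
above corresponds to the real HCA not yet seeing a buffer at the moment
link_advertise() would run, which this sketch deliberately ignores.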