Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: RE: CMA backlog
>
> I think that there are some issues that would need to be worked out, but in
> general I'm in favor of trying to do something here.
>
> >Currently, this is not something that can be implemented by ULP on top of
> >CMA, because returning error from REQ will result in reject rather than REQ
> >drop.
>
> A generic ULP could handle this by making use of the private data, and
> retrying requests after a REJ with insufficient resources.
>
> >CMA already has backlog parameter in listen but it is ignored as far as I can
> >see. I propose extending cma API with the following options:
>
> The backlog applies more for iWarp and userspace. I couldn't find a usable
> way to make use of backlog in the kernel, since it uses a callback model.
>
> >rdma_backlog_added - connection was added to backlog queue
> >rdma_backlog_removed - connection was removed from backlog queue
>
> *ponders*
>
> >Internally, CMA will count the # of connections in backlog. If
> >REQ arrives and this number exceeds the backlog given in listen,
> >CMA will drop the REQ, without creating the new CMA ID.
>
> Incrementing the number of pending connections on a listen is easy.
> Decrementing it is more difficult, since a listen request can be destroyed
> after a connection request is received, but before it is responded to. This is
> difficult to handle, especially for userspace clients.

That is why, in my opinion, this should be up to the ULP to handle, calling
rdma_backlog_added/rdma_backlog_removed as appropriate. Existing ULPs that
don't call rdma_backlog_added will simply get all requests.

> Additionally, the CMA can't just drop the REQ. The REQ has been received by
> the IB CM, which is expecting a response. You would need to push backlog into
> the IB CM, which requires defining what it means at that level. From the
> perspective of the IB CM, sending a REJ with "No resources available" (reject
> code 3) seems to make more sense than simply discarding the MAD.

This approach would affect all ULPs, however. For example, no SDP
implementation that I know of retries after a REJ, so this approach won't be
interoperable. And AFAIK the SDP spec already interprets a reject as
connection refused. There's no provision I can see in the SDP spec for
retries on a specific reject code.

Simply dropping the REQ seems a nice approach, since the client retries REQ
MADs anyway.

> One possible fix is to remove sending a reject on destruction of a cm_id. I'm
> not sure what effect this would have on other code or the overall protocol
> though.

Yes, that was my thinking. To avoid touching all users, maybe the simplest way
is to make ib_cm discard the new cm_id without a reject if the client callback
returned -ENOMEM? If you consider that in an out-of-memory situation sending a
reject will also likely fail, this might be a good idea regardless.

Sounds good?

-- 
MST
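
To make the two ideas in this thread concrete, here is a rough userspace
model in C. It is only a sketch and not ib_cm or rdma_cm code: struct
listen_model, handle_req() and the ulp_* handlers are made-up names, and
backlog_added()/backlog_removed() merely stand in for the proposed
rdma_backlog_added/rdma_backlog_removed calls. It also assumes that a REQ
is dropped once the pending count has reached the backlog; the exact
boundary is left open in the proposal.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model only; these are not the real ib_cm/rdma_cm types. */
enum req_action { REQ_ACCEPTED, REQ_DROPPED, REQ_REJECTED };

struct listen_model {
    int backlog;  /* limit given in listen */
    int pending;  /* connections the ULP currently counts as backlogged */
};

/* Stand-in for the proposed rdma_backlog_added(): ULP queued a connection. */
static void backlog_added(struct listen_model *l)
{
    l->pending++;
}

/* Stand-in for the proposed rdma_backlog_removed(): ULP dequeued one. */
static void backlog_removed(struct listen_model *l)
{
    if (l->pending > 0)
        l->pending--;
}

/* A ULP REQ callback: 0 on success, negative errno on failure. */
typedef int (*req_handler)(struct listen_model *l);

static enum req_action handle_req(struct listen_model *l, req_handler cb)
{
    int ret;

    assert(l && cb);

    /*
     * Backlog full (assumed boundary: pending >= backlog): drop the REQ
     * without creating a new ID. The client is expected to retry the MAD.
     */
    if (l->pending >= l->backlog)
        return REQ_DROPPED;

    ret = cb(l);
    if (ret == 0)
        return REQ_ACCEPTED;

    /* Out of memory: discard silently, since a reject would likely fail too. */
    if (ret == -ENOMEM)
        return REQ_DROPPED;

    /* Any other error keeps the existing behaviour: send a reject. */
    return REQ_REJECTED;
}

/* Example ULP handlers, all hypothetical. */
static int ulp_queue_conn(struct listen_model *l)
{
    backlog_added(l);
    return 0;
}

static int ulp_oom(struct listen_model *l)
{
    (void)l;
    return -ENOMEM;
}

static int ulp_refuse(struct listen_model *l)
{
    (void)l;
    return -ECONNREFUSED;
}
```

A ULP that never calls backlog_added() keeps pending at zero, so with any
positive backlog it still sees every REQ. Only the -ENOMEM return path
changes what existing callers can observe.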
