Sean Hefty wrote:
> Eitan Zahavi wrote:
>
>> I disagree. If you sniff at the MAD level you can simply react to the
>> lower level messages.
>
> First, when designing this, I did consider using the MAD snooping ability,
> and changing what could be done with snooping. However, the multicast
> handling is not simply sniffing MADs going out on the wire and
> incrementing / decrementing some count. It can change or prevent a MAD
> from being sent. This is a fundamental change to the behavior of the
> ib_mad APIs.

I am sorry I was not involved at that early stage. My bad.
I need to look deeper into the code. As long as a response is generated even
though the MAD was not sent, this is not an API change but a bug fix.
At this stage it seems that only a patch would convince you otherwise.
I will try working on it this week.

What I had in mind was to hand back a MAD response in the case of a delete
when the client is not the last one on the group. All other MADs go on the
wire (including a duplicate "join").

> MADs are sent and tracked by their respective registered ib_mad clients.

Exactly, and the agent ID is part of the MAD trans_id, so we know which agent
is sending which MAD.

> Trying to push this down into the MAD layer means that the send request
> from one client may now occur on some other client's registration.

I am not sure I follow you here. If you are referring to the race where one
client sends a "join" while another sends a "leave", you should make sure to:
1. Mark a client as "joined" only after receiving the SA response.
2. Treat the client as having left as soon as its "leave" MAD is sent out.
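
The two rules above, together with the local response for a non-final
"leave", could be sketched roughly as follows. This is only an illustration
in plain C, not ib_mad code: every name in it (mcast_group, on_join_send,
on_join_response, on_leave_send, tid_to_agent, MAX_AGENTS) is hypothetical,
and it assumes the agent ID occupies the upper 32 bits of a host-order
trans_id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_AGENTS 8 /* hypothetical limit, for the sketch only */

enum mcast_action {
    MCAST_SEND_ON_WIRE,  /* forward the MAD to the SA */
    MCAST_LOCAL_RESPONSE /* do not send; hand back a generated response */
};

/* Per-group bookkeeping (hypothetical structure). */
struct mcast_group {
    bool joined[MAX_AGENTS]; /* set only after the SA has answered */
    int members;             /* number of agents currently joined */
};

/* Assumption: the agent ID is the upper 32 bits of the host-order TID. */
static uint32_t tid_to_agent(uint64_t tid)
{
    return (uint32_t)(tid >> 32);
}

/* Every join goes on the wire, duplicates included. */
static enum mcast_action on_join_send(struct mcast_group *g, uint32_t agent)
{
    (void)g;
    (void)agent;
    return MCAST_SEND_ON_WIRE;
}

/* Rule 1: mark the agent as joined only when the SA response arrives. */
static void on_join_response(struct mcast_group *g, uint32_t agent, bool ok)
{
    if (ok && agent < MAX_AGENTS && !g->joined[agent]) {
        g->joined[agent] = true;
        g->members++;
    }
}

/* Rule 2: account for the leave at the moment the leave MAD is sent. */
static enum mcast_action on_leave_send(struct mcast_group *g, uint32_t agent)
{
    /* Unknown or not-yet-joined agent: let the SA decide. */
    if (agent >= MAX_AGENTS || !g->joined[agent])
        return MCAST_SEND_ON_WIRE;

    assert(g->members > 0);
    g->joined[agent] = false;
    g->members--;

    /* Only the last member's leave actually reaches the SA. */
    return g->members > 0 ? MCAST_LOCAL_RESPONSE : MCAST_SEND_ON_WIRE;
}
```

With two agents joined, the first "leave" would be answered locally and the
second would go out to the SA. The sketch ignores locking and the
cancel/unregister case discussed in this thread.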

> If that client decides to unregister in the middle of their send, the
> operation is canceled, and now needs to be restarted on some other
> registration. And even though the operation was canceled, we still need to
> know whether it was seen by the SA. This requires sniffing all MADs, and
> quickly gets extremely complex.

Cancel does not really revert a post_send, does it?
So if we catch it just before it is posted we should be fine.

> In order to avoid these issues with which registered client is actually
> performing the operation, the solution is to filter multicast requests
> through a single registration.

If each client uses its own agent ID, then that ID is available in the
trans_id of the MAD.

> The ib_mad layer is complex enough as it is. (Have you tried tracing a MAD
> through the send path?) We don't need to push even more functionality down
> into it.

I agree that layering on top is easier. But does it really solve the bug?
I think not. If you would REPLACE the API, rather than provide both options
(above and below refcount enforcement), it would make sense to me.

> - Sean

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
