Eli Cohen wrote:
> On Tue, Sep 23, 2008 at 02:01:00PM +0300, Moni Shoua wrote:
>> Eli Cohen wrote:
>>> Commit ee1e2c82c245a5fb2864e9dbcdaab3390fde3fcc introduced an
>>> optimization on path flushing. This caused a new possible scenario in
>>> which unicast_arp_send triggers path query which could fail, causing
>>> path->ah to become NULL. A successive successful path query will then
>>> trigger WARN_ON() in path_rec_completion(). This fix requires old_ah
>>> to differ from NULL as a prerequisite to trigger the WARN_ON().
>>> Moreover, that commit also allowed path resolution to be triggered for
>>> an invalid path; if that path resolution failed, old_ah would be freed
>>> outside priv->lock violating the assumption that dropping references
>>> inside the lock are guaranteed not to reach zero reference.
>>>
>> Eli, Roland,
>> I understand that this patch is going to be in OFED.
>> What about upstream kernel?
>> I'd like to add improvements to commit
>> ee1e2c82c245a5fb2864e9dbcdaab3390fde3fcc (the one you referred to) and it
>> will probably be on top of your fix.
>>
>> I'm sorry if I missed Roland's answer.
>>
>
> I don't think Roland responded to this patch yet. Still, I think it is
> important that this patch is reviewed since we have a regression
> relative to 2.6.26.
> _______________________________________________
> ewg mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
>

I agree. I gave this some thought. When path_rec_completion() is called
with a nonzero status, it is possible to leave ah untouched and to
replace it only when the path query finishes successfully. This is good
for cases where old_ah is still valid (no remote LID change happened).
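
To make the idea concrete, here is a rough userspace sketch, not driver
code. All of the model_* names and the struct layouts are made up for
illustration, and the sketch leaves out priv->lock and the real
reference counting entirely. It only shows the control flow: a failed
query returns early and keeps the existing ah, while a successful one
swaps in the new ah and drops the old one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's address handle and path entry. */
struct model_ah {
	int refcount;
	int dlid;
};

struct model_path {
	struct model_ah *ah;	/* NULL until a path query has succeeded */
	int valid;
};

static struct model_ah *model_ah_create(int dlid)
{
	struct model_ah *ah = malloc(sizeof(*ah));

	if (ah) {
		ah->refcount = 1;
		ah->dlid = dlid;
	}
	return ah;
}

static void model_ah_put(struct model_ah *ah)
{
	/* NULL is accepted so callers need not check for a missing old ah. */
	if (ah && --ah->refcount == 0)
		free(ah);
}

/*
 * Model of the completion handler. A nonzero status means the path
 * query failed. Returns 1 if path->ah was replaced, 0 if it was left
 * untouched.
 */
static int model_path_rec_completion(struct model_path *path, int status,
				     int new_dlid)
{
	struct model_ah *new_ah, *old_ah;

	assert(path != NULL);

	if (status != 0)
		return 0;	/* failed query: keep the ah we already have */

	new_ah = model_ah_create(new_dlid);
	if (!new_ah)
		return 0;	/* allocation failed: also keep the old ah */

	old_ah = path->ah;
	path->ah = new_ah;
	path->valid = 1;
	model_ah_put(old_ah);	/* old_ah may legitimately be NULL here */
	return 1;
}
```

With this shape a failed query never sets path->ah to NULL, so a later
successful query always finds either the previous ah or no ah at all.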
Besides that I think that the patch is correct.
