On 2/2/2012 3:21 AM, David Miller wrote:
From: Roland Dreier<[email protected]>

I'm happy to do it but I'm still not quite sure I understand what the
end state is.  Is it just a matter of peeking into the skb contents
to get at the daddr, looking up the neigh based on that and then
continuing to handle neighs as we do now?

Right, you also will now have a reference to the neigh so you must release it.

Does that also work for ARP packets?

ARP packets have no dst_entry, so would need to be handled specially right now 
anyways.

Are there any examples in your tree I can crib off of?

grep for dst_neigh_lookup() in the net-next tree.

Hi Dave, Roland,

As was indicated over the thread @ http://marc.info/?t=132812805900004&r=1&w=2 the current situation can surely lead to races, and I'd be happy to revive it, hopefully towards somehow addressing the issue.From my reading through the thread, I understand that more or less the following paths (...) were suggested:

#1 convert ipoib to use dst_neigh_lookup instead of dst_get_neighbour_noref
#2 do some RCU-ing where we don't actually free the ipoib_neigh immediately, etc

Dave commented that #1 conversion step will allow to merge a change where dst_entry objects will not have an attached neighbour any more. Alternatively, IPoIB could manage a data structure where ipoib_neighs are looked up the packet ipv4/ipv6 header and not from the neighbour.

Performance wise, it seems that the two suggestions should introduce the ~same overhead, since in the ipv4 case for example, from dst_neigh_lookup we would land in __ipv4_neigh_lookup which does a hash lookup and we ipoib doesn't use the neighbours any more, some data structure lookup (e.g hash) will be used to resolve the ipoib_neigh from the destination address.

Dave, as for the conversion to dst_neigh_lookup, looking on net-next, I see about 30 hits for dst_get_neighbour_noref vs 13 hits for dst_neigh_lookup, so I wasn't sure about your comment re the ipoib conversion being what actually remains to make that change happen, can you elaborate here a little further?

Also, if ipoib moves to use that api, you made a comment that it will have a reference on the neighbour which will need to be released. I took a look on the net/atm use case which should to some extent be similar to ipoib, I see that neigh_release is called for neighbours retrieved by that API, okay.

Just for the sake of example, does the atm code free from the races you mention re ipoib? I see that it does stick its own object on the neighbour through the priv pointer but wasn't sure if it helps in that respect without further RCU-ing things.


Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to