On Wed, Oct 5, 2011 at 12:39 PM, Pravin B Shelar <[email protected]> wrote: > Following patch make sure that BH is disabled while running > execute_actions() on packets from userspace. RCU annotation for BH is > used to fix false positive warning message. > > Signed-off-by: Pravin B Shelar <[email protected]> > Bug #7621
I spent quite a bit of time looking at this and reading the RCU implementation today and I think that we actually have a correctness problem that existed before this patch but this makes it worse. In certain situations, the lockdep warning is actually not a false positive because the difference between rcu_read_lock and rcu_read_lock_bh is more than disabling bottom halves. In most situations, we are fine because rcu_read_lock amounts to preempt_disable() and the _bh version is local_bh_disable(). In these cases our interchanged use of them doesn't matter as long as you don't assume that bottom halves are disabled, which we don't. However, with CONFIG_PREEMPT_RCU rcu_read_lock turns into a counter, while _bh continues to only disable bottom halves. This means that rcu_read_lock_bh is essentially invisible to call_rcu in this mode and we can have a use after free bug. As a result, there are two completely separate issues here: * Disabling of bottom halves. The previous assumptions here were correct - all received packets with the exception of those coming from our userspace have bottom halves disabled because transmit from the network stack uses rcu_read_lock_bh and receive occurs in softirq context. * Handling of rcu_read_lock: We need to be more careful here about mixing and matching. For the most part it probably doesn't really matter all that much since we largely already run with bottom halves disabled and don't generally need faster grace periods. The two exceptions are: - Whether we want to disable bottom halves in all the various places that access our data structures (such as reading port config). - Getting faster grace periods for removed flows. This is the one area that it might be useful to have faster grace periods since we can accumulate a lot of garbage with a DoS attack. This may not matter all that much because userspace has to delete the flows, which gives us an opportunity for a grace period. Running with a preemptible kernel could be bad though because a lower priority task could hold rcu_read_lock and prevent a grace period (hopefully fixed by using CONFIG_RCU_BOOST). Given all that, what I would do is: * Use the non-bh versions of the rcu functions as we currently do and take rcu_read_lock around vport_receive() in internal_dev_xmit(). We should then run lockdep to make sure that it doesn't flag warnings any more. * Still disable bottom halves for packets coming from userspace as you have done here. I think the removal of _bh from locks can be moved into this patch and there are some other places like the in_interrupt() in vport-internal_dev.c can be removed. _______________________________________________ dev mailing list [email protected] http://openvswitch.org/mailman/listinfo/dev
