On 2018-02-23 01:46, Jesper Dangaard Brouer wrote:
On Thu, 22 Feb 2018 17:36:46 +0800
Jason Wang <jasow...@redhat.com> wrote:
Commit 762c330d670e ("tuntap: add missing xdp flush") tried to fix the
devmap stall caused by a missed XDP flush by counting the pending
XDP-redirected packets and flushing when the count exceeds
NAPI_POLL_WEIGHT or MSG_MORE is clear. This may lead to a BUG() since
xdp_do_flush() was called in process context with preemption enabled.
Simply disabling preemption may silence the warning, but it is not
enough: the process may move between CPUs during a batch, which causes
xdp_do_flush() to miss some CPUs where the process ran previously.
Considering the fallout, that commit was reverted. To fix the issue
correctly, we can simply call xdp_do_flush() immediately after
xdp_do_redirect(). A side effect is that this removes any possibility
of batching, which could be addressed in the future.
Reported-by: Christoffer Dall <christoffer.d...@linaro.org>
Fixes: 762c330d670e ("tuntap: add missing xdp flush")
Signed-off-by: Jason Wang <jasow...@redhat.com>
drivers/net/tun.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 2823a4a..a363ea2 100644
@@ -1662,6 +1662,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct
alloc_frag->offset += buflen;
err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
As you have noticed, xdp_do_redirect() and xdp_do_flush_map() rely
heavily on being executed in softirq/napi_schedule context.
In particular, the devmap and cpumap map infrastructure requires that
the enqueue and flush operations happen on the same CPU (e.g. it stores
which devices need flushing in a this_cpu_ptr bitmap).
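
To make the same-CPU requirement concrete, here is a minimal userspace
model of a per-CPU "needs flush" bitmap. It is only a sketch, not kernel
code: the model_* names and MODEL_NR_CPUS are hypothetical, and the real
devmap/cpumap bookkeeping is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* One bitmap per CPU; bit i set means device i has frames queued there. */
static uint32_t model_flush_needed[MODEL_NR_CPUS];

/* Enqueue step: record on the *current* CPU that device `dev` needs a flush. */
static void model_enqueue(int cpu, int dev)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(dev >= 0 && dev < 32);
	model_flush_needed[cpu] |= UINT32_C(1) << dev;
}

/* Flush step: drain only the bitmap of `cpu`; returns how many devices were flushed. */
static int model_flush(int cpu)
{
	uint32_t pending;
	int flushed = 0;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	pending = model_flush_needed[cpu];
	while (pending) {
		pending &= pending - 1; /* clear the lowest set bit */
		flushed++;
	}
	model_flush_needed[cpu] = 0;
	return flushed;
}

/* True if `cpu` still has devices waiting for a flush. */
static bool model_has_pending(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_flush_needed[cpu] != 0;
}
```

In this model, calling model_enqueue(0, 3) and then model_flush(1), as
if the task had migrated from CPU 0 to CPU 1 in between, flushes nothing
and leaves device 3 pending on CPU 0. That is the stall a migrating
process context can cause.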
What context is tun_build_skb() invoked under?
Even when you call xdp_do_redirect() and xdp_do_flush_map() right after
each other, are we sure we cannot be preempted here?
OK, I missed the fact that we can be preempted here with preemptible RCU.
Let me disable preemption here and post a V4.
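
For illustration only, the shape of "redirect and flush inside one
non-preemptible section" can be sketched in userspace as below. This is
not the V4 patch: every stub_* name is a hypothetical stand-in, with
stub_preempt_disable()/stub_preempt_enable() playing the role that
preempt_disable()/preempt_enable() play in the kernel.

```c
#include <assert.h>

static int stub_preempt_count; /* > 0 models "preemption disabled" */
static int stub_pending;       /* frames enqueued but not yet flushed */
static int stub_unsafe_calls;  /* redirect/flush calls made while preemptible */

static void stub_preempt_disable(void)
{
	stub_preempt_count++;
}

static void stub_preempt_enable(void)
{
	assert(stub_preempt_count > 0);
	stub_preempt_count--;
}

/* Stand-in for the redirect step: queues one frame, always succeeds here. */
static int stub_redirect(void)
{
	if (stub_preempt_count == 0)
		stub_unsafe_calls++;
	stub_pending++;
	return 0;
}

/* Stand-in for the flush step: drains everything queued so far. */
static void stub_flush(void)
{
	if (stub_preempt_count == 0)
		stub_unsafe_calls++;
	stub_pending = 0;
}

/* Redirect one frame and flush at once, both inside the same critical section. */
static int stub_redirect_one(void)
{
	int err;

	stub_preempt_disable();
	err = stub_redirect();
	stub_flush(); /* no window in which the task could migrate */
	stub_preempt_enable();
	return err;
}
```

After stub_redirect_one() returns, stub_pending and stub_unsafe_calls
are both zero. The cost is the one the commit message already notes:
flushing after every frame gives up batching.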