Re: [PATCH] tun: Use netif_receive_skb instead of netif_rx
On Tue, Nov 29, 2016 at 5:20 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 29, 2016 at 04:25:36PM +0100, Andrey Konovalov wrote:
>> This patch changes tun.c to call netif_receive_skb instead of netif_rx
>> when a packet is received. The difference between the two is that netif_rx
>> queues the packet into the backlog, and netif_receive_skb processes the
>> packet in the current context.
>>
>> This patch is required for syzkaller [1] to collect coverage from packet
>> receive paths when a packet is received through tun (syzkaller collects
>> coverage per process in the process context).
>>
>> A similar patch was introduced back in 2010 [2, 3], but the author found
>> out that the patch didn't help with the task he had in mind (allowing
>> cgroups to shape network traffic based on the original process) and
>> decided not to go further with it. The main concern back then was possible
>> stack exhaustion with 4K stacks, but CONFIG_4KSTACKS was removed and
>> stacks are 8K now.
>>
>> [1] https://github.com/google/syzkaller
>> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
>> [3] https://www.spinics.net/lists/netdev/msg130570.html
>>
>> Signed-off-by: Andrey Konovalov
>
> This was on my list of things to investigate ever since the
> 8k stack default went in. Thanks for looking into this!
>
> I note that there still seem to exist 3 architectures that do have
> CONFIG_4KSTACKS.

Hi Michael,

Missed that, mailed v2.

Thanks!

> How about a wrapper that does netif_rx_ni with CONFIG_4KSTACKS?
>> ---
>>  drivers/net/tun.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 8093e39..4b56e91 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1304,7 +1304,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>>  	skb_probe_transport_header(skb, 0);
>>
>>  	rxhash = skb_get_hash(skb);
>> -	netif_rx_ni(skb);
>> +	local_bh_disable();
>> +	netif_receive_skb(skb);
>> +	local_bh_enable();
>>
>>  	stats = get_cpu_ptr(tun->pcpu_stats);
>>  	u64_stats_update_begin(&stats->syncp);
>> --
>> 2.8.0.rc3.226.g39d4020
Re: [PATCH] tun: Use netif_receive_skb instead of netif_rx
On Tue, Nov 29, 2016 at 04:25:36PM +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received. The difference between the two is that netif_rx
> queues the packet into the backlog, and netif_receive_skb processes the
> packet in the current context.
>
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process in the process context).
>
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that the patch didn't help with the task he had in mind (allowing
> cgroups to shape network traffic based on the original process) and
> decided not to go further with it. The main concern back then was possible
> stack exhaustion with 4K stacks, but CONFIG_4KSTACKS was removed and
> stacks are 8K now.
>
> [1] https://github.com/google/syzkaller
> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
> [3] https://www.spinics.net/lists/netdev/msg130570.html
>
> Signed-off-by: Andrey Konovalov

This was on my list of things to investigate ever since the
8k stack default went in. Thanks for looking into this!

I note that there still seem to exist 3 architectures that do have
CONFIG_4KSTACKS.

How about a wrapper that does netif_rx_ni with CONFIG_4KSTACKS?

> ---
>  drivers/net/tun.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 8093e39..4b56e91 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1304,7 +1304,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	skb_probe_transport_header(skb, 0);
>
>  	rxhash = skb_get_hash(skb);
> -	netif_rx_ni(skb);
> +	local_bh_disable();
> +	netif_receive_skb(skb);
> +	local_bh_enable();
>
>  	stats = get_cpu_ptr(tun->pcpu_stats);
>  	u64_stats_update_begin(&stats->syncp);
> --
> 2.8.0.rc3.226.g39d4020
Re: [PATCH] tun: Use netif_receive_skb instead of netif_rx
On Tue, 2016-11-29 at 16:25 +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received. The difference between the two is that netif_rx
> queues the packet into the backlog, and netif_receive_skb processes the
> packet in the current context.
>
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process in the process context).
>
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that the patch didn't help with the task he had in mind (allowing
> cgroups to shape network traffic based on the original process) and
> decided not to go further with it. The main concern back then was possible
> stack exhaustion with 4K stacks, but CONFIG_4KSTACKS was removed and
> stacks are 8K now.

Acked-by: Eric Dumazet

We're using a similar patch written by Peter, let me copy here part of
his changelog, since the main motivation was a speed improvement at that
time:

commit 29aa09f47d43e93327a706cd835a37012ccc5b9e
Author: Peter Klausler
Date:   Fri Mar 29 17:08:02 2013 -0400

    net-tun: Add netif_rx_ni_immediate() variant to speed up tun/tap.

    Speed up packet reception from (i.e., writes to) a tun/tap device by
    adding an alternative netif_rx_ni_immediate() interface that invokes
    netif_receive_skb() immediately rather than enqueueing the packet in
    the backlog queue and then driving its processing with do_softirq().

    Forced queueing as a consequence of an RPS CPU mask will still work
    as expected.

    This change speeds up my closed-loop single-stream tap/OVS benchmark
    by about 23%, from 700k packets/second to 867k packets/second.