Re: [PATCH] tun: Use netif_receive_skb instead of netif_rx

2016-12-01 Thread Andrey Konovalov
On Tue, Nov 29, 2016 at 5:20 PM, Michael S. Tsirkin  wrote:
> On Tue, Nov 29, 2016 at 04:25:36PM +0100, Andrey Konovalov wrote:
>> This patch changes tun.c to call netif_receive_skb instead of netif_rx
>> when a packet is received. The difference between the two is that netif_rx
>> queues the packet into the backlog, and netif_receive_skb processes the
>> packet in the current context.
>>
>> This patch is required for syzkaller [1] to collect coverage from packet
>> receive paths when a packet is received through tun (syzkaller collects
>> coverage per process, in process context).
>>
>> A similar patch was introduced back in 2010 [2, 3], but the author found
>> out that the patch doesn't help with the task he had in mind (for cgroups
>> to shape network traffic based on the original process) and decided not to
>> go further with it. The main concern back then was about possible stack
>> exhaustion with 4K stacks, but CONFIG_4KSTACKS was removed and stacks are
>> 8K now.
>>
>> [1] https://github.com/google/syzkaller
>>
>> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
>>
>> [3] https://www.spinics.net/lists/netdev/msg130570.html
>>
>> Signed-off-by: Andrey Konovalov 
>
> This was on my list of things to investigate ever since the
> 8k stack default went in. Thanks for looking into this!
>
> I note that there still seem to be 3 architectures that have
> CONFIG_4KSTACKS.

Hi Michael,

Missed that, mailed v2.

Thanks!

>
> How about a wrapper that does netif_rx_ni when CONFIG_4KSTACKS is set?
>
>
>> ---
>>  drivers/net/tun.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 8093e39..4b56e91 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -1304,7 +1304,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>>   skb_probe_transport_header(skb, 0);
>>
>>   rxhash = skb_get_hash(skb);
>> - netif_rx_ni(skb);
>> + local_bh_disable();
>> + netif_receive_skb(skb);
>> + local_bh_enable();
>>
>>   stats = get_cpu_ptr(tun->pcpu_stats);
>>   u64_stats_update_begin(&stats->syncp);
>> --
>> 2.8.0.rc3.226.g39d4020
>


Re: [PATCH] tun: Use netif_receive_skb instead of netif_rx

2016-11-29 Thread Michael S. Tsirkin
On Tue, Nov 29, 2016 at 04:25:36PM +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received. The difference between the two is that netif_rx
> queues the packet into the backlog, and netif_receive_skb processes the
> packet in the current context.
> 
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process, in process context).
> 
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that the patch doesn't help with the task he had in mind (for cgroups
> to shape network traffic based on the original process) and decided not to
> go further with it. The main concern back then was about possible stack
> exhaustion with 4K stacks, but CONFIG_4KSTACKS was removed and stacks are
> 8K now.
> 
> [1] https://github.com/google/syzkaller
> 
> [2] https://www.spinics.net/lists/netdev/thrd440.html#130570
> 
> [3] https://www.spinics.net/lists/netdev/msg130570.html
> 
> Signed-off-by: Andrey Konovalov 

This was on my list of things to investigate ever since the
8k stack default went in. Thanks for looking into this!

I note that there still seem to be 3 architectures that have
CONFIG_4KSTACKS.

How about a wrapper that does netif_rx_ni when CONFIG_4KSTACKS is set?
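
A minimal sketch of what such a wrapper could look like (a hypothetical
helper, not part of this patch; the name tun_rx is an assumption):

static inline void tun_rx(struct sk_buff *skb)
{
#ifdef CONFIG_4KSTACKS
	/* Small stacks: defer to the backlog to avoid stack exhaustion. */
	netif_rx_ni(skb);
#else
	/* 8K stacks: process the packet immediately in this context. */
	local_bh_disable();
	netif_receive_skb(skb);
	local_bh_enable();
#endif
}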


> ---
>  drivers/net/tun.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 8093e39..4b56e91 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1304,7 +1304,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>   skb_probe_transport_header(skb, 0);
>  
>   rxhash = skb_get_hash(skb);
> - netif_rx_ni(skb);
> + local_bh_disable();
> + netif_receive_skb(skb);
> + local_bh_enable();
>  
>   stats = get_cpu_ptr(tun->pcpu_stats);
>   u64_stats_update_begin(&stats->syncp);
> -- 
> 2.8.0.rc3.226.g39d4020


Re: [PATCH] tun: Use netif_receive_skb instead of netif_rx

2016-11-29 Thread Eric Dumazet
On Tue, 2016-11-29 at 16:25 +0100, Andrey Konovalov wrote:
> This patch changes tun.c to call netif_receive_skb instead of netif_rx
> when a packet is received. The difference between the two is that netif_rx
> queues the packet into the backlog, and netif_receive_skb processes the
> packet in the current context.
> 
> This patch is required for syzkaller [1] to collect coverage from packet
> receive paths when a packet is received through tun (syzkaller collects
> coverage per process, in process context).
> 
> A similar patch was introduced back in 2010 [2, 3], but the author found
> out that the patch doesn't help with the task he had in mind (for cgroups
> to shape network traffic based on the original process) and decided not to
> go further with it. The main concern back then was about possible stack
> exhaustion with 4K stacks, but CONFIG_4KSTACKS was removed and stacks are
> 8K now.

Acked-by: Eric Dumazet 

We're using a similar patch written by Peter; let me copy part of his
changelog here, since the main motivation at that time was a speed
improvement:

commit 29aa09f47d43e93327a706cd835a37012ccc5b9e
Author: Peter Klausler 
Date:   Fri Mar 29 17:08:02 2013 -0400

net-tun: Add netif_rx_ni_immediate() variant to speed up tun/tap.

Speed up packet reception from (i.e., writes to) a tun/tap
device by adding an alternative netif_rx_ni_immediate()
interface that invokes netif_receive_skb() immediately rather
than enqueueing the packet in the backlog queue and then driving
its processing with do_softirq().  Forced queueing as a consequence
of an RPS CPU mask will still work as expected.

This change speeds up my closed-loop single-stream tap/OVS benchmark
by about 23%, from 700k packets/second to 867k packets/second.
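
For reference, the variant described above amounts to roughly the
following (a sketch reconstructed from this changelog, not the actual
commit, which never landed in mainline):

static inline int netif_rx_ni_immediate(struct sk_buff *skb)
{
	int ret;

	/* Process the packet in the current context instead of queueing
	 * it to the backlog and kicking softirq processing the way
	 * netif_rx_ni() does. RPS forced queueing still happens inside
	 * netif_receive_skb() itself.
	 */
	local_bh_disable();
	ret = netif_receive_skb(skb);
	local_bh_enable();
	return ret;
}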