Re: [PATCH stable 4.9 v2 00/29] backport of IP fragmentation fixes
On Mon, Oct 15, 2018 at 10:53:02AM -0700, Eric Dumazet wrote:
> On Mon, Oct 15, 2018 at 10:47 AM Florian Fainelli wrote:
> >
> > On 10/10/2018 12:29 PM, Florian Fainelli wrote:
> > > This is based on Stephen's v4.14 patches, with the necessary merge
> > > conflicts, and the lack of timer_setup() on the 4.9 baseline.
> > >
> > > Perf results on a gigabit capable system, before and after are below.
> > >
> > > Series can also be found here:
> > >
> > > https://github.com/ffainelli/linux/commits/fragment-stack-v4.9-v2
> > >
> > > Changes in v2:
> > >
> > > - drop "net: sk_buff rbnode reorg"
> > > - added original "ip: use rb trees for IP frag queue." commit
> >
> > Eric, does this look reasonable to you?
>
> Yes, thanks a lot Florian.

Wonderful, all now queued up, thanks!

greg k-h
Re: [PATCH stable 4.9 v2 00/29] backport of IP fragmentation fixes
On Mon, Oct 15, 2018 at 10:47 AM Florian Fainelli wrote:
>
> On 10/10/2018 12:29 PM, Florian Fainelli wrote:
> > This is based on Stephen's v4.14 patches, with the necessary merge
> > conflicts, and the lack of timer_setup() on the 4.9 baseline.
> >
> > Perf results on a gigabit capable system, before and after are below.
> >
> > Series can also be found here:
> >
> > https://github.com/ffainelli/linux/commits/fragment-stack-v4.9-v2
> >
> > Changes in v2:
> >
> > - drop "net: sk_buff rbnode reorg"
> > - added original "ip: use rb trees for IP frag queue." commit
>
> Eric, does this look reasonable to you?

Yes, thanks a lot Florian.

> >
> > Before patches:
> >
> > PerfTop: 180 irqs/sec kernel:78.9% exact: 0.0% [4000Hz cycles:ppp], (all, 4 CPUs)
> > ---
> >
> >     34.81%  [kernel]  [k] ip_defrag
> >      4.57%  [kernel]  [k] arch_cpu_idle
> >      2.09%  [kernel]  [k] fib_table_lookup
> >      1.74%  [kernel]  [k] finish_task_switch
> >      1.57%  [kernel]  [k] v7_dma_inv_range
> >      1.47%  [kernel]  [k] __netif_receive_skb_core
> >      1.06%  [kernel]  [k] __slab_free
> >      1.04%  [kernel]  [k] __netdev_alloc_skb
> >      0.99%  [kernel]  [k] ip_route_input_noref
> >      0.96%  [kernel]  [k] dev_gro_receive
> >      0.96%  [kernel]  [k] tick_nohz_idle_enter
> >      0.93%  [kernel]  [k] bcm_sysport_poll
> >      0.92%  [kernel]  [k] skb_release_data
> >      0.91%  [kernel]  [k] __memzero
> >      0.90%  [kernel]  [k] __free_page_frag
> >      0.87%  [kernel]  [k] ip_rcv
> >      0.77%  [kernel]  [k] eth_type_trans
> >      0.71%  [kernel]  [k] _raw_spin_unlock_irqrestore
> >      0.68%  [kernel]  [k] tick_nohz_idle_exit
> >      0.65%  [kernel]  [k] bcm_sysport_rx_refill
> >
> > After patches:
> >
> > PerfTop: 214 irqs/sec kernel:80.4% exact: 0.0% [4000Hz cycles:ppp], (all, 4 CPUs)
> > ---
> >
> >      6.61%  [kernel]  [k] arch_cpu_idle
> >      3.77%  [kernel]  [k] ip_defrag
> >      3.65%  [kernel]  [k] v7_dma_inv_range
> >      3.18%  [kernel]  [k] fib_table_lookup
> >      3.04%  [kernel]  [k] __netif_receive_skb_core
> >      2.31%  [kernel]  [k] finish_task_switch
> >      2.31%  [kernel]  [k] _raw_spin_unlock_irqrestore
> >      1.65%  [kernel]  [k] bcm_sysport_poll
> >      1.63%  [kernel]  [k] ip_route_input_noref
> >      1.63%  [kernel]  [k] __memzero
> >      1.58%  [kernel]  [k] __netdev_alloc_skb
> >      1.47%  [kernel]  [k] tick_nohz_idle_enter
> >      1.40%  [kernel]  [k] __slab_free
> >      1.32%  [kernel]  [k] ip_rcv
> >      1.32%  [kernel]  [k] __softirqentry_text_start
> >      1.30%  [kernel]  [k] dev_gro_receive
> >      1.23%  [kernel]  [k] bcm_sysport_rx_refill
> >      1.11%  [kernel]  [k] tick_nohz_idle_exit
> >      1.06%  [kernel]  [k] memcmp
> >      1.02%  [kernel]  [k] dma_cache_maint_page
> >
> >
> > Dan Carpenter (1):
> >   ipv4: frags: precedence bug in ip_expire()
> >
> > Eric Dumazet (21):
> >   inet: frags: change inet_frags_init_net() return value
> >   inet: frags: add a pointer to struct netns_frags
> >   inet: frags: refactor ipfrag_init()
> >   inet: frags: refactor ipv6_frag_init()
> >   inet: frags: refactor lowpan_net_frag_init()
> >   ipv6: export ip6 fragments sysctl to unprivileged users
> >   rhashtable: add schedule points
> >   inet: frags: use rhashtables for reassembly units
> >   inet: frags: remove some helpers
> >   inet: frags: get rif of inet_frag_evicting()
> >   inet: frags: remove inet_frag_maybe_warn_overflow()
> >   inet: frags: break the 2GB limit for frags storage
> >   inet: frags: do not clone skb in ip_expire()
> >   ipv6: frags: rewrite ip6_expire_frag_queue()
> >   rhashtable: reorganize struct rhashtable layout
> >   inet: frags: reorganize struct netns_frags
> >   inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
> >   inet: frags: fix ip6frag_low_thresh boundary
> >   net: speed up skb_rbtree_purge()
> >   net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
> >   net: add rb_to_skb() and other rb tree helpers
> >
> > Florian Westphal (1):
> >   ipv6: defrag: drop non-last frags smaller than min mtu
> >
> > Peter Oskolkov (5):
> >   ip: discard IPv4 datagrams with overlapping segments.
> >   net: modify skb_rbtree_purge to return the truesize of all purged
> >     skbs.
> >   ip: use rb trees for IP frag queue.
> >   ip: add helpers to process in-order fragments faster.
> >   ip: process in-order fragments efficiently
> >
> > Taehee Yoo (1):
> >   ip: frags: fix crash in ip_do_fragment()
> >
> >  Documentation/networking/ip-sysctl.txt |  13 +-
> >  include/linux/rhashtable.h             |   4 +-
> >  include/linux/skbuff.h
Re: [PATCH stable 4.9 v2 00/29] backport of IP fragmentation fixes
On 10/10/2018 12:29 PM, Florian Fainelli wrote:
> This is based on Stephen's v4.14 patches, with the necessary merge
> conflicts, and the lack of timer_setup() on the 4.9 baseline.
>
> Perf results on a gigabit capable system, before and after are below.
>
> Series can also be found here:
>
> https://github.com/ffainelli/linux/commits/fragment-stack-v4.9-v2
>
> Changes in v2:
>
> - drop "net: sk_buff rbnode reorg"
> - added original "ip: use rb trees for IP frag queue." commit

Eric, does this look reasonable to you?

>
> Before patches:
>
> PerfTop: 180 irqs/sec kernel:78.9% exact: 0.0% [4000Hz cycles:ppp], (all, 4 CPUs)
> ---
>
>     34.81%  [kernel]  [k] ip_defrag
>      4.57%  [kernel]  [k] arch_cpu_idle
>      2.09%  [kernel]  [k] fib_table_lookup
>      1.74%  [kernel]  [k] finish_task_switch
>      1.57%  [kernel]  [k] v7_dma_inv_range
>      1.47%  [kernel]  [k] __netif_receive_skb_core
>      1.06%  [kernel]  [k] __slab_free
>      1.04%  [kernel]  [k] __netdev_alloc_skb
>      0.99%  [kernel]  [k] ip_route_input_noref
>      0.96%  [kernel]  [k] dev_gro_receive
>      0.96%  [kernel]  [k] tick_nohz_idle_enter
>      0.93%  [kernel]  [k] bcm_sysport_poll
>      0.92%  [kernel]  [k] skb_release_data
>      0.91%  [kernel]  [k] __memzero
>      0.90%  [kernel]  [k] __free_page_frag
>      0.87%  [kernel]  [k] ip_rcv
>      0.77%  [kernel]  [k] eth_type_trans
>      0.71%  [kernel]  [k] _raw_spin_unlock_irqrestore
>      0.68%  [kernel]  [k] tick_nohz_idle_exit
>      0.65%  [kernel]  [k] bcm_sysport_rx_refill
>
> After patches:
>
> PerfTop: 214 irqs/sec kernel:80.4% exact: 0.0% [4000Hz cycles:ppp], (all, 4 CPUs)
> ---
>
>      6.61%  [kernel]  [k] arch_cpu_idle
>      3.77%  [kernel]  [k] ip_defrag
>      3.65%  [kernel]  [k] v7_dma_inv_range
>      3.18%  [kernel]  [k] fib_table_lookup
>      3.04%  [kernel]  [k] __netif_receive_skb_core
>      2.31%  [kernel]  [k] finish_task_switch
>      2.31%  [kernel]  [k] _raw_spin_unlock_irqrestore
>      1.65%  [kernel]  [k] bcm_sysport_poll
>      1.63%  [kernel]  [k] ip_route_input_noref
>      1.63%  [kernel]  [k] __memzero
>      1.58%  [kernel]  [k] __netdev_alloc_skb
>      1.47%  [kernel]  [k] tick_nohz_idle_enter
>      1.40%  [kernel]  [k] __slab_free
>      1.32%  [kernel]  [k] ip_rcv
>      1.32%  [kernel]  [k] __softirqentry_text_start
>      1.30%  [kernel]  [k] dev_gro_receive
>      1.23%  [kernel]  [k] bcm_sysport_rx_refill
>      1.11%  [kernel]  [k] tick_nohz_idle_exit
>      1.06%  [kernel]  [k] memcmp
>      1.02%  [kernel]  [k] dma_cache_maint_page
>
>
> Dan Carpenter (1):
>   ipv4: frags: precedence bug in ip_expire()
>
> Eric Dumazet (21):
>   inet: frags: change inet_frags_init_net() return value
>   inet: frags: add a pointer to struct netns_frags
>   inet: frags: refactor ipfrag_init()
>   inet: frags: refactor ipv6_frag_init()
>   inet: frags: refactor lowpan_net_frag_init()
>   ipv6: export ip6 fragments sysctl to unprivileged users
>   rhashtable: add schedule points
>   inet: frags: use rhashtables for reassembly units
>   inet: frags: remove some helpers
>   inet: frags: get rif of inet_frag_evicting()
>   inet: frags: remove inet_frag_maybe_warn_overflow()
>   inet: frags: break the 2GB limit for frags storage
>   inet: frags: do not clone skb in ip_expire()
>   ipv6: frags: rewrite ip6_expire_frag_queue()
>   rhashtable: reorganize struct rhashtable layout
>   inet: frags: reorganize struct netns_frags
>   inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
>   inet: frags: fix ip6frag_low_thresh boundary
>   net: speed up skb_rbtree_purge()
>   net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
>   net: add rb_to_skb() and other rb tree helpers
>
> Florian Westphal (1):
>   ipv6: defrag: drop non-last frags smaller than min mtu
>
> Peter Oskolkov (5):
>   ip: discard IPv4 datagrams with overlapping segments.
>   net: modify skb_rbtree_purge to return the truesize of all purged
>     skbs.
>   ip: use rb trees for IP frag queue.
>   ip: add helpers to process in-order fragments faster.
>   ip: process in-order fragments efficiently
>
> Taehee Yoo (1):
>   ip: frags: fix crash in ip_do_fragment()
>
>  Documentation/networking/ip-sysctl.txt |  13 +-
>  include/linux/rhashtable.h             |   4 +-
>  include/linux/skbuff.h                 |  34 +-
>  include/net/inet_frag.h                | 133 +++---
>  include/net/ip.h                       |   1 -
>  include/net/ipv6.h                     |  26 +-
>  include/uapi/linux/snmp.h              |   1 +
>  lib/rhashtable.c                       |   5 +-
>  net/core/skbuff.c
[PATCH stable 4.9 v2 00/29] backport of IP fragmentation fixes
This is based on Stephen's v4.14 patches, with the necessary merge
conflicts, and the lack of timer_setup() on the 4.9 baseline.

Perf results on a gigabit capable system, before and after are below.

Series can also be found here:

https://github.com/ffainelli/linux/commits/fragment-stack-v4.9-v2

Changes in v2:

- drop "net: sk_buff rbnode reorg"
- added original "ip: use rb trees for IP frag queue." commit

Before patches:

PerfTop: 180 irqs/sec kernel:78.9% exact: 0.0% [4000Hz cycles:ppp], (all, 4 CPUs)
---

    34.81%  [kernel]  [k] ip_defrag
     4.57%  [kernel]  [k] arch_cpu_idle
     2.09%  [kernel]  [k] fib_table_lookup
     1.74%  [kernel]  [k] finish_task_switch
     1.57%  [kernel]  [k] v7_dma_inv_range
     1.47%  [kernel]  [k] __netif_receive_skb_core
     1.06%  [kernel]  [k] __slab_free
     1.04%  [kernel]  [k] __netdev_alloc_skb
     0.99%  [kernel]  [k] ip_route_input_noref
     0.96%  [kernel]  [k] dev_gro_receive
     0.96%  [kernel]  [k] tick_nohz_idle_enter
     0.93%  [kernel]  [k] bcm_sysport_poll
     0.92%  [kernel]  [k] skb_release_data
     0.91%  [kernel]  [k] __memzero
     0.90%  [kernel]  [k] __free_page_frag
     0.87%  [kernel]  [k] ip_rcv
     0.77%  [kernel]  [k] eth_type_trans
     0.71%  [kernel]  [k] _raw_spin_unlock_irqrestore
     0.68%  [kernel]  [k] tick_nohz_idle_exit
     0.65%  [kernel]  [k] bcm_sysport_rx_refill

After patches:

PerfTop: 214 irqs/sec kernel:80.4% exact: 0.0% [4000Hz cycles:ppp], (all, 4 CPUs)
---

     6.61%  [kernel]  [k] arch_cpu_idle
     3.77%  [kernel]  [k] ip_defrag
     3.65%  [kernel]  [k] v7_dma_inv_range
     3.18%  [kernel]  [k] fib_table_lookup
     3.04%  [kernel]  [k] __netif_receive_skb_core
     2.31%  [kernel]  [k] finish_task_switch
     2.31%  [kernel]  [k] _raw_spin_unlock_irqrestore
     1.65%  [kernel]  [k] bcm_sysport_poll
     1.63%  [kernel]  [k] ip_route_input_noref
     1.63%  [kernel]  [k] __memzero
     1.58%  [kernel]  [k] __netdev_alloc_skb
     1.47%  [kernel]  [k] tick_nohz_idle_enter
     1.40%  [kernel]  [k] __slab_free
     1.32%  [kernel]  [k] ip_rcv
     1.32%  [kernel]  [k] __softirqentry_text_start
     1.30%  [kernel]  [k] dev_gro_receive
     1.23%  [kernel]  [k] bcm_sysport_rx_refill
     1.11%  [kernel]  [k] tick_nohz_idle_exit
     1.06%  [kernel]  [k] memcmp
     1.02%  [kernel]  [k] dma_cache_maint_page

Dan Carpenter (1):
  ipv4: frags: precedence bug in ip_expire()

Eric Dumazet (21):
  inet: frags: change inet_frags_init_net() return value
  inet: frags: add a pointer to struct netns_frags
  inet: frags: refactor ipfrag_init()
  inet: frags: refactor ipv6_frag_init()
  inet: frags: refactor lowpan_net_frag_init()
  ipv6: export ip6 fragments sysctl to unprivileged users
  rhashtable: add schedule points
  inet: frags: use rhashtables for reassembly units
  inet: frags: remove some helpers
  inet: frags: get rif of inet_frag_evicting()
  inet: frags: remove inet_frag_maybe_warn_overflow()
  inet: frags: break the 2GB limit for frags storage
  inet: frags: do not clone skb in ip_expire()
  ipv6: frags: rewrite ip6_expire_frag_queue()
  rhashtable: reorganize struct rhashtable layout
  inet: frags: reorganize struct netns_frags
  inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
  inet: frags: fix ip6frag_low_thresh boundary
  net: speed up skb_rbtree_purge()
  net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
  net: add rb_to_skb() and other rb tree helpers

Florian Westphal (1):
  ipv6: defrag: drop non-last frags smaller than min mtu

Peter Oskolkov (5):
  ip: discard IPv4 datagrams with overlapping segments.
  net: modify skb_rbtree_purge to return the truesize of all purged
    skbs.
  ip: use rb trees for IP frag queue.
  ip: add helpers to process in-order fragments faster.
  ip: process in-order fragments efficiently

Taehee Yoo (1):
  ip: frags: fix crash in ip_do_fragment()

 Documentation/networking/ip-sysctl.txt |  13 +-
 include/linux/rhashtable.h             |   4 +-
 include/linux/skbuff.h                 |  34 +-
 include/net/inet_frag.h                | 133 +++---
 include/net/ip.h                       |   1 -
 include/net/ipv6.h                     |  26 +-
 include/uapi/linux/snmp.h              |   1 +
 lib/rhashtable.c                       |   5 +-
 net/core/skbuff.c                      |  31 +-
 net/ieee802154/6lowpan/6lowpan_i.h     |  26 +-
 net/ieee802154/6lowpan/reassembly.c    | 148 +++---
 net/ipv4/inet_fragment.c               | 379
 net/ipv4/ip_fragment.c                 | 573 +---
 net/ipv4/proc.c                        |   7 +-
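
For anyone wanting to compare the two perf top snapshots from the cover letter side by side, here is a minimal sketch in Python. It only parses the "NN.NN% [kernel] [k] symbol" lines and reports the change in sample share per symbol. The helper names (parse_perf_top, compare) are hypothetical and not part of any perf tooling, and only three rows from each snapshot are included to keep it short. Note that these figures are shares of sampled cycles, and the two runs report different interrupt rates (180 vs 214 irqs/sec), so the deltas are probably best read as a rough indication rather than a throughput measurement.

```python
import re

# A few rows copied from the "Before patches" and "After patches"
# perf top output in the cover letter.
BEFORE = """\
34.81% [kernel] [k] ip_defrag
4.57% [kernel] [k] arch_cpu_idle
2.09% [kernel] [k] fib_table_lookup
"""
AFTER = """\
6.61% [kernel] [k] arch_cpu_idle
3.77% [kernel] [k] ip_defrag
3.18% [kernel] [k] fib_table_lookup
"""

# Matches lines such as "34.81% [kernel] [k] ip_defrag".
LINE_RE = re.compile(r"^\s*([\d.]+)%\s+\[kernel\]\s+\[k\]\s+(\S+)\s*$")


def parse_perf_top(text):
    """Hypothetical helper: map symbol name to its share of samples (percent)."""
    shares = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            shares[match.group(2)] = float(match.group(1))
    return shares


def compare(before, after):
    """Return (symbol, before, after, delta) for symbols present in both runs."""
    return [
        (sym, before[sym], after[sym], after[sym] - before[sym])
        for sym in sorted(before.keys() & after.keys())
    ]


before = parse_perf_top(BEFORE)
after = parse_perf_top(AFTER)
for sym, b, a, delta in compare(before, after):
    print(f"{sym:<20} {b:6.2f}% -> {a:6.2f}%  ({delta:+.2f} points)")
```

Symbols that appear in only one snapshot (for example skb_release_data before, or memcmp after) are skipped by this sketch, since a missing row only means the symbol fell below the displayed cutoff, not that its share was zero.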