Hi all,

I investigated a bit further and traced the performance degradation back to Linux iptables (nat):

- Using routing only, I can push 930 Mbit/s through my router.
- Calling iptables -t nat -vnL causes the kernel to load the netfilter modules, which is when the performance hit appears.
- No NAT rules are present; merely loading the modules makes it reproducible.
- ksoftirqd uses around 15% CPU without the iptables modules loaded and maxes out at 100% with them loaded.
- I also tested nft; it performs slightly better, but the difference is within the margin of error.
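In case it helps with reproducing or working around this, here is a rough sketch of how one might check which netfilter modules the iptables call pulled in and keep forwarded traffic out of connection tracking; module names and removal order differ between kernels, and whether NOTRACK is acceptable of course depends on not needing NAT or stateful filtering at all:

  # see which conntrack/NAT modules got loaded
  lsmod | grep -E 'nf_|ipt_|iptable_'

  # option 1: unload the NAT/conntrack modules again
  # (dependent modules first; exact names vary by kernel version)
  modprobe -r iptable_nat nf_nat_ipv4 nf_nat
  modprobe -r nf_conntrack_ipv4 nf_conntrack

  # option 2: keep the modules loaded but exempt traffic from
  # connection tracking via the raw table, so the nf_conntrack_in /
  # tcp_in_window work is skipped for forwarded packets
  iptables -t raw -A PREROUTING -j CT --notrack
  iptables -t raw -A OUTPUT -j CT --notrack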
I am not sure yet whether this is more of an e1000 mailing list topic or more of a netfilter thing. I also tried LEDE and pfSense, which both deliver around 600 Mbit/s in both directions without any optimization; however, I want to stay on Ubuntu because of my miniPCI Wi-Fi cards. Any pointers are greatly appreciated.

I have created two perf top profiles, which show significant time spent in ipt_do_table and tcp_in_window:

1) Before executing iptables -vnL -t nat (930 Mbit/s)

  Overhead  Shared Object  Symbol
   8.76%    [kernel]       [k] acpi_idle_do_entry
   2.71%    [kernel]       [k] igb_xmit_frame_ring
   2.16%    [kernel]       [k] igb_clean_rx_irq
   2.13%    [kernel]       [k] irq_entries_start
   1.81%    [kernel]       [k] ipt_do_table
   1.60%    [kernel]       [k] fib_table_lookup
   1.56%    [kernel]       [k] __build_skb
   1.45%    [kernel]       [k] memcpy
   1.43%    [kernel]       [k] igb_poll
   1.23%    [kernel]       [k] tcp_in_window
   1.21%    [kernel]       [k] __netif_receive_skb_core
   1.12%    [kernel]       [k] dev_gro_receive
   1.07%    [kernel]       [k] menu_select
   1.06%    [kernel]       [k] __slab_free
   1.00%    [kernel]       [k] inet_gro_receive
   0.98%    [kernel]       [k] __skb_flow_dissect
   0.95%    [kernel]       [k] ip_route_input_noref
   0.94%    [kernel]       [k] __dev_queue_xmit
   0.93%    [kernel]       [k] ip_forward
   0.90%    [kernel]       [k] tcp_gro_receive
   0.88%    [kernel]       [k] kmem_cache_alloc
   0.86%    [kernel]       [k] nf_conntrack_in
   0.86%    [kernel]       [k] eth_get_headlen
   0.82%    [kernel]       [k] ip_rcv
   0.79%    [kernel]       [k] int_sqrt
   0.69%    [kernel]       [k] skb_release_data
   0.68%    [kernel]       [k] tcp_packet
   0.68%    [kernel]       [k] read_tsc
   0.64%    [kernel]       [k] __alloc_skb
   0.60%    [kernel]       [k] skb_release_head_state
   0.59%    [kernel]       [k] skb_segment
   0.59%    [kernel]       [k] kmem_cache_free
   0.56%    [kernel]       [k] __kmalloc_node_track_caller
   0.55%    [kernel]       [k] ip_finish_output2
   0.53%    [kernel]       [k] fib_validate_source
   0.53%    [kernel]       [k] __dev_kfree_skb_any
   0.50%    [kernel]       [k] tcp_error

2) After executing iptables -vnL -t nat (400 Mbit/s)

  Overhead  Shared Object  Symbol
   4.42%    [kernel]       [k] ipt_do_table
   3.23%    [kernel]       [k] tcp_in_window
   3.07%    [kernel]       [k] igb_xmit_frame_ring
   3.03%    [kernel]       [k] __netif_receive_skb_core
   2.71%    [kernel]       [k] fib_table_lookup
   2.41%    [kernel]       [k] ip_route_input_noref
   2.41%    [kernel]       [k] acpi_idle_do_entry
   2.28%    [kernel]       [k] ip_forward
   2.23%    [kernel]       [k] __dev_queue_xmit
   2.10%    [kernel]       [k] nf_conntrack_in
   2.08%    [kernel]       [k] dev_gro_receive
   1.98%    [kernel]       [k] ip_rcv
   1.65%    [kernel]       [k] tcp_packet
   1.58%    [kernel]       [k] igb_clean_rx_irq
   1.55%    [kernel]       [k] memcpy
   1.52%    [kernel]       [k] __skb_flow_dissect
   1.45%    [kernel]       [k] fib_validate_source
   1.33%    [kernel]       [k] inet_gro_receive
   1.29%    [kernel]       [k] ip_finish_output2
   1.25%    [kernel]       [k] ip_rcv_finish
   1.15%    [kernel]       [k] sch_direct_xmit
   1.12%    [kernel]       [k] br_dev_xmit
   1.09%    [kernel]       [k] __skb_get_hash
   1.09%    [kernel]       [k] tcp_error
   1.08%    [kernel]       [k] nf_hook_slow
   1.04%    [kernel]       [k] __nf_conntrack_find_get
   1.02%    [kernel]       [k] __local_bh_enable_ip
   0.97%    [kernel]       [k] kmem_cache_alloc
   0.95%    [kernel]       [k] __build_skb
   0.91%    [kernel]       [k] nf_iterate
   0.87%    [kernel]       [k] ip_finish_output
   0.80%    [kernel]       [k] nf_nat_packet
   0.79%    [kernel]       [k] dev_hard_start_xmit
   0.77%    [kernel]       [k] netif_skb_features
   0.77%    [kernel]       [k] irq_entries_start
   0.76%    [kernel]       [k] nf_nat_ipv4_fn
   0.73%    [kernel]       [k] __slab_free
   0.72%    [kernel]       [k] tcp_v4_early_demux
   0.68%    [kernel]       [k] netdev_pick_tx
   0.63%    [kernel]       [k] __inet_lookup_established
   0.61%    [kernel]       [k] __br_fdb_get
   0.59%    [kernel]       [k] iptable_mangle_hook
   0.56%    [kernel]       [k] igb_poll
   0.55%    [kernel]       [k] swiotlb_map_page
   0.55%    [kernel]       [k] ip_output
   0.54%    [kernel]       [k] tcp_gro_receive
   0.52%    [kernel]       [k] inet_ehashfn
   0.51%    [kernel]       [k] __kmalloc_node_track_caller
   0.51%    [kernel]       [k] find_exception
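For anyone who wants to reproduce the profiles: the numbers above are from plain perf top while the traffic test was running. A recorded profile with call graphs can be captured with something like the following (the 10-second duration is arbitrary):

  # sample all CPUs with call graphs for ~10 seconds, then inspect
  perf record -a -g -- sleep 10
  perf report --sort symbol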
cheers
lIl

Sent: Sunday, 19 March 2017 at 13:08
From: "Lil Evil" <lil_e...@gmx.de>
To: e1000-devel@lists.sourceforge.net
Subject: [E1000-devel] Traffic (uni-directional) causes high ksoftirqd load on gigabit link - root cause analysis

Dear all,

I am trying to get to the bottom of a performance bottleneck I am experiencing that seems to be related to packet processing on my Linux router. Sending TCP packets from Windows or Mac through my Ubuntu 16.04 router causes ksoftirqd on the router to eat up 100% of a CPU core and caps the network throughput at around 400 Mbit/s. The reverse direction, or a Linux sender on the same PC, yields 900 Mbit/s with little utilization on the router. Using RSS and multiple TCP streams I could push 800 Mbit/s through with two cores fully utilized; that was as much as I could do on the router itself to improve throughput. I understand the i210/i211 only supports 2 receive queues. It also seems to be related to the TCP window size: below a window size of 48k in iperf3 the CPU load stays low, above that it jumps to 100% (see the iperf3 sketch further below).

What is causing the ksoftirqd CPU load in the sending direction coming from Mac/Windows? I would assume some re-ordering of large packets? Any pointers are very welcome! I can provide pcaps, but I haven't been able to spot any anomalies.

My test setup is as follows:

Mac / PC (Windows, Ubuntu Live) <--direct--> Linux router (3x Intel i210) <--direct cable--> NAS

The Linux router is a PC Engines board (AMD G series GX-412TC, 4x 1 GHz Jaguar cores, 4 GB RAM).

I have done multiple performance tests in different setups with iperf3 and identified that the high ksoftirqd load on the Linux router only appears if the sender is Windows or Mac.
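For reference, the window-size threshold can be probed with iperf3 runs along these lines (a sketch; <nas> stands for the receiving host and the -w values are just examples around the 48k mark):

  # single stream, small window: CPU load on the router stays low
  iperf3 -c <nas> -w 32k -t 30

  # single stream, larger window: ksoftirqd jumps to 100%
  iperf3 -c <nas> -w 64k -t 30

  # multiple parallel streams, to spread load across RX queues via RSS
  iperf3 -c <nas> -P 4 -t 30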
I have further successfully tested:

- Direct connection Mac/PC <-switch-> NAS can yield 900 Mbit/s in both directions.
- PC (Ubuntu Live) <-> Linux router <-> NAS can yield 900 Mbit/s in both directions.
- Speedtest.net: Mac/PC -> Linux router -> Internet (1 Gbit/s) yields 500 Mbit/s upload with ksoftirqd on the Linux router at 100% CPU load.
- The Mac can achieve 900 Mbit/s up/down when connected directly.
- I have not been able to get UDP Mac/PC <-switch-> NAS above 500 Mbit/s, so I have not tested UDP further due to the high error rate.
- FreeBSD on the same router hardware is able to sustain 900 Mbit/s up/down with little load.

No effect on the unidirectional load:

- Turning LRO/GRO on and off
- Disabling Energy Efficient Ethernet
- Different kernels (4.4.0-66 stock; tried various 3.6.xx and 4.8.x, now running 4.10.2)
- igb Intel driver 5.3.0-k, srcversion: 90ABA603B1D2A1415F2D301
- Different Linux congestion control algorithms
- InterruptThrottleRate on igb enabled/disabled
- Increased net.core.netdev_budget

lspci -v:

02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
        Subsystem: Intel Corporation I210 Gigabit Network Connection
        Flags: bus master, fast devsel, latency 0, IRQ 33
        Memory at fe600000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 1000 [size=32]
        Memory at fe620000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-0d-b9-xx-xx-xx-xx-xx
        Capabilities: [1a0] Transaction Processing Hints
        Kernel driver in use: igb

cat /proc/softirqs

                  CPU0       CPU1       CPU2       CPU3
        HI:          0          0          0          0
     TIMER:    4141656    3676729    8708615    3009349
    NET_TX:       8874      27600       5285       2013
    NET_RX:   11019938    9055919    2988197    3129611
     BLOCK:      45739      45952      46838      47832
  IRQ_POLL:          0          0          0          0
   TASKLET:    6303077    3902509    3918287    3987525
     SCHED:    1545958    1264065    4493394     798169
   HRTIMER:          0          0          0          0
       RCU:    2350434    2287515    3839434    1842554

cat /proc/interrupts

            CPU0       CPU1       CPU2       CPU3
 34:           1          1          0          0   PCI-MSI 1048576-edge   enp2s0
 35:     9556594          4          4          3   PCI-MSI 1048577-edge   enp2s0-TxRx-0
 36:           2    7584066          2          2   PCI-MSI 1048578-edge   enp2s0-TxRx-1
 37:           0          2    1499447          3   PCI-MSI 1048579-edge   enp2s0-TxRx-2
 38:           3          3          3    1657668   PCI-MSI 1048580-edge   enp2s0-TxRx-3

ethtool -x enp2s0

RX flow hash indirection table for enp2s0 with 4 RX ring(s):
    0:      0     1     0     1     0     1     0     1
    8:      0     1     0     1     0     1     0     1
   16:      0     1     0     1     0     1     0     1
   24:      0     1     0     1     0     1     0     1
   32:      0     1     0     1     0     1     0     1
   40:      0     1     0     1     0     1     0     1
   48:      0     1     0     1     0     1     0     1
   56:      0     1     0     1     0     1     0     1
   64:      0     1     0     1     0     1     0     1
   72:      0     1     0     1     0     1     0     1
   80:      0     1     0     1     0     1     0     1
   88:      0     1     0     1     0     1     0     1
   96:      0     1     0     1     0     1     0     1
  104:      0     1     0     1     0     1     0     1
  112:      0     1     0     1     0     1     0     1
  120:      0     1     0     1     0     1     0     1
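One thing that stands out in the output above: ethtool reports 4 RX rings, but the indirection table only spreads flows across rings 0 and 1. If the NIC/driver really does support four RSS queues, spreading the hash over all of them could be worth a try (a sketch, not a confirmed fix; interface name as in the output above):

  # show how many combined channels the driver exposes
  ethtool -l enp2s0

  # spread the RSS hash evenly across all 4 rings, then verify
  ethtool -X enp2s0 equal 4
  ethtool -x enp2s0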