On Fri, 10 Apr 2015 15:47:34 +0200
Daniel Borkmann dan...@iogearbox.net wrote:
On 04/10/2015 02:15 PM, Pablo Neira Ayuso wrote:
This patch splits __netif_receive_skb_core() in smaller functions to improve
maintainability.
The function __netif_receive_skb_core() has been split in two:
* __netif_receive_skb_ingress(), to perform all actions up to
ingress filtering.
* __netif_receive_skb_finish(), if the ingress filter accepts this
packet, pass it to the corresponding packet_type function handler for
further
processing.
This patch also adds __NET_RX_ANOTHER_ROUND that is used when the packet is
stripped off from the vlan header or in case the rx_handler needs it.
This also prepares the introduction of the netfilter ingress hook.
Out of curiosity, what is actually the performance impact on all
of this? We were just arguing on a different matter on two more
instructions in the fast-path, here it's refactoring the whole
function into several ones, I presume gcc won't inline it.
Pablo asked me to performance test this change. Full test report below.
The performance effect (of this patch) depend on the Gcc compiler
version.
Two tests:
1. IP-forwarding (unloaded netfilter modules)
2. Early drop in iptables raw table
With GCC 4.4.7, which does not inline the new functions
(__netif_receive_skb_ingress and __netif_receive_skb_finish) the
performance impact/regression is definitly measurable.
With GCC 4.4.7:
1. IP-forwarding: +25.18 ns (slower) (-27776 pps)
2. Early-drop : +7.55 ns (slower) (-66577 pps)
With GCC 4.9.1, the new functions gets inlined, thus the refactor
splitup of __netif_receive_skb_core() is basically cancled.
Strangly there is a small improvement for forwarding, likely due to
some lucky assember reordering that give less icache/fetch-misses.
The early-drop improvement is below accuracy levels, can cannot be
trusted.
With GCC 4.9.1:
1. IP-forwarding: -10.05ns (faster) (+17532 pps)
2. Early-drop : -1.54ns (faster) (+16216 pps) below accuracy levels
I don't know what to conclude, as the result depend on the compiler
version... but these kind of change do affect performance, and should
be tested/measured.
- --
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
Quick eval of Pablo's refactor of __netif_receive_skb_core
==
:Version: 0.1
:Author: Jesper Dangaard Brouer
Summary
===
Pablo is refactoring __netif_receive_skb_core() into several
functions, to allow for some other upcomming changes.
Daniel Borkmann question if this will affect performance.
Jesper tests this.
The performance effect (of this patch) depend on the Gcc compiler
version.
Two tests:
1. IP-forwarding (unloaded netfilter modules)
2. Early drop in iptables raw table
With GCC 4.4.7, which does not inline the new functions
(__netif_receive_skb_ingress and __netif_receive_skb_finish) the
performance impact/regression is definitly measurable.
With GCC 4.4.7:
1. IP-forwarding: +25.18 ns (slower) (-27776 pps)
2. Early-drop : +7.55 ns (slower) (-66577 pps)
With GCC 4.9.1, the new functions gets inlined, thus the refactor
splitup of __netif_receive_skb_core() is basically cancled.
Strangly there is a small improvement for forwarding, likely due to
some lucky assember reordering that give less icache/fetch-misses.
The early-drop improvement is below accuracy levels, can cannot be
trusted.
With GCC 4.9.1:
1. IP-forwarding: -10.05ns (faster) (+17532 pps)
2. Early-drop : -1.54ns (faster) (+16216 pps) below accuracy levels
Setup
=
On host: ivy
Host ivy is the sink or DUT (Device Under Test).
* CPU E5-2695ES @ 2.80GHz
netfilter_unload_modules.sh
netfilter_unload_modules.sh
sudo rmmod nf_reject_ipv4 nf_reject_ipv6
base_device_setup.sh eth4 # 10G sink/receiving interface (ixgbe)
base_device_setup.sh eth5
sudo ethtool --coalesce eth4 rx-usecs 30
Make a fake route to 198.18.0.0/15 out via eth5
sudo ip neigh add 192.168.21.66 dev eth5 lladdr 00:00:ba:d0:ba:d0
sudo ip route add 198.18.0.0/15 via 192.168.21.66 dev eth5
Disable power saving to get more accurate measurements (see blogpost)::
$ sudo tuned-adm active
Current active profile: latency-performance
Service tuned: enabled, running
Service ktune: enabled, running
Early drop in raw
-
alias iptables='sudo iptables'
iptables -t raw -N simple || iptables -t raw -F simple
iptables -t raw -I simple -d 198.18.0.0/15 -j DROP
iptables -t raw -D PREROUTING -j simple
iptables -t raw -I PREROUTING -j simple
On host: dragon
---
Host dragon is the packet generator.
* 2x CPU E5-2630 0 @ 2.30GHz
Generator NIC: eth8 - ixgbe 10G
netfilter_unload_modules.sh
netfilter_unload_modules.sh
sudo rmmod nf_reject_ipv4 nf_reject_ipv6
base_device_setup.sh eth8 # 10G generator interface