On Mon, Jun 25, 2018 at 05:26:02PM +0200, Florian Westphal wrote:
> Kristian Evensen says:
> In a project I am involved in, we are running ipsec (Strongswan) on
> different mt7621-based routers. Each router is configured as an
> initiator and has around ~30 tunnels to different responders (running
> on misc. devices). Before the flow cache was removed (kernel 4.9), we
> got a combined throughput of around 70Mbit/s for all tunnels on one
> router. However, we recently switched to kernel 4.14 (4.14.48), and
> the total throughput is somewhere around 57Mbit/s (best-case). I.e., a
> drop of around 20%. Reverting the flow cache removal restores, as
> expected, performance levels to that of kernel 4.9.
>
> When pcpu xdst exists, it has to be validated first before it can be
> used.
>
> A negative hit thus increases cost vs. no-cache.
>
> As number of tunnels increases, hit rate decreases so this pcpu caching
> isn't a viable strategy.
>
> Furthermore, the xdst cache also needs to run with BH off, so when
> removing this the bh disable/enable pairs can be removed too.
>
> Kristian tested a 4.14.y backport of this change and reported
> increased performance:
>
> In our tests, the throughput reduction has been reduced from around -20%
> to -5%. We also see that the overall throughput is independent of the
> number of tunnels, while before the throughput was reduced as the number
> of tunnels increased.
>
> Reported-by: Kristian Evensen <kristian.even...@gmail.com>
> Signed-off-by: Florian Westphal <f...@strlen.de>

Applied to ipsec-next, thanks a lot!