Re: [PATCH ipsec-next 00/11] xfrm: policy: add inexact policy search tree

2018-11-13 Thread Steffen Klassert
On Thu, Nov 08, 2018 at 07:00:14PM -0800, David Miller wrote:
> From: Florian Westphal 
> Date: Wed,  7 Nov 2018 23:00:30 +0100
> 
> > This series attempts to improve xfrm policy lookup performance when
> > a lot of (several hundred or even thousands) inexact policies exist
> > on a system.
> > 
> > On insert, a policy is either placed in hash table (all direct (/32 for
> > ipv4, /128 policies, or all policies matching a user-configured threshold).
> > All other policies get inserted into inexact list as per priority.
> > 
> > Lookup then scans inexact list for first matching entry.
> > 
> > This series instead makes it so that inexact policy is added to exactly
> > one of four different search list classes.
> > 
> > 1. "Any:Any" list, containing policies where both saddr and daddr are
> >wildcards or have very coarse prefixes, e.g. 10.0.0.0/8 and the like.
> > 2. "saddr:any" list, containing policies with a fixed saddr/prefixlen,
> >but without destination restrictions.
> >These lists are stored in rbtree nodes; each node contains those
> >policies matching saddr/prefixlen.
> > 3. "Any:daddr" list. Similar to 2), except for policies where only the
> >destinations are specified.
> > 4. "saddr:daddr" lists, containing policies that match the given
> >source/destination network.
> > 
> >The root of the saddr/daddr tree is stored in the nodes of the
> >'daddr' tree.
> ...
> > Comments or questions welcome.
> 
> Acked-by: David S. Miller 

This is now applied to ipsec-next, thanks a lot
for your work Florian!


Re: [PATCH ipsec-next 00/11] xfrm: policy: add inexact policy search tree

2018-11-08 Thread David Miller
From: Florian Westphal 
Date: Wed,  7 Nov 2018 23:00:30 +0100

> This series attempts to improve xfrm policy lookup performance when
> a lot of (several hundred or even thousands) inexact policies exist
> on a system.
> 
> On insert, a policy is either placed in hash table (all direct (/32 for
> ipv4, /128 policies, or all policies matching a user-configured threshold).
> All other policies get inserted into inexact list as per priority.
> 
> Lookup then scans inexact list for first matching entry.
> 
> This series instead makes it so that inexact policy is added to exactly
> one of four different search list classes.
> 
> 1. "Any:Any" list, containing policies where both saddr and daddr are
>wildcards or have very coarse prefixes, e.g. 10.0.0.0/8 and the like.
> 2. "saddr:any" list, containing policies with a fixed saddr/prefixlen,
>but without destination restrictions.
>These lists are stored in rbtree nodes; each node contains those
>policies matching saddr/prefixlen.
> 3. "Any:daddr" list. Similar to 2), except for policies where only the
>destinations are specified.
> 4. "saddr:daddr" lists, containing policies that match the given
>source/destination network.
> 
>The root of the saddr/daddr tree is stored in the nodes of the
>'daddr' tree.
...
> Comments or questions welcome.

Acked-by: David S. Miller 

This looks really great.  Nice work Florian.


[PATCH ipsec-next 00/11] xfrm: policy: add inexact policy search tree

2018-11-07 Thread Florian Westphal
This series attempts to improve xfrm policy lookup performance when
a lot of (several hundred or even thousands) inexact policies exist
on a system.

On insert, a policy is either placed in hash table (all direct (/32 for
ipv4, /128 policies, or all policies matching a user-configured threshold).
All other policies get inserted into inexact list as per priority.

Lookup then scans inexact list for first matching entry.

This series instead makes it so that inexact policy is added to exactly
one of four different search list classes.

1. "Any:Any" list, containing policies where both saddr and daddr are
   wildcards or have very coarse prefixes, e.g. 10.0.0.0/8 and the like.
2. "saddr:any" list, containing policies with a fixed saddr/prefixlen,
   but without destination restrictions.
   These lists are stored in rbtree nodes; each node contains those
   policies matching saddr/prefixlen.
3. "Any:daddr" list. Similar to 2), except for policies where only the
   destinations are specified.
4. "saddr:daddr" lists, containing policies that match the given
   source/destination network.

   The root of the saddr/daddr tree is stored in the nodes of the
   'daddr' tree.

This diagram illustrates the list classes, and their
placement in the lookup hierarchy:

xfrm_pol_inexact_bin = hash(dir,type,family,if_id);
 |
 + root_d: sorted by daddr:prefix
 | |
 |xfrm_pol_inexact_node
 | |
 | +- root: sorted by saddr/prefix
 | |  |
 | | xfrm_pol_inexact_node
 | |  |
 | |  + root: unused
 | |  |
 | |  + hhead: saddr:daddr policies
 | |
 | +- coarse policies and all any:daddr policies
 |
 + root_s: sorted by saddr:prefix
 | |
 |xfrm_pol_inexact_node
 | |
 | + root: unused
 | |
 | + hhead: saddr:any policies
 |
 + coarse policies and all any:any policies

First we obtain inexact bin by hashing direction, family and other 'must
match' criteria.

Next, we search root_d using packets' destination ip address.
If we find a matching node, we search that nodes' source address tree
using packets src address.

We also search the bins root_s tree using packets source address.

This initial lookup now has created a 'candidate set' consisting
of up to four hlist heads to the four search classes.

Each list gets scanned for first matching entry.
Out of those up the one with the highest priority is chosen.
In case several policies had same priority, most recent one is used to
match old behaviour.

Locking rules:
insert requires net->xfrm.xfrm_policy_lock spinlock held + BH off
(no change to current kernel).
Insert/removal of a tree node needs increment of its sequence count.

lookup requires rcu read lock, lookups in a tree need to sample
the sequence count of the bin that tree is located in as well and
re-try in case it has changed.

Results on my kvm test setup, using 4 netns:

# 1.2   1.1   3.1  3.102.1   2.2
# eth1  eth1 veth0 veth0 eth1   eth1
# ns1  ns3 - ns4  ns2

ns3 and ns4 are connected via ipsec tunnel.
'Dummy policies' below means that i created 1k non-matching
policies in both ns3 and ns4.

20 parallel netperf (mbits, unidirectional TCP_STREAM):
normal:
993.518000 (no dummy policies)
796.32 (with dummy policies)

patched:
991.335 (no dummy policies)
989.684 (with dummy policies)

20 parallel TCP_RR, trans. per second:
normal:
32413.93 (no dummy policies)
17725.97 (with dummy policies)

patched:
32314.51 (no dummy policies)
32326.19 (with dummy policies)

caveats:

1. Won't work unless large part of policies have no common saddr/daddr pairs.
   Extreme example: adding 10k 0.0.0.0/0 -> 0.0.0.0/0
   policies may require search of all policies, just like current kernel.

2. The policy add sequence
 a daddr 10.0.0.0/26
 b daddr 10.0.0.64/26
 c daddr 10.0.0.0/24

When c gets added, search nodes for a and b have to be merged with
the new subnet, as it contains both.

This also means that a single policy rule with a small subnet value
can cause severe tree degradation and thus longer search lists.

3. Causes (small) performance degradation when setup only has few policies.
4. Policies need an additional hlist, as MIGRATE doesn't consider if_id,
   but the patchset makes the if_id part of the initial hash lookup.
   Could be avoided by not considering if_id as part of initial bin lookup,
   but then increases list length a lot when we have lots of xfrm interfaces.
5. In oder to solve 'two matching candidates have same priority, which
   takes precedence?' problem I added a number to xfrm_policy struct that
   maps to its index in the complete inexact list.
   This was necessary to make sure same sequence of policy-add commands
   keeps on returning same lookup