On Mon, 18 Mar 2002, Patrick Schaaf wrote:

> Hello Martin,
> 
> thanks for your numbers.
> 
> > with 16384 hashbuckets and a maximum of 131072 tracked connections it took
> > 7.5 seconds to perform 1 million lookups in the hashtable (using
> > __ip_conntrack_find from userspace).
> 
> Was this 1 million lookups to a random hash bucket each, with guaranteed
> "no match"? Assuming this in the following...

yes that was a lookup of a connection not present in the conntrack.

> That's 8 connections per bucket, resulting in 8 million "pointer chasings"
> during table lookup, i.e. about 937ns per chasing.

and compare of tuple...

> > with the same number of hashbuckets (16384) but with a maximum of 262144
> > tracked connections (131072 * 2) it took just over 12 seconds to perform
> > the same test (1 million lookups).
> 
> Now you have an average chain length of 16, for 16 million chasings total,
> and 750ns per chasing - probably a bit better (per chasing, not in reality)
> because the hash array cache line load amortizes over the list length.
>
> > and then I doubled the hashsize to 32768 and left the maximum number of
> > tracked connections at 262144 and performed the test again.
> > 1 million lookups took 7.5 seconds again despite that I was tracking twice
> > as many connections as before.
> 
> You are back again at an average chain length of 8, so this is not surprising.

I know, it was just to demonstrate.

> For nothing but pointer chasing, I'd expect about 250ns (main memory RTT)
> for a single chasing, so your numbers indicate a pretty large "testing
> overhead". Maybe a different, non-userlevel testing method would be better.

This is from userspace and via netlink so I'm sure the overhead in these
tests are a lot higher than the lookup for actual packets in the kernel.

the codepath is something like this:

userspace: program -> libctnetlink
->
kernel: netlink -> ctnetlink -> __ip_conntrack_find

and then the way back with a response so I assume the overhead is
higher then the actual lookups.

In my program I have a memset() in the loop and ctnetlink allocated a new
skb to send back as a result to userspace.


I could probably write a small kernelmodule that could perform the lookups
and measure the time for a large number of lookups.

I expect the lookups to be quite fast if the chain length isn't too long
as all we do is a hash of the tuple and then go do a linear search in the
hashbucket. Conntrack uses the standard LIST_FIND macro that's used all
over the kernel and that should be quite fast. 

/Martin

Never argue with an idiot. They drag you down to their level, then beat you with 
experience.


Reply via email to