On Mon, 18 Mar 2002, Aviv Bergman wrote:

> Hi all,
> 
> I'm working on a proxy-type program, using REDIRECT to catch (TCP) traffic,
> and I'm seeing severe network degradation above ~2000 connections.
> 
> (computer: 1Gb p3, 2Gb memory, kernel 2.4.18 + aa1 patch)
> 
> I've profiled the kernel and found that > 50% of the CPU time is in
> __ip_conntrack_find - is there a patch to make connection tracking use a
> more scalable data structure (as I understand it uses a list), or to
> improve its performance?

Then you must have long hashchains in the hashtable.
But that sounds weird as 2000 connections isn't much and the default
hashsize (number of hashbuckets) should be fairly large if you have 2GB
memory.

One thing I've noticed here is that even though I only have ~400
clients behind one router, it has 72,000 entries in the connection
tracking table. This is probably because of portscans and other stuff.

When you load the ip_conntrack module, it should print the number of
hashbuckets and the maximum number of connections that can be tracked
(that value is hashsize * 8, IIRC).

You can set the hashsize to a specific value when loading the module:

modprobe ip_conntrack hashsize=16384

That will give you 16384 hashbuckets and a maximum of 131072 tracked
connections.
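
As a quick sanity check of that arithmetic, here is a small Python sketch.
It assumes the "hashsize * 8" rule quoted above (which I only remember
IIRC); the names BUCKET_FACTOR and max_tracked are made up for
illustration and are not kernel symbols.

```python
# Assumption from this mail: max tracked connections = hashsize * 8 (IIRC).
# BUCKET_FACTOR and max_tracked are hypothetical names, not kernel symbols.
BUCKET_FACTOR = 8


def max_tracked(hashsize: int) -> int:
    """Return the maximum number of tracked connections for a given hashsize."""
    return hashsize * BUCKET_FACTOR


if __name__ == "__main__":
    for hashsize in (8192, 16384, 32768):
        print(f"hashsize={hashsize:6d} -> max tracked connections={max_tracked(hashsize)}")
```

For hashsize=16384 this gives the 131072 mentioned above.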

I did a small test when I was bugfixing ctnetlink (it works now, but it
still has a small SMP race) to see how long it took to do 1 million
lookups of a connection that wasn't present in the hashtable, with
different numbers of hashbuckets and different limits on tracked connections.
The tests were performed on a PIII 700 with 704MB RAM.
I filled the hashtable with a lot of connections by using a
packet generator that sends TCP requests to random IPs.


With 16384 hashbuckets and a maximum of 131072 tracked connections, it took
7.5 seconds to perform 1 million lookups in the hashtable (using
__ip_conntrack_find from userspace).

With the same number of hashbuckets (16384) but a maximum of 262144
tracked connections (131072 * 2), it took just over 12 seconds to perform
the same test (1 million lookups).

Then I doubled the hashsize to 32768, left the maximum number of
tracked connections at 262144, and performed the test again.
1 million lookups took 7.5 seconds again, even though I was tracking twice
as many connections as in the first test.
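
One way to read these numbers: a lookup for a connection that isn't in
the table has to walk the whole chain in its bucket, so the cost should
follow the average chain length (entries / buckets). The Python sketch
below just works that ratio out for the three runs. It assumes the table
was filled up to the maximum in each run and that the hash spreads
entries evenly; avg_chain_length is a made-up helper name, and 12.0
stands in for "just over 12 seconds".

```python
# Rough model: a miss walks the full chain of one bucket, so lookup cost
# should scale with entries / buckets. Assumes the table was filled to its
# maximum and the hash distributes entries evenly (both are assumptions).


def avg_chain_length(tracked: int, buckets: int) -> float:
    """Average number of entries per hash bucket."""
    return tracked / buckets


# (buckets, tracked connections, measured seconds for 1 million lookups)
# 12.0 is an approximation of "just over 12 seconds".
RUNS = [
    (16384, 131072, 7.5),
    (16384, 262144, 12.0),
    (32768, 262144, 7.5),
]

if __name__ == "__main__":
    for buckets, tracked, seconds in RUNS:
        chain = avg_chain_length(tracked, buckets)
        print(f"buckets={buckets:6d} tracked={tracked:7d} "
              f"avg chain={chain:4.1f} measured={seconds:4.1f}s")
```

The first and third runs both come out at 8 entries per bucket, and the
second at 16, which matches the pattern in the timings: same chain
length, same lookup time.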

I haven't performed any more speed tests than this, but thanks to this mail
I got the idea to run a lot more tests and write up a small report on
how it scales with a lot of connections.


I hope this mail helps in fixing your performance problems. Please mail
back with a report on whether it helped and, if so, how much. And of course
mail and bug us if it still doesn't work.

Maybe it's time to introduce a new configurable option in the
kernel configuration so people can select different default hashsizes (right
now the number of buckets is 1/16384 of the amount of memory in the machine).
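
To put numbers on that default, here is a small Python sketch of the rule
as I stated it (buckets = memory in bytes / 16384). It is only the
arithmetic from this mail: the kernel may round or cap the value
differently, and default_hashsize is a made-up name.

```python
# Rule as stated in this mail: default buckets = memory in bytes / 16384.
# The real kernel may round or cap this; default_hashsize is a hypothetical name.
MIB = 1024 * 1024


def default_hashsize(mem_bytes: int) -> int:
    """Default number of hashbuckets under the 1/16384-of-memory rule."""
    return mem_bytes // 16384


if __name__ == "__main__":
    for label, mem in (("704MB test box", 704 * MIB), ("2GB machine", 2048 * MIB)):
        print(f"{label}: {default_hashsize(mem)} hashbuckets")
```

Under that rule a 2GB machine would start with 131072 hashbuckets, which
is why long hashchains at ~2000 connections sound weird.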

/Martin

Never argue with an idiot. They drag you down to their level, then beat you with 
experience.

