On Wed, 2010-12-22 at 13:50 +0000, Steven van der Vegt wrote:
> Hello list,
> 
> We are an ISP and happy Pound users! But on one system we are experiencing
> problems. On this system Pound is configured to use cookies for session
> tracking. It is installed on a pfSense firewall (FreeBSD), which is not a
> very powerful device (Celeron M, 1 GHz, 512 MB). We notice that the website
> seems to stall for about 10 seconds every minute. Looking through the Pound
> code, I noticed the do_expire function, which is called every 60 seconds. It
> walks all sessions in the hash table, checks whether each has expired, and
> deletes those that have. This takes almost 10 seconds, and since the hash
> table is locked for the duration, no lookups or inserts are permitted. There
> are ~3000 sessions in the table and Pound is handling ~100 hits/s.
> Obviously this is a locking problem. The first thing that comes to mind is
> to drop the non-thread-safe hash table, but since this structure is a big
> part of the code, that is not a quick fix. Maybe we can tighten the locking
> policy instead: rather than locking the complete table during t_expire,
> lock it only when actually deleting a node (in t_old). I studied the code,
> but I am not sure whether that is a thread-safe solution. What do you think?
> Are we the only ones experiencing this trouble? Is it a known problem?
> 
> Kind regards,
> 
> Steven van der Vegt
> Echelon B.V. The Netherlands
> 
> --
> To unsubscribe send an email with subject unsubscribe to [email protected].
> Please contact [email protected] for questions.
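
For readers following the locking question in the quoted message, here is a
rough sketch of one way to shorten the time the table stays locked: expire
entries in small batches and release the lock between batches. It is written
in Python purely for illustration and is not Pound's code; the SessionTable
class, the touch and expire methods, and the batch parameter are all
hypothetical names.

```python
import threading
import time


class SessionTable:
    """Toy session table: key -> last-access timestamp, guarded by one lock."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.lock = threading.Lock()
        self.sessions = {}

    def touch(self, key, now=None):
        """Insert or refresh a session."""
        with self.lock:
            self.sessions[key] = time.time() if now is None else now

    def expire(self, now=None, batch=100):
        """Delete expired sessions, holding the lock for one batch at a time."""
        now = time.time() if now is None else now
        with self.lock:
            # Snapshot of the keys; sessions inserted later wait for the next run.
            keys = list(self.sessions)
        removed = 0
        for start in range(0, len(keys), batch):
            with self.lock:
                for key in keys[start:start + batch]:
                    last = self.sessions.get(key)
                    # Re-check under the lock: the session may have been
                    # refreshed or removed since the snapshot was taken.
                    if last is not None and now - last > self.ttl:
                        del self.sessions[key]
                        removed += 1
        return removed


table = SessionTable(ttl=60)
for i in range(3000):
    # Even-numbered sessions are stale (last seen at t=0), odd ones fresh (t=100).
    table.touch(i, now=0 if i % 2 == 0 else 100)
removed = table.expire(now=120)
```

Lookups and inserts can proceed between batches, at the cost of taking the
lock more often. Whether the same idea carries over safely to Pound's own
hash table is exactly the open question raised above.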

Sorry for the delayed reply, but it took a while to find a slow machine.

We did some timing tests, and it looks very unlikely that the hash table
slows you down that much. On an old Sempron (1GHz), with a hash table
with 10000 entries, deleting 5000 of them takes under 1 second.
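
A timing test of this general shape can be sketched as follows. This is only
an illustration in Python using a built-in dict, not the test that was run
against Pound's hash table, so absolute numbers are not comparable; the
time_bulk_delete function and its parameters are hypothetical names.

```python
import time


def time_bulk_delete(total=10000, to_delete=5000):
    """Fill a table with `total` entries, then time deleting `to_delete` of them."""
    table = {f"session-{i}": i for i in range(total)}
    start = time.perf_counter()
    for i in range(to_delete):
        del table[f"session-{i}"]
    elapsed = time.perf_counter() - start
    return elapsed, len(table)


elapsed, remaining = time_bulk_delete()
print(f"deleted 5000 of 10000 entries in {elapsed:.6f} s, {remaining} left")
```

Running the equivalent measurement on the affected machine, with a table of
about 3000 sessions, is a quick way to confirm or rule out the expiry pass as
the source of the stall.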

I think you may want to look at other possible culprits:
- Within Pound there is a thread that generates the ephemeral RSA keys.
If it is configured to run every minute (parameter t_rsa in the autoconf
script), you may see that kind of delay.
- Otherwise, check what other jobs are running; something completely
unrelated may be using up the CPU.
- Finally, check memory usage: active threads can eat up a lot of memory,
which in turn would lead to swapping.

Let us know what you find out.
-- 
Robert Segall
Apsis GmbH
Postfach, Uetikon am See, CH-8707
Tel: +41-32-512 30 19

