I was looking forward to hearing your thoughts! On Dec 31, 2010, at 4:27 AM, "Robert Segall" <[email protected]> wrote:
> On Thu, 2010-12-30 at 16:17 +0000, Steven van der Vegt wrote:
>> Today I finished rewriting the hashtable locking logic. I've added the diff
>> below.
>
> Thanks for the patch, we'll have a closer look at it. A little problem
> to start with: the reason we avoided thread locks is the threads library
> implementation. If you look at the man page, you'll see that, depending
> on scheduler support, it is not guaranteed that a write lock will get
> priority over read locks. From the man page of pthread_rwlock_rdlock(3):
>
>     If the Thread Execution Scheduling option is supported, and the
>     threads involved in the lock are executing with the
>     scheduling policies SCHED_FIFO or SCHED_RR, the calling thread
>     shall not acquire the lock if a writer holds the lock or if
>     writers of higher or equal priority are blocked on the
>     lock; otherwise, the calling thread shall acquire the lock.
>
>     If the Thread Execution Scheduling option is supported,
>     and the threads involved in the lock are executing with the
>     SCHED_SPORADIC scheduling policy, the calling thread shall not
>     acquire the lock if a writer holds the lock or if writers
>     of higher or equal priority are blocked on the lock; otherwise,
>     the calling thread shall acquire the lock.
>
> And also:
>
>     With a large number of readers, and relatively few writers, there
>     is the possibility of writer starvation. If there are threads
>     waiting for an exclusive write lock on the read/write lock and
>     there are threads that currently hold a shared read lock, the
>     shared read lock request will be granted.
>
> That means that under heavy load the thread waiting for the write lock
> may never acquire it, as at any given time you may have a bunch of read
> locks active (multiple threads may acquire the read lock). I am not
> entirely sure how to solve this.

So what I'm hearing/reading: the pthread rwlock implementation isn't a fair scheduler.
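To make the starvation problem concrete, here is a minimal sketch of a write-preferring lock built from one mutex and two condition variables: new readers back off as soon as any writer is waiting, so a steady stream of readers cannot starve a writer. All names here (`fair_rwlock`, etc.) are invented for illustration; this is not existing Pound code.

```c
#include <pthread.h>

/* Hypothetical write-preferring rwlock: readers are refused while any
   writer is waiting, so the write path cannot be starved. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ok_read;
    pthread_cond_t  ok_write;
    int readers;          /* number of active readers            */
    int writer;           /* 1 while a writer holds the lock     */
    int writers_waiting;  /* writers blocked waiting for the lock */
} fair_rwlock;

void fair_rwlock_init(fair_rwlock *rw)
{
    pthread_mutex_init(&rw->lock, NULL);
    pthread_cond_init(&rw->ok_read, NULL);
    pthread_cond_init(&rw->ok_write, NULL);
    rw->readers = rw->writer = rw->writers_waiting = 0;
}

void fair_rdlock(fair_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    /* back off while a writer is active OR merely waiting */
    while (rw->writer || rw->writers_waiting > 0)
        pthread_cond_wait(&rw->ok_read, &rw->lock);
    rw->readers++;
    pthread_mutex_unlock(&rw->lock);
}

void fair_rdunlock(fair_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    if (--rw->readers == 0)
        pthread_cond_signal(&rw->ok_write);  /* last reader lets a writer in */
    pthread_mutex_unlock(&rw->lock);
}

void fair_wrlock(fair_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    rw->writers_waiting++;                   /* announce intent first */
    while (rw->writer || rw->readers > 0)
        pthread_cond_wait(&rw->ok_write, &rw->lock);
    rw->writers_waiting--;
    rw->writer = 1;
    pthread_mutex_unlock(&rw->lock);
}

void fair_wrunlock(fair_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    rw->writer = 0;
    pthread_cond_signal(&rw->ok_write);      /* prefer a waiting writer... */
    pthread_cond_broadcast(&rw->ok_read);    /* ...else wake all readers   */
    pthread_mutex_unlock(&rw->lock);
}
```

Note the trade-off: this is essentially option 1 from the list that follows (a hand-rolled fair lock), and like any writer-preferring scheme it can now starve readers instead under a constant stream of writers.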
So if 5 readers hold the lock and a writer requests it, there's no guarantee the writer will get the lock when those readers are finished. With a constant stream of incoming readers, the write lock may never be granted. While that's fine for the session reaper, it's not good for the insert on a new session creation request.

I don't like many of the solutions that come to mind:

1) Write our own fair/prioritized locking, implemented with linked lists or queues plus mutexes and condition variables.
2) Find a C library that implements the semantics we need. Whether that's a hashtable or a locking library depends on what we can find.
3) Write our own hashtable. For instance, mutex-lock the fixed array and release once we find the collision chain, with a mutex for each collision chain. Then t_expire can lock by collision chain, which limits how many requests are blocked at any given time.
4) Use better hardware :)
5) Use distributed session tracking instead of tracking it in Pound.

For 5: other load-balancing appliances use a different approach. The load balancer itself generates a cookie to tag the sessions (I wouldn't even consider param/URL rewrites). The semantics would be like this:

A) In the config, pick a cookie name, domain, expiration, HttpOnly flag, etc.
B) On each request, look for our cookie. If it exists, use it to determine the backend to use; otherwise choose one randomly. If that backend is down, choose another.
C) Add a Set-Cookie header to every response that sets or resets our cookie (thus updating the expiration time).

Using this method no database is necessary in Pound: the browser tells the load balancer where it belongs. It has other nice benefits too. You can kill and restart Pound and, aside from the connectivity interruption, session affinity is preserved. And there are no concurrency issues, because each request is autonomous. Implicit in this is the ability to uniquely identify each backend...
Either with a key in the backend section of the config, or a hex representation of the sin_addr and port. We might need to make this an option so URL/param load balancing remains available.

>> This array is filled (with a read lock) through the do_all lhash function
>> which calls the t_old function.
>> If the array is full the do_all method can't be stopped and thus t_old just
>> returns. After do_all is finished, the items are deleted (with a write lock).
>> There are a few things that come to mind with this solution. The size of the
>> array is fixed. What if every minute more items are added than deleted? I
>> suggest that we make this size variable. Either in the config or calculate
>> it.
>>
>> Also I found a bug in the Makefile. I edited the pound.h file but the
>> poundctl binary didn't rebuild. I think there's something wrong with the
>> dependencies.
>
> I would check the system date - we never ran into this problem.

I have seen it before, but with my mods I probably just added the pound.h dependency. I can check when I get back in the office next week.

-G

--
To unsubscribe send an email with subject unsubscribe to [email protected]. Please contact [email protected] for questions.
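For what it's worth, the "hex representation of the sin_addr and port" idea is easy to sketch. The function names and token format below are hypothetical, not existing Pound code: a backend's IPv4 address and port round-trip through a fixed-width hex cookie value.

```c
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>   /* inet_pton, ntohl/ntohs */

/* Hypothetical: encode a backend's IPv4 address and port as a fixed-width
   hex token usable as a cookie value, e.g. 192.168.0.10:8080 becomes
   "C0A8000A:1F90". */
static void backend_id(const struct sockaddr_in *sa, char *buf, size_t len)
{
    snprintf(buf, len, "%08X:%04X",
             (unsigned)ntohl(sa->sin_addr.s_addr),
             (unsigned)ntohs(sa->sin_port));
}

/* Hypothetical inverse: parse the cookie value back into an address/port
   so the request can be routed to the same backend. Returns 0 on success,
   -1 if the token doesn't parse. */
static int backend_from_id(const char *id, struct sockaddr_in *sa)
{
    unsigned addr, port;
    if (sscanf(id, "%8X:%4X", &addr, &port) != 2 || port > 0xFFFF)
        return -1;
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_addr.s_addr = htonl(addr);
    sa->sin_port = htons((unsigned short)port);
    return 0;
}
```

One caveat worth noting: a raw hex encoding exposes the backend's real address and port to the client, so a production version would probably want to obfuscate or HMAC the token rather than ship it verbatim.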
