On Tue, Apr 10, 2007 at 10:41:04PM +1200, Mark Kirkwood wrote:
> Kris Kennaway wrote:
> >If so, then your task is the following:
> >
> >Make SYSV semaphores less dumb about process wakeups. Currently
> >whenever the semaphore state changes, all processes sleeping on the
> >semaphore are woken, even if we only have released enough resources
> >for one waiting process to claim. i.e. there is a thundering herd
> >wakeup situation which destroys performance at high loads. Fixing
> >this will involve replacing the wakeup() calls with appropriate
> >amounts of wakeup_one().
>
> I'm forwarding this to the pgsql-hackers list so that folks more
> qualified than I can comment, but as I understand the way postgres
> implements locking each process has it *own* semaphore it waits on -
> and who is waiting for what is controlled by an in (shared) memory hash
> of lock structs (access to these is controlled via platform Dependant
> spinlock code). So a given semaphore state change should only involve
> one process wakeup.
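
To put rough numbers on the herd effect described in the quoted task, here is a toy userspace model. It is only a sketch, not the FreeBSD semaphore code: count_wakeups(), enum policy and the "each sleeper wants exactly one unit" assumption are all made up for illustration. It counts how many process wakeups occur when units are released one at a time under a wake-everyone policy versus a wake-one policy.

```c
#include <assert.h>

/*
 * Toy model (hypothetical, not kernel code): `sleepers` processes are
 * blocked on one semaphore, each wanting a single unit. Units are
 * released one at a time, `releases` times in total.
 */
enum policy { WAKE_ALL, WAKE_ONE };

static long count_wakeups(int sleepers, int releases, enum policy p)
{
    long wakeups = 0;
    int waiting = sleepers;

    assert(sleepers >= 0 && releases >= 0);

    for (int i = 0; i < releases && waiting > 0; i++) {
        if (p == WAKE_ALL)
            wakeups += waiting; /* all wake, one wins, the rest go back to sleep */
        else
            wakeups += 1;       /* only the process that can proceed is woken */
        waiting--;              /* exactly one sleeper claims the released unit */
    }
    return wakeups;
}
```

With 100 sleepers and 100 single-unit releases, this model gives 5050 wakeups for the wake-everyone policy and 100 for wake-one, i.e. quadratic versus linear growth in the number of waiters. Whether PostgreSQL's per-process semaphores actually hit this path is the open question in this thread; the model only shows why it hurts when they do.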

I have not studied the exact code path, but there are indeed multiple
wakeups happening from the semaphore code (as many as the number of
active postgresql processes). It is easy to instrument
sleepq_broadcast() and log them when they happen.

Anyway, mux@ fixed this some time ago, which indeed helped scaling for
traffic over a local domain socket (particularly at higher loads), but
I saw some anomalous results when using loopback TCP traffic. I think
this is unrelated: in this situation TCP is highly contended, and it is
often the case that fixing one bottleneck makes a highly contended
workload perform worse, because the old bottleneck was effectively
serializing things a bit and damping the non-linear behaviour. I am
still investigating, though, so the patch has not yet been committed.

Kris