Re: [HACKERS] Adjustment of spinlock sleep delays

Mike Mascari Thu, 14 Aug 2003 12:18:06 -0700

Tom Lane wrote:

> I've been thinking about Ludwig Lim's recent report of a "stuck
> spinlock" failure on a heavily loaded machine.  Although I originally
> found this hard to believe, there is a scenario which makes it
> plausible.  Suppose that we have a bunch of recently-started backends
> as well as one or more that have been running a long time --- long
> enough that the scheduler has niced them down a priority level or two.
> Now suppose that one of the old-timers gets interrupted while holding
> a spinlock (an event of small but nonzero probability), and that before
> it can get scheduled again, several of the newer, higher-priority
> backends all start trying to acquire the same spinlock.  The "acquire"
> code looks like "try to grab the spinlock a few times, then sleep for
> 10 msec, then try again; give up after 1 minute".  If there are enough
> backends trying this that cycling through all of them takes at least
> 10 msec, then the lower-priority backend will never get scheduled, and
> after a minute we get the dreaded "stuck spinlock".
> 
> To forestall this scenario, I'm thinking of introducing backoff into the
> sleep intervals --- that is, after the first failure to get the spinlock,
> sleep 10 msec; after the second, sleep 20 msec, then 40, etc., with a
> maximum sleep time of maybe a second.  The number of iterations would be
> reduced so that we still time out after a minute's total delay.
> 
> Comments?
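
For reference, the proposed schedule can be sketched in a few lines of C.
This is only an illustration of the arithmetic in the quoted proposal, not
the actual spinlock code: the constant and function names are hypothetical,
and the 1000 msec cap is just the "maybe a second" figure mentioned above.

```c
/* Hypothetical constants taken from the numbers in the proposal. */
#define MIN_DELAY_MSEC   10      /* first sleep after a failed acquire */
#define MAX_DELAY_MSEC   1000    /* assumed cap ("maybe a second") */
#define TIMEOUT_MSEC     60000   /* give up after one minute in total */

/*
 * Sleep length before retry number 'attempt' (0-based):
 * 10, 20, 40, ... msec, doubling each time and capped at MAX_DELAY_MSEC.
 */
static int
backoff_delay_msec(int attempt)
{
    int delay = MIN_DELAY_MSEC;

    while (attempt-- > 0 && delay < MAX_DELAY_MSEC)
        delay *= 2;

    return (delay > MAX_DELAY_MSEC) ? MAX_DELAY_MSEC : delay;
}

/*
 * Number of sleeps after which the cumulative delay first reaches
 * TIMEOUT_MSEC.  This is the reduced iteration count the proposal
 * refers to.
 */
static int
sleeps_until_timeout(void)
{
    int total = 0;
    int n = 0;

    while (total < TIMEOUT_MSEC)
        total += backoff_delay_msec(n++);

    return n;
}
```

With a fixed 10 msec sleep, a one-minute timeout takes 6000 retries; under
the doubling schedule, sleeps_until_timeout() gives the much smaller retry
count that keeps the total delay at about a minute.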


Should the way the backoff grows depend on the number of active
backends?

Mike Mascari
[EMAIL PROTECTED]




