> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > In my understanding the deadlock check is performed every time the
> > backend acquires a lock. Once it acquires the lock, it kills the timer.
> > However, under heavy transaction loads such as pgbench generates,
> > chances are that the check fires and tries to acquire a spin lock.
> > That seems to be the situation.
> 
> It could be that with ~1000 backends all waiting for the same lock, the
> deadlock-checking code just plain takes too long to run.  It might have
> an O(N^2) or worse behavior in the length of the queue; I don't think
> the code was ever analyzed for such problems.
> 
> Do you want to try adding some instrumentation to HandleDeadLock to see
> how long it runs on each call?

I added some code to HandleDeadLock to measure how long the
LockLockTable and DeadLockCheck calls take. The following are the results
from running pgbench -c 1000 (which failed with a stuck spin lock
error). "real time" shows how long each call actually ran (measured with
gettimeofday). "user time" and "system time" were measured by calling
getrusage. All times are in milliseconds.
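
For anyone who wants to repeat this kind of measurement, here is a
minimal sketch of how a call can be timed with gettimeofday and
getrusage. It is not the actual patch: TimingSample, timing_sample,
tv_diff_ms and timing_report are hypothetical helper names that do not
exist in the PostgreSQL source, and the report format is only an
assumption.

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper type (not from the PostgreSQL source):
 * one snapshot of wall-clock time and process CPU usage. */
typedef struct TimingSample
{
    struct timeval wall;    /* from gettimeofday() */
    struct rusage  usage;   /* from getrusage(RUSAGE_SELF) */
} TimingSample;

/* Take a snapshot; call once before and once after the code to measure. */
static void
timing_sample(TimingSample *s)
{
    gettimeofday(&s->wall, NULL);
    getrusage(RUSAGE_SELF, &s->usage);
}

/* Difference end - start in whole milliseconds. The subtraction is done
 * in microseconds first so that a tv_usec wrap-around is handled. */
static long long
tv_diff_ms(struct timeval start, struct timeval end)
{
    long long usec = (long long) (end.tv_sec - start.tv_sec) * 1000000LL
                   + (long long) (end.tv_usec - start.tv_usec);

    return usec / 1000;
}

/* Print real, user and system time elapsed between two snapshots. */
static void
timing_report(const char *label,
              const TimingSample *before, const TimingSample *after)
{
    fprintf(stderr, "%s: real %lld ms, user %lld ms, system %lld ms\n",
            label,
            tv_diff_ms(before->wall, after->wall),
            tv_diff_ms(before->usage.ru_utime, after->usage.ru_utime),
            tv_diff_ms(before->usage.ru_stime, after->usage.ru_stime));
}
```

The idea is to call timing_sample() immediately before and after the
call of interest (for example LockLockTable), then pass both snapshots
to timing_report(). A real time far above user plus system time means
the process spent most of the interval waiting rather than computing.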

 LockLockTable: real time

 min |  max   |        avg        
-----+--------+-------------------
   0 | 867873 | 152874.9015151515

 LockLockTable: user time

 min | max |     avg      
-----+-----+--------------
   0 |  30 | 1.2121212121

 LockLockTable: system time

 min | max  |      avg       
-----+------+----------------
   0 | 2140 | 366.5909090909


 DeadLockCheck: real time

 min |  max  |       avg       
-----+-------+-----------------
   0 | 87671 | 3463.6996197719

 DeadLockCheck: user time

 min | max |      avg      
-----+-----+---------------
   0 | 330 | 14.2205323194

 DeadLockCheck: system time

 min | max |     avg      
-----+-----+--------------
   0 | 100 | 2.5095057034
