Re: [HACKERS] LWLock deadlock and gdb advice

2015-08-05 Thread Andres Freund
On 2015-08-05 11:22:55 -0700, Jeff Janes wrote: > On Sun, Aug 2, 2015 at 8:05 AM, Andres Freund wrote: > > > On 2015-08-02 17:04:07 +0200, Andres Freund wrote: > > > I've attached a version of the patch that should address Heikki's > > > concern. It imo also improves the API and increases debugga

Re: [HACKERS] LWLock deadlock and gdb advice

2015-08-05 Thread Jeff Janes
On Sun, Aug 2, 2015 at 8:05 AM, Andres Freund wrote: > On 2015-08-02 17:04:07 +0200, Andres Freund wrote: > > I've attached a version of the patch that should address Heikki's > > concern. It imo also improves the API and increases debuggability by not > > having stale variable values in the vari

Re: [HACKERS] LWLock deadlock and gdb advice

2015-08-02 Thread Andres Freund
On 2015-08-02 12:33:06 -0400, Tom Lane wrote:
> Andres Freund writes:
> > I plan to commit the patch tomorrow, so it's included in alpha2.
>
> Please try to commit anything you want in alpha2 *today*. I'd
> prefer to see some successful buildfarm cycles on such patches
> before we wrap.
Ok, wil

Re: [HACKERS] LWLock deadlock and gdb advice

2015-08-02 Thread Tom Lane
Andres Freund writes:
> I plan to commit the patch tomorrow, so it's included in alpha2.
Please try to commit anything you want in alpha2 *today*. I'd prefer to see some successful buildfarm cycles on such patches before we wrap.
regards, tom lane
-- Sent via pgsql-ha

Re: [HACKERS] LWLock deadlock and gdb advice

2015-08-02 Thread Andres Freund
Hi Jeff, Heikki, On 2015-07-31 09:48:28 -0700, Jeff Janes wrote: > I had run it for 24 hours, while it usually took less than 8 hours to look > up before. I did see it within a few minutes one time when I checked out a > new HEAD and then forgot to re-apply your or Heikki's patch. > > But now I'

Re: [HACKERS] LWLock deadlock and gdb advice

2015-08-02 Thread Andres Freund
On 2015-08-02 17:04:07 +0200, Andres Freund wrote: > I've attached a version of the patch that should address Heikki's > concern. It imo also improves the API and increases debuggability by not > having stale variable values in the variables anymore. (also attached is > a minor optimization that He

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-31 Thread Jeff Janes
On Thu, Jul 30, 2015 at 8:22 PM, Andres Freund wrote: > On 2015-07-30 09:03:01 -0700, Jeff Janes wrote: > > On Wed, Jul 29, 2015 at 6:10 AM, Andres Freund > wrote: > > > What do you think about something roughly like the attached? > > > > > > > I've not evaluated the code, but applying it does s

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-31 Thread Heikki Linnakangas
On 07/30/2015 09:14 PM, Andres Freund wrote: On 2015-07-30 17:36:52 +0300, Heikki Linnakangas wrote: In 9.4, LWLockAcquire holds the spinlock when it marks the lock as held, until it has updated the variable. And LWLockWaitForVar() holds the spinlock when it checks that the lock is held and that

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-30 Thread Andres Freund
On 2015-07-30 09:03:01 -0700, Jeff Janes wrote: > On Wed, Jul 29, 2015 at 6:10 AM, Andres Freund wrote: > > What do you think about something roughly like the attached? > > > > I've not evaluated the code, but applying it does solve the problem I was > seeing. Cool, thanks for testing! How long

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-30 Thread Andres Freund
On 2015-07-30 17:36:52 +0300, Heikki Linnakangas wrote:
> In 9.4, LWLockAcquire holds the spinlock when it marks the lock as held,
> until it has updated the variable. And LWLockWaitForVar() holds the spinlock
> when it checks that the lock is held and that the variable's value matches.
> So it can
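The 9.4 behaviour described in the quoted text (the acquirer marks the lock held and sets the variable under one spinlock, and the waiter checks both under the same spinlock) can be sketched roughly as follows. This is a minimal illustration only: `VarLock`, `varlock_try_acquire_with_var` and `varlock_must_wait` are hypothetical names, a C11 `atomic_flag` stands in for the spinlock, and none of it is PostgreSQL's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a lock with an associated variable. */
typedef struct VarLock
{
    atomic_flag spin;   /* stand-in for the per-lock spinlock */
    bool        held;   /* is the lock held exclusively? */
    uint64_t    var;    /* variable associated with the lock */
} VarLock;

static void spin_acquire(VarLock *l)
{
    while (atomic_flag_test_and_set(&l->spin))
        ;               /* busy-wait until the flag was clear */
}

static void spin_release(VarLock *l)
{
    atomic_flag_clear(&l->spin);
}

void varlock_init(VarLock *l)
{
    atomic_flag_clear(&l->spin);
    l->held = false;
    l->var = 0;
}

/* Mark the lock held and set the variable in one critical section. */
bool varlock_try_acquire_with_var(VarLock *l, uint64_t val)
{
    bool acquired = false;

    spin_acquire(l);
    if (!l->held)
    {
        l->held = true;
        l->var = val;
        acquired = true;
    }
    spin_release(l);
    return acquired;
}

void varlock_release(VarLock *l)
{
    spin_acquire(l);
    l->held = false;
    spin_release(l);
}

/*
 * Check "held" and the variable under the same spinlock, so a waiter can
 * never observe the lock as held with a not-yet-updated variable.
 */
bool varlock_must_wait(VarLock *l, uint64_t oldval, uint64_t *curval)
{
    bool wait;

    spin_acquire(l);
    wait = l->held && l->var == oldval;
    *curval = l->var;
    spin_release(l);
    return wait;
}
```

Because both sides take the same spinlock, "held" and the variable always change together from the waiter's point of view; that is the atomicity the thread says was lost once the spinlock went away.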

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-30 Thread Jeff Janes
On Wed, Jul 29, 2015 at 6:10 AM, Andres Freund wrote: > On 2015-07-29 14:22:23 +0200, Andres Freund wrote: > > On 2015-07-29 15:14:23 +0300, Heikki Linnakangas wrote: > > > Ah, ok, that should work, as long as you also re-check the variable's > value > > > after queueing. Want to write the patch,

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-30 Thread Heikki Linnakangas
On 07/29/2015 09:35 PM, Andres Freund wrote: On 2015-07-29 20:23:24 +0300, Heikki Linnakangas wrote: Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting on it. The lock holder releases the lock, and wakes up A. But before A wakes up and sees that the lock is free, another back

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Andres Freund
On 2015-07-29 20:23:24 +0300, Heikki Linnakangas wrote: > Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting on > it. The lock holder releases the lock, and wakes up A. But before A wakes up > and sees that the lock is free, another backend acquires the lock again. It > runs LWL

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Heikki Linnakangas
On 07/29/2015 04:10 PM, Andres Freund wrote: On 2015-07-29 14:22:23 +0200, Andres Freund wrote: On 2015-07-29 15:14:23 +0300, Heikki Linnakangas wrote: Ah, ok, that should work, as long as you also re-check the variable's value after queueing. Want to write the patch, or should I? I'll try. S

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Jeff Janes
On Wed, Jul 29, 2015 at 9:26 AM, Andres Freund wrote: > On 2015-07-29 09:23:32 -0700, Jeff Janes wrote: > > On Tue, Jul 28, 2015 at 9:06 AM, Jeff Janes > wrote: > > I've reproduced it again against commit b2ed8edeecd715c8a23ae462. > > > > It took 5 hours on a 8 core "Intel(R) Xeon(R) CPU E5-2650

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Andres Freund
On 2015-07-29 09:23:32 -0700, Jeff Janes wrote: > On Tue, Jul 28, 2015 at 9:06 AM, Jeff Janes wrote: > I've reproduced it again against commit b2ed8edeecd715c8a23ae462. > > It took 5 hours on a 8 core "Intel(R) Xeon(R) CPU E5-2650". > > I also reproduced it in 3 hours on the same machine with bo

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Jeff Janes
On Tue, Jul 28, 2015 at 9:06 AM, Jeff Janes wrote: > On Tue, Jul 28, 2015 at 7:06 AM, Andres Freund wrote: > >> Hi, >> >> On 2015-07-19 11:49:14 -0700, Jeff Janes wrote: >> > After applying this patch to commit fdf28853ae6a397497b79f, it has >> survived >> > testing long enough to convince that

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Andres Freund
On 2015-07-29 14:22:23 +0200, Andres Freund wrote: > On 2015-07-29 15:14:23 +0300, Heikki Linnakangas wrote: > > Ah, ok, that should work, as long as you also re-check the variable's value > > after queueing. Want to write the patch, or should I? > > I'll try. Shouldn't be too hard. What do you t

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Andres Freund
On 2015-07-29 15:14:23 +0300, Heikki Linnakangas wrote:
> Ah, ok, that should work, as long as you also re-check the variable's value
> after queueing. Want to write the patch, or should I?
I'll try. Shouldn't be too hard.
Andres

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Heikki Linnakangas
On 07/29/2015 03:08 PM, Andres Freund wrote: On 2015-07-29 14:55:54 +0300, Heikki Linnakangas wrote: On 07/29/2015 02:39 PM, Andres Freund wrote: In an earlier email you say: After the spinlock is released above, but before the LWLockQueueSelf() call, it's possible that another backend comes i

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Andres Freund
On 2015-07-29 14:55:54 +0300, Heikki Linnakangas wrote: > On 07/29/2015 02:39 PM, Andres Freund wrote: > >In an earlier email you say: > >>After the spinlock is released above, but before the LWLockQueueSelf() call, > >>it's possible that another backend comes in, acquires the lock, changes the > >

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Heikki Linnakangas
On 07/29/2015 02:39 PM, Andres Freund wrote: On 2015-07-15 18:44:03 +0300, Heikki Linnakangas wrote: Previously, LWLockAcquireWithVar set the variable associated with the lock atomically with acquiring it. Before the lwlock-scalability changes, that was straightforward because you held the spinl

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-29 Thread Andres Freund
Hi, Finally getting to this. On 2015-07-15 18:44:03 +0300, Heikki Linnakangas wrote: > Previously, LWLockAcquireWithVar set the variable associated with the lock > atomically with acquiring it. Before the lwlock-scalability changes, that > was straightforward because you held the spinlock anyway,

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-28 Thread Jeff Janes
On Tue, Jul 28, 2015 at 7:06 AM, Andres Freund wrote: > Hi, > > On 2015-07-19 11:49:14 -0700, Jeff Janes wrote: > > After applying this patch to commit fdf28853ae6a397497b79f, it has > survived > > testing long enough to convince that this fixes the problem. > > What was the actual workload break

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-28 Thread Andres Freund
Hi, On 2015-07-19 11:49:14 -0700, Jeff Janes wrote: > After applying this patch to commit fdf28853ae6a397497b79f, it has survived > testing long enough to convince that this fixes the problem. What was the actual workload breaking with the bug? I ran a small variety and I couldn't reproduce it ye

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-26 Thread Amit Langote
On 2015-07-16 PM 04:03, Jeff Janes wrote: > On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas wrote: > >> >> Both. Here's the patch. >> >> Previously, LWLockAcquireWithVar set the variable associated with the lock >> atomically with acquiring it. Before the lwlock-scalability changes, that >> w

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-19 Thread Jeff Janes
On Thu, Jul 16, 2015 at 12:03 AM, Jeff Janes wrote: > On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas > wrote: > >> >> Both. Here's the patch. >> >> Previously, LWLockAcquireWithVar set the variable associated with the >> lock atomically with acquiring it. Before the lwlock-scalability chang

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-19 Thread Jeff Janes
On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas wrote: > On 06/30/2015 11:24 PM, Andres Freund wrote: > >> On 2015-06-30 22:19:02 +0300, Heikki Linnakangas wrote: >> >>> Hm. Right. A recheck of the value after the queuing should be sufficient to fix? That's how we deal with the exact sam

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-16 Thread Jeff Janes
On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas wrote: > > Both. Here's the patch. > > Previously, LWLockAcquireWithVar set the variable associated with the lock > atomically with acquiring it. Before the lwlock-scalability changes, that > was straightforward because you held the spinlock any

Re: [HACKERS] LWLock deadlock and gdb advice

2015-07-15 Thread Heikki Linnakangas
On 06/30/2015 11:24 PM, Andres Freund wrote: On 2015-06-30 22:19:02 +0300, Heikki Linnakangas wrote: Hm. Right. A recheck of the value after the queuing should be sufficient to fix? That's how we deal with the exact same scenarios for the normal lock state, so that shouldn't be very hard to add.

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Andres Freund
On 2015-06-30 22:19:02 +0300, Heikki Linnakangas wrote:
> >Hm. Right. A recheck of the value after the queuing should be sufficient
> >to fix? That's how we deal with the exact same scenarios for the normal
> >lock state, so that shouldn't be very hard to add.
>
> Yeah. It's probably more efficien
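The "recheck after queuing" idea can be illustrated with a small single-threaded model of the window between the first check and queuing. Everything here is hypothetical (`ModelLock`, `wait_for_var_attempt`, the interleave hook) and is not the patch discussed in the thread; the hook simply simulates another backend running inside the window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT PostgreSQL's LWLock implementation. */
typedef struct ModelLock
{
    bool     held;      /* is the lock held exclusively? */
    uint64_t var;       /* variable associated with the lock */
    int      nwaiters;  /* backends currently queued on the lock */
} ModelLock;

typedef enum { WAIT_DONE, WAIT_MUST_SLEEP } WaitResult;

/* Simulates another backend running in the window before we queue. */
typedef void (*InterleaveHook)(ModelLock *lock);

/* A waiter must keep waiting only while the lock is held and the variable is unchanged. */
static bool must_wait(const ModelLock *lock, uint64_t oldval)
{
    return lock->held && lock->var == oldval;
}

WaitResult wait_for_var_attempt(ModelLock *lock, uint64_t oldval,
                                InterleaveHook hook, bool recheck_after_queue)
{
    /* 1. First check, done before we are on the wait queue. */
    if (!must_wait(lock, oldval))
        return WAIT_DONE;

    /* Window: a concurrent backend may change the state here. */
    if (hook != NULL)
        hook(lock);

    /* 2. Queue ourselves so that later wakeups can reach us. */
    lock->nwaiters++;

    /* 3. Recheck: catches any change made inside the window. */
    if (recheck_after_queue && !must_wait(lock, oldval))
    {
        lock->nwaiters--;   /* dequeue again; no need to sleep */
        return WAIT_DONE;
    }
    return WAIT_MUST_SLEEP;
}

/* Concurrent update whose wakeup finds an empty queue and reaches nobody. */
void update_var_in_window(ModelLock *lock)
{
    lock->var++;
}
```

Without the recheck, the waiter in this model goes to sleep on a condition that already changed and for which no further wakeup is guaranteed; with the recheck, it notices the new value and returns.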

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Heikki Linnakangas
On 06/30/2015 10:09 PM, Andres Freund wrote: On 2015-06-30 21:08:53 +0300, Heikki Linnakangas wrote: /* * XXX: We can significantly optimize this on platforms with 64bit * atomics. */

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Andres Freund
On 2015-06-30 21:08:53 +0300, Heikki Linnakangas wrote:
> > /*
> >  * XXX: We can significantly optimize this on platforms with 64bit
> >  * atomics.
> >  */
> > value = *valptr;
> >
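As a rough illustration of what the XXX comment hints at, the sketch below contrasts reading a 64-bit value under a lock with reading it through a 64-bit atomic. It assumes the existing read is protected by a lock (the quoted snippet is truncated, so this is an assumption), and it uses C11 `<stdatomic.h>` with hypothetical type and function names rather than PostgreSQL's own atomics API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical: value that may only be read while holding a spinlock. */
typedef struct LockedValue
{
    atomic_flag spin;    /* stand-in for a spinlock */
    uint64_t    value;
} LockedValue;

uint64_t read_under_spinlock(LockedValue *v)
{
    uint64_t result;

    while (atomic_flag_test_and_set(&v->spin))
        ;                /* busy-wait for the spinlock */
    result = v->value;   /* cannot be torn: writers take the same lock */
    atomic_flag_clear(&v->spin);
    return result;
}

/* Hypothetical: same value stored in a 64-bit atomic. */
typedef struct AtomicValue
{
    _Atomic uint64_t value;
} AtomicValue;

uint64_t read_with_atomics(AtomicValue *v)
{
    /* A single atomic load; no lock is taken for the read. */
    return atomic_load(&v->value);
}
```

On platforms where 64-bit loads are natively atomic, the second form avoids the lock round trip for the read; elsewhere the compiler or runtime may fall back to a lock internally.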

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Heikki Linnakangas
On 06/30/2015 07:37 PM, Alvaro Herrera wrote: Jeff Janes wrote: I've gotten the LWLock deadlock again. User backend 24841 holds the WALInsertLocks 7 and is blocked attempting to acquire 6 . So it seems to be violating the lock ordering rules (although I don't see that rule spelled out in xlog

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Heikki Linnakangas
On 06/30/2015 07:05 PM, Jeff Janes wrote: On Mon, Jun 29, 2015 at 11:28 PM, Jeff Janes wrote: On Mon, Jun 29, 2015 at 5:55 PM, Peter Geoghegan wrote: On Mon, Jun 29, 2015 at 5:37 PM, Jeff Janes wrote: Is there a way to use gdb to figure out who holds the lock they are waiting for? Hav

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Alvaro Herrera
Jeff Janes wrote: > I've gotten the LWLock deadlock again. User backend 24841 holds the > WALInsertLocks 7 and is blocked attempting to acquire 6 . So it seems to > be violating the lock ordering rules (although I don't see that rule > spelled out in xlog.c) Hmm, interesting -- pg_stat_statemen

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-30 Thread Jeff Janes
On Mon, Jun 29, 2015 at 11:28 PM, Jeff Janes wrote: > On Mon, Jun 29, 2015 at 5:55 PM, Peter Geoghegan wrote: > >> On Mon, Jun 29, 2015 at 5:37 PM, Jeff Janes wrote: >> > Is there a way to use gdb to figure out who holds the lock they are >> waiting >> > for? >> >> Have you considered building

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-29 Thread Jeff Janes
On Mon, Jun 29, 2015 at 5:55 PM, Peter Geoghegan wrote: > On Mon, Jun 29, 2015 at 5:37 PM, Jeff Janes wrote: > > Is there a way to use gdb to figure out who holds the lock they are > waiting > > for? > > Have you considered building with LWLOCK_STATS defined, and LOCK_DEBUG > defined? That might

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-29 Thread Amit Kapila
On Tue, Jun 30, 2015 at 6:25 AM, Peter Geoghegan wrote: > > On Mon, Jun 29, 2015 at 5:37 PM, Jeff Janes wrote: > > Is there a way to use gdb to figure out who holds the lock they are waiting > > for? > > Have you considered building with LWLOCK_STATS defined, and LOCK_DEBUG > defined? That might

Re: [HACKERS] LWLock deadlock and gdb advice

2015-06-29 Thread Peter Geoghegan
On Mon, Jun 29, 2015 at 5:37 PM, Jeff Janes wrote:
> Is there a way to use gdb to figure out who holds the lock they are waiting
> for?
Have you considered building with LWLOCK_STATS defined, and LOCK_DEBUG defined? That might do it. Otherwise, I suggest dereferencing the "l" argument to LWLockA