Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-08 Thread Tom Lane
daveg writes: > On Wed, Sep 07, 2011 at 09:02:04PM -0400, Tom Lane wrote: >> daveg writes: >>> The first version we saw it on was 8.4.7. >> Yeah, you said that. I was wondering what you'd last run before 8.4.7. > Sorry, misunderstood. We were previously running 8.4.4, but have been on 8.4.7 >

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 09:02:04PM -0400, Tom Lane wrote: > daveg writes: > > On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote: > >> BTW ... what were the last versions you were running on which you had > >> *not* seen the problem? (Just wondering about the possibility that we > >> back-p

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 6:25 PM, Tom Lane wrote: > Robert Haas writes: >> I thought about an error exit from client authentication, and that's a >> somewhat appealing explanation, but I can't quite see why we wouldn't >> clean up there the same as anywhere else.  The whole mechanism feels a >> bit

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg writes: > On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote: >> BTW ... what were the last versions you were running on which you had >> *not* seen the problem? (Just wondering about the possibility that we >> back-patched some "fix" that broke things. It would be good to have >> a

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote: > daveg writes: > > Also, this is very intermittent, we have seen it only in recent months > > on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what > > feels like a few times a month. Possibly some new application behav

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg writes: > Also, this is very intermittent, we have seen it only in recent months > on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what > feels like a few times a month. Possibly some new application behaviour > is provoking it, but I have no guesses as to what. BTW ...

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg writes: > On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote: >> ... But maybe it'd be interesting for Dave to stick a >> LockReleaseAll call into ProcKill() and see if that makes things better. >> (Dave: test that before you put it in production, I'm not totally sure >> it's safe.)

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote: > Robert Haas writes: > > I thought about an error exit from client authentication, and that's a > > somewhat appealing explanation, but I can't quite see why we wouldn't > > clean up there the same as anywhere else. The whole mechanism fe

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 06:35:08PM -0400, Tom Lane wrote: > daveg writes: > > It does not seem restricted to pg_authid: > > 2011-08-24 18:35:57.445 24987 c23 apps ERROR: lock AccessShareLock on > > object 16403/2615/0 > > And I think I've seen it on other tables too. > > Hmm. 2615 = pg_nam

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
daveg writes: > It does not seem restricted to pg_authid: > 2011-08-24 18:35:57.445 24987 c23 apps ERROR: lock AccessShareLock on > object 16403/2615/0 > And I think I've seen it on other tables too. Hmm. 2615 = pg_namespace, which most likely is the first catalog accessed by just about an
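The object triples quoted in this thread can be picked apart mechanically. The following is only a rough sketch, assuming the three numbers are database OID / relation OID / an unused third field, which holds for relation locks but not for other lock types such as advisory locks. The helper name `decode_lock_message` and the small lookup table are made up for illustration; the table contains only the two catalog OIDs named in this thread (1260 = pg_authid, 2615 = pg_namespace), and any other OID would need to be looked up in pg_class.

```python
import re

# Catalog OIDs identified in this thread. This is an illustrative table,
# not a complete list; other OIDs must be resolved against pg_class.
KNOWN_CATALOGS = {1260: "pg_authid", 2615: "pg_namespace"}

# Matches the "lock <mode> on object a/b/c" part of the log line.
LOCK_RE = re.compile(r"lock (\w+) on object (\d+)/(\d+)/(\d+)")


def decode_lock_message(message):
    """Split a lock error line into its parts (hypothetical helper).

    Assumes a relation lock: first number = database OID (0 meaning a
    shared catalog), second number = relation OID, third number unused.
    Returns None if the line does not contain a lock object triple.
    """
    match = LOCK_RE.search(message)
    if match is None:
        return None
    mode = match.group(1)
    db_oid, rel_oid, _unused = (int(g) for g in match.groups()[1:])
    return {
        "mode": mode,
        "database_oid": db_oid,
        "shared": db_oid == 0,
        "relation_oid": rel_oid,
        "relation": KNOWN_CATALOGS.get(rel_oid, "unknown"),
    }


if __name__ == "__main__":
    for line in (
        "FATAL: lock AccessShareLock on object 0/1260/0 is already held",
        "ERROR: lock AccessShareLock on object 16403/2615/0",
    ):
        print(decode_lock_message(line))
```

Run against the two messages reported in the thread, this separates the shared-catalog case (database OID 0) from the per-database case (database OID 16403).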

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
Robert Haas writes: > I thought about an error exit from client authentication, and that's a > somewhat appealing explanation, but I can't quite see why we wouldn't > clean up there the same as anywhere else. The whole mechanism feels a > bit rickety to me - we don't actually release locks; we ju

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 04:55:24PM -0400, Tom Lane wrote: > Robert Haas writes: > > Tom's right to be skeptical of my theory, because it would require a > > CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the > > pathways that use session-level locks, and I can't find one. > > Mor

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 4:55 PM, Tom Lane wrote: > Yeah, and for that matter it seems to let VACUUM off the hook too. > If we assume that the reported object ID is non-corrupt (and if it's > always the same, that seems like the way to bet) then this is a lock > on pg_authid. > > Hmmm ... could the

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
Robert Haas writes: > Tom's right to be skeptical of my theory, because it would require a > CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the > pathways that use session-level locks, and I can't find one. More to the point: session-level locks are released on error. The only w

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 4:22 PM, daveg wrote: > Yes, we make extensive use of advisory locks. That was my thought too when > Robert mentioned session level locks. > > I'm happy to add any additional instrumentation, but my client would be > happier to actually run it if there was a way to recover f

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 10:20:24AM -0400, Tom Lane wrote: > Robert Haas writes: > > After spending some time staring at the code, I do have one idea as to > > what might be going on here. When a backend is terminated, > > ShutdownPostgres() calls AbortOutOfAnyTransaction() and then > > LockReleas

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Tom Lane
Robert Haas writes: > After spending some time staring at the code, I do have one idea as to > what might be going on here. When a backend is terminated, > ShutdownPostgres() calls AbortOutOfAnyTransaction() and then > LockReleaseAll(USER_LOCKMETHOD, true). The second call releases all > user lo

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread Robert Haas
On Wed, Sep 7, 2011 at 5:16 AM, daveg wrote: > On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote: >> On Mon, Aug 22, 2011 at 3:31 AM, daveg wrote: >> > So far I've got: >> > >> >  - affects system tables >> >  - happens very soon after process startup >> >  - in 8.4.7 and 9.0.4 >> >  -

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote: > On Mon, Aug 22, 2011 at 3:31 AM, daveg wrote: > > So far I've got: > > > >  - affects system tables > >  - happens very soon after process startup > >  - in 8.4.7 and 9.0.4 > >  - not likely to be hardware or OS related > >  - happens

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-08-23 Thread Robert Haas
On Mon, Aug 22, 2011 at 3:31 AM, daveg wrote: > So far I've got: > >  - affects system tables >  - happens very soon after process startup >  - in 8.4.7 and 9.0.4 >  - not likely to be hardware or OS related >  - happens in clusters for periods of a few seconds to many minutes > > I'll work on print

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-08-22 Thread daveg
On Fri, Aug 12, 2011 at 04:19:37PM -0700, daveg wrote: > > This seems to be bug month for my client. Now they are seeing periods > where all new connections fail immediately with the error: > > FATAL: lock AccessShareLock on object 0/1260/0 is already held > > This happens on postgresql 8.