Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 11, 2011 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 What's bothering me at the moment is that the CLOBBER_CACHE_ALWAYS hack,
 which was meant to expose exactly this sort of problem, failed to do so
 --- buildfarm member jaguar has been running with that flag for ages and
 never showed this problem.  I'm thinking that we should take out the
 hack in AcceptInvalidationMessages and instead put in #ifdeffed code
 that causes ReceiveSharedInvalidMessages to forcibly always call the
 reset function.  Any thoughts about that?

 I'm OK with that, but while we're whacking this around, is there any
 chance that we could reduce the number of layers of function calls
 that are happening in here?

I'm not excited about that.  We have got three layers that each have
significant complexities of their own to deal with.  The only way to
reduce the layer count is to make some piece of code deal with more of
those complexities at once.  I don't want to go there, especially not
in the name of very marginal cycle-shaving.

 It would be nice to move the short-circuit test I recently inserted at
 the top of SIGetDataEntries() somewhere higher up in the call stack,
 but right now the layers of abstraction are so thick that it's not
 exactly clear how to do that.

Glad you didn't try, because it would be wrong.  The fact that there
are no outstanding messages in sinvaladt.c doesn't prove there are none
yet unprocessed inside ReceiveSharedInvalidMessages.


Anyway, to get back to the original point: I'm getting less excited
about redoing the CLOBBER_CACHE_ALWAYS code at a different level.
The only thing that can really promise is that we find places outside
the cache code that are mistakenly holding onto cache entry pointers.
It can't do very much for the sort of problems I've been finding over
the past week, first because you need some other process actively
changing the underlying information (else the use of a stale cache entry
isn't a problem), and second because when you're looking for bugs
involving use of stale cache entries, over-enthusiastic flushing of the
cache can actually mask the bugs.

I'd still like to think of a better test methodology, but I don't think
forcing clobbers inside ReceiveSharedInvalidMessages is it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It would be nice to move the short-circuit test I recently inserted at
 the top of SIGetDataEntries() somewhere higher up in the call stack,
 but right now the layers of abstraction are so thick that it's not
 exactly clear how to do that.

 Glad you didn't try, because it would be wrong.  The fact that there
 are no outstanding messages in sinvaladt.c doesn't prove there are none
 yet unprocessed inside ReceiveSharedInvalidMessages.

Oh.  Good point.

 Anyway, to get back to the original point: I'm getting less excited
 about redoing the CLOBBER_CACHE_ALWAYS code at a different level.
 The only thing that can really promise is that we find places outside
 the cache code that are mistakenly holding onto cache entry pointers.
 It can't do very much for the sort of problems I've been finding over
 the past week, first because you need some other process actively
 changing the underlying information (else the use of a stale cache entry
 isn't a problem), and second because when you're looking for bugs
 involving use of stale cache entries, over-enthusiastic flushing of the
 cache can actually mask the bugs.

 I'd still like to think of a better test methodology, but I don't think
 forcing clobbers inside ReceiveSharedInvalidMessages is it.

Makes sense.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-15 Thread daveg
[adding back hackers so the thread shows the resolution]

On Sun, Aug 14, 2011 at 07:02:55PM -0400, Tom Lane wrote:
 Sounds good.  Based on my own testing so far, I think that patch will
 probably make things measurably better for you, though it won't resolve
 every corner case.

The most recent catalog vacuums did vacuum full pg_class in 34 databases on
that instance with no new failures. So it looks like the patch fixes it. This
is not conclusive, but it looks very good so far. I'll send an update if I
see any new errors during the week.

Thanks for your help on this. It looks like it has sent you on a merry
search through all the catcache-related program activities. I'm assuming
this patch or some improvement on it will show up in a point release.
Meanwhile, if this works as-is for a couple more days we will resume upgrading
the rest of the hosts to 9.0 using this patch.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-11 Thread Tom Lane
I wrote:
 I still haven't reproduced the behavior here, but I think I see what
 must be happening: we are getting an sinval reset while attempting to
 open pg_class_oid_index.

After a number of false starts, I've managed to reproduce this behavior
locally.  The above theory turns out to be wrong, or at least
incomplete.  In order to be opening pg_class_oid_index, we must already
have opened and locked pg_class, and we would have absorbed the
relmapping update for pg_class when we did that, so an sinval reset
during a subsequent relation open is too late to provoke the bug.

Rather, the key to the problem is that the sinval reset has to happen
*instead of* seeing the plain relcache inval on pg_class that was
emitted by the VACUUM FULL pg_class command.  In this case, we enter
RelationCacheInvalidate with a stale value for pg_class's relmapping,
and since that routine is not sufficiently careful about the order in
which it revalidates the nailed-relation cache entries, we may try to
read something from pg_class before we've updated pg_class's relmapping.
So the reason Dave is seeing the problem a lot must be that he has very
high sinval traffic, leading to lots of resets, and that increases the
probability of a reset replacing just the wrong sinval message.

I can reproduce the problem fairly conveniently with this crude hack:

diff --git a/src/backend/storage/ipc/sinval.c b/src/backend/storage/ipc/sinval.c
index 8499615..5ad2aee 100644
*** a/src/backend/storage/ipc/sinval.c
--- b/src/backend/storage/ipc/sinval.c
*** ReceiveSharedInvalidMessages(
*** 106,112 ****
/* Try to get some more messages */
getResult = SIGetDataEntries(messages, MAXINVALMSGS);
  
!   if (getResult < 0)
{
/* got a reset message */
elog(DEBUG4, "cache state reset");
--- 106,112 ----
/* Try to get some more messages */
getResult = SIGetDataEntries(messages, MAXINVALMSGS);
  
!   if (getResult != 0)
{
/* got a reset message */
elog(DEBUG4, "cache state reset");

which forces every occurrence of an incoming sinval message to be treated
as a reset.  The serial regression tests still work with that, but they
fall over almost immediately if you run something like this in parallel:

while psql -c "vacuum full pg_class" regression; do usleep 10; done

and the parallel regression tests tend to fall over without any outside
help because of the one occurrence of vacuum full pg_class in them.

I'm inclined to think that a robust solution requires both of the ideas
I proposed last week: we should update all the relmapping entries during
pass 1 of RelationCacheInvalidate, *and* make pass 2 do things in a more
robust order.  I think pg_class, pg_class_oid_index, other nailed
relations, and then everything else ought to do it.

Anyway, that's easily fixed now that we have a reproducible case.
What's bothering me at the moment is that the CLOBBER_CACHE_ALWAYS hack,
which was meant to expose exactly this sort of problem, failed to do so
--- buildfarm member jaguar has been running with that flag for ages and
never showed this problem.  I'm thinking that we should take out the
hack in AcceptInvalidationMessages and instead put in #ifdeffed code
that causes ReceiveSharedInvalidMessages to forcibly always call the
reset function.  Any thoughts about that?

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-11 Thread Robert Haas
On Thu, Aug 11, 2011 at 5:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I can reproduce the problem fairly conveniently with this crude hack:

 diff --git a/src/backend/storage/ipc/sinval.c b/src/backend/storage/ipc/sinval.c
 index 8499615..5ad2aee 100644
 *** a/src/backend/storage/ipc/sinval.c
 --- b/src/backend/storage/ipc/sinval.c
 *** ReceiveSharedInvalidMessages(
 *** 106,112 ****
        /* Try to get some more messages */
        getResult = SIGetDataEntries(messages, MAXINVALMSGS);

 !       if (getResult < 0)
        {
            /* got a reset message */
            elog(DEBUG4, "cache state reset");
 --- 106,112 ----
        /* Try to get some more messages */
        getResult = SIGetDataEntries(messages, MAXINVALMSGS);

 !       if (getResult != 0)
        {
            /* got a reset message */
            elog(DEBUG4, "cache state reset");

 which forces every occurrence of an incoming sinval message to be treated
 as a reset.  The serial regression tests still work with that, but they
 fall over almost immediately if you run something like this in parallel:

        while psql -c "vacuum full pg_class" regression; do usleep 10; done

 and the parallel regression tests tend to fall over without any outside
 help because of the one occurrence of vacuum full pg_class in them.

Nice detective work.

 I'm inclined to think that a robust solution requires both of the ideas
 I proposed last week: we should update all the relmapping entries during
 pass 1 of RelationCacheInvalidate, *and* make pass 2 do things in a more
 robust order.  I think pg_class, pg_class_oid_index, other nailed
 relations, and then everything else ought to do it.

 Anyway, that's easily fixed now that we have a reproducible case.
 What's bothering me at the moment is that the CLOBBER_CACHE_ALWAYS hack,
 which was meant to expose exactly this sort of problem, failed to do so
 --- buildfarm member jaguar has been running with that flag for ages and
 never showed this problem.  I'm thinking that we should take out the
 hack in AcceptInvalidationMessages and instead put in #ifdeffed code
 that causes ReceiveSharedInvalidMessages to forcibly always call the
 reset function.  Any thoughts about that?

I'm OK with that, but while we're whacking this around, is there any
chance that we could reduce the number of layers of function calls
that are happening in here?

Right now, we call AcceptInvalidationMessages().
...which basically does nothing but call ReceiveSharedInvalidMessages()
...which in turn calls SIGetDataEntries() ... and then, by way of a
function pointer, LocalExecuteInvalidMessage() or perhaps
InvalidateSystemCaches().

It would be nice to move the short-circuit test I recently inserted at
the top of SIGetDataEntries() somewhere higher up in the call stack,
but right now the layers of abstraction are so thick that it's not
exactly clear how to do that.  In my ideal world, I'd like to turn
AcceptInvalidationMessages into something like this:

#define AcceptInvalidationMessages() \
do { if (*flagvar) ReallyDoIt(); } while (0)

...where flagvar is a pointer to myStateP->hasMessages.  This isn't
really a huge hotspot any more, but a cycle saved is a cycle earned,
and we call this frequently enough that it seems awfully wasteful to
push and pop three call stack frames every time we check and find no
messages.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread Tom Lane
I wrote:
 Ahh ... you know what, never mind about stack traces, let's just see if
 the attached patch doesn't fix it.

On reflection, that patch would only fix the issue for pg_class, and
that's not the only catalog that gets consulted during relcache reloads.
I think we'd better do it as attached, instead.

regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 81cea8b..337300f 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*** RelationCacheInvalidate(void)
*** 2159,2164 
--- 2159,2169 
  	List	   *rebuildList = NIL;
  	ListCell   *l;
  
+ 	/*
+ 	 * Reload relation mapping data before starting to reconstruct cache.
+ 	 */
+ 	RelationMapInvalidateAll();
+ 
  	/* Phase 1 */
  	hash_seq_init(status, RelationIdCache);
  
*** RelationCacheInvalidate(void)
*** 2184,2189 
--- 2189,2204 
  		else
  		{
  			/*
+ 			 * If it's a mapped relation, immediately update its rd_node in
+ 			 * case its relfilenode changed.  We must do this during phase 1
+ 			 * in case the relation is consulted during rebuild of other
+ 			 * relcache entries in phase 2.  It's safe since consulting the
+ 			 * map doesn't involve any access to relcache entries.
+ 			 */
+ 			if (RelationIsMapped(relation))
+ RelationInitPhysicalAddr(relation);
+ 
+ 			/*
  			 * Add this entry to list of stuff to rebuild in second pass.
  			 * pg_class_oid_index goes on the front of rebuildFirstList, other
  			 * nailed indexes on the back, and everything else into
*** RelationCacheInvalidate(void)
*** 2209,2219 
  	 */
  	smgrcloseall();
  
- 	/*
- 	 * Reload relation mapping data before starting to reconstruct cache.
- 	 */
- 	RelationMapInvalidateAll();
- 
  	/* Phase 2: rebuild the items found to need rebuild in phase 1 */
  	foreach(l, rebuildFirstList)
  	{
--- 2224,2229 



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread daveg
On Fri, Aug 05, 2011 at 12:10:31PM -0400, Tom Lane wrote:
 I wrote:
  Ahh ... you know what, never mind about stack traces, let's just see if
  the attached patch doesn't fix it.
 
 On reflection, that patch would only fix the issue for pg_class, and
 that's not the only catalog that gets consulted during relcache reloads.
 I think we'd better do it as attached, instead.
 
   regards, tom lane

Should this be applied in addition to the earlier patch, or to replace it?

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread Tom Lane
daveg da...@sonic.net writes:
 Should this be applied in addition to the earlier patch, or to replace it?

Apply it instead of the earlier one.

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Wed, Aug 03, 2011 at 11:18:20AM -0400, Tom Lane wrote:
 Evidently not, if it's not logging anything, but now the question is
 why.  One possibility is that for some reason RelationGetNumberOfBlocks
 is persistently lying about the file size.  (We've seen kernel bugs
 before that resulted in transiently wrong values, so this isn't totally
 beyond the realm of possibility.)  Please try the attached patch, which
 extends the previous one to add a summary line including the number of
 blocks physically scanned by the seqscan.

Ok, I have results from the latest patch and have attached a redacted
server log with the select relfilenode output added inline. This is the
shorter of the logs and shows the sequence pretty clearly. I have additional
logs if wanted.

Summary: the failing process reads 0 rows from 0 blocks from the OLD
relfilenode. 

-dg


-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.
SS.msec  pid   db   user  statement
--  -  ---    ---
03.804  29706  c27  postgres  connection authorized: user=postgres database=c27
...
03.824  29706  c27  postgres  set statement_timeout=60;
03.824  29706  c27  postgres  0.364 ms
03.825  29706  c27  postgres  select current_database() as db, relname, oid, 
pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from 
pg_class where relname like 'pg_class%';
03.829  29706  c27  postgres  4.173 ms
 db  |  relname   | oid  |  filenode  |   ts
-++--++-
 c27 | pg_class   | 1259 | 1245833951 | 21:31:03.828293
 c27 | pg_class_oid_index | 2662 | 1245833955 | 21:31:03.828791
 c27 | pg_class_relname_nsp_index | 2663 | 1259100530 | 21:31:03.828807

03.829  29706  c27  postgres  vacuum full pg_catalog.pg_class;
03.829  29706  c27  postgres  LOCATION: exec_simple_query, postgres.c:900
...
03.845  29707  c27  postgres  disconnection: session time: 0:00:00.041 
user=postgres database=c27 host=bk-0
...
08.856  29706  c27  postgres  process 29706 still waiting for RowExclusiveLock 
on relation 1214 of database 0 after 5000.483 ms
08.856  29706  c27  postgres  LOCATION: ProcSleep, proc.c:1059
08.856  29706  c27  postgres  STATEMENT: vacuum full pg_catalog.pg_class;
09.383  29711  LOG: 0: process 29711 still waiting for AccessShareLock on 
relation 1259 of database 16408 after 5000.331 ms
09.383  29711  LOCATION: ProcSleep, proc.c:1059
...
11.559  28857  c27  apps  SELECT
...
16.560  28857  c27  apps  process 28857 still waiting for AccessShareLock 
on relation 1259 of database 16408 after 5001.209 ms
16.560  28857  c27  apps  LOCATION: ProcSleep, proc.c:1059
16.560  28857  c27  apps  STATEMENT: SELECT
...
19.763  29706  c27  postgres  process 29706 acquired RowExclusiveLock on 
relation 1214 of database 0 after 15907.284 ms
19.763  29706  c27  postgres  LOCATION: ProcSleep, proc.c:1063
19.763  29706  c27  postgres  STATEMENT: vacuum full pg_catalog.pg_class;
...
25.735  29711  LOG: 0: process 29711 acquired AccessShareLock on relation 
1259 of database 16408 after 21352.393 ms
25.735  29711  LOCATION: ProcSleep, proc.c:1063
25.735  28857  c27  apps  process 28857 acquired AccessShareLock on 
relation 1259 of database 16408 after 14176.040 ms
25.735  28857  c27  apps  LOCATION: ProcSleep, proc.c:1063
25.735  28857  c27  apps  STATEMENT: SELECT
25.736  28857  c27  apps  ScanPgRelationDetailed: found 0 tuples with OID 
2662 in 0 blocks of filenode 1245833951
25.736  28857  c27  apps  LOCATION: ScanPgRelationDetailed, relcache.c:372
25.736  28857  c27  apps  STATEMENT: SELECT
25.736  28857  c27  apps  ERROR: XX000: could not find pg_class tuple for 
index 2662
25.736  28857  c27  apps  LOCATION: RelationReloadIndexInfo, relcache.c:1816
25.736  28857  c27  apps  STATEMENT: SELECT
25.736  29706  c27  postgres  21906.865 ms
25.737  29706  c27  postgres  select current_database() as db, relname, oid, 
pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from 
pg_class where relname like 'pg_class%';
25.767  29706  c27  postgres  30.902 ms
 db  |  relname   | oid  |  filenode  |   ts
-++--++-
 c27 | pg_class   | 1259 | 1279787837 | 21:31:25.76726
 c27 | pg_class_oid_index | 2662 | 1279788022 | 21:31:25.767764
 c27 | pg_class_relname_nsp_index | 2663 | 1279788023 | 21:31:25.767782

25.768  29706  c27  postgres  vacuum full pg_catalog.pg_attribute;
25.775  28857  c27  apps  disconnection: session time: 0:07:07.758 
user=apps database=c27 host=op-01
25.775  28857  c27  apps  LOCATION: log_disconnections, postgres.c:4339
...
30.914  29711  LOG: 0: process 29711 still waiting for AccessShareLock on 
relation 1249 of database 16408 

Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread Tom Lane
daveg da...@sonic.net writes:
 Summary: the failing process reads 0 rows from 0 blocks from the OLD
 relfilenode. 

Hmm.  This seems to mean that we're somehow missing a relation mapping
invalidation message, or perhaps not processing it soon enough during
some complex set of invalidations.  I did some testing with that in mind
but couldn't reproduce the failure.  It'd be awfully nice to get a look
at the call stack when this happens for you ... what OS are you running?

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Summary: the failing process reads 0 rows from 0 blocks from the OLD
  relfilenode. 
 
 Hmm.  This seems to mean that we're somehow missing a relation mapping
 invalidation message, or perhaps not processing it soon enough during
 some complex set of invalidations.  I did some testing with that in mind
 but couldn't reproduce the failure.  It'd be awfully nice to get a look
 at the call stack when this happens for you ... what OS are you running?

cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Linux version 2.6.18-194.el5

I can use gdb as well if we can get a core or stop the correct process.
Perhaps a long sleep when it hits this?
Or perhaps we could log invalidate processing for pg_class?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Summary: the failing process reads 0 rows from 0 blocks from the OLD
  relfilenode. 
 
 Hmm.  This seems to mean that we're somehow missing a relation mapping
 invalidation message, or perhaps not processing it soon enough during
 some complex set of invalidations.  I did some testing with that in mind
 but couldn't reproduce the failure.  It'd be awfully nice to get a look
 at the call stack when this happens for you ... what OS are you running?

To recap, a few observations:

When it happens the victim has recently been waiting on a lock for
several seconds.

We create a lot of temp tables, hundreds of thousands a day.

There are catalog vacuum fulls and reindexes running on 30-odd other databases
at the same time. The script estimates the amount of bloat on each table and
index and chooses either reindex on specific indexes or vacuum full as needed.

This is a 32-core (64 with hyperthreading) 512GB host with several hundred
connections.

We are seeing 'cannot read' and 'cannot open' errors too that would be
consistent with trying to use a vanished file.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread Tom Lane
daveg da...@sonic.net writes:
 We are seeing 'cannot read' and 'cannot open' errors too that would be
 consistent with trying to use a vanished file.

Yeah, these all seem consistent with the idea that the failing backend
somehow missed an update for the relation mapping file.  You would get
the "could not find pg_class tuple" syndrome if the process was holding
an open file descriptor for the now-deleted file, and otherwise cannot
open/cannot read type errors.  And unless it later received another
sinval message for the relation mapping file, the errors would persist.

If this theory is correct then all of the file-related errors ought to
match up to recently-vacuumed mapped catalogs or indexes (those are the
ones with relfilenode = 0 in pg_class).  Do you want to expand your
logging of the VACUUM FULL actions and see if you can confirm that idea?

Since the machine is running RHEL, I think we can use glibc's
backtrace() function to get simple stack traces without too much effort.
I'll write and test a patch and send it along in a bit.

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Thu, Aug 04, 2011 at 04:16:08PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  We are seeing 'cannot read' and 'cannot open' errors too that would be
  consistent with trying to use a vanished file.
 
 Yeah, these all seem consistent with the idea that the failing backend
 somehow missed an update for the relation mapping file.  You would get
 the "could not find pg_class tuple" syndrome if the process was holding
 an open file descriptor for the now-deleted file, and otherwise cannot
 open/cannot read type errors.  And unless it later received another
 sinval message for the relation mapping file, the errors would persist.
 
 If this theory is correct then all of the file-related errors ought to
 match up to recently-vacuumed mapped catalogs or indexes (those are the
 ones with relfilenode = 0 in pg_class).  Do you want to expand your
 logging of the VACUUM FULL actions and see if you can confirm that idea?

At your service, what would you like to see?
 
 Since the machine is running RHEL, I think we can use glibc's
 backtrace() function to get simple stack traces without too much effort.
 I'll write and test a patch and send it along in a bit.

Great.

Any point to try to capture SI events somehow?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread Tom Lane
daveg da...@sonic.net writes:
 On Thu, Aug 04, 2011 at 04:16:08PM -0400, Tom Lane wrote:
 If this theory is correct then all of the file-related errors ought to
 match up to recently-vacuumed mapped catalogs or indexes (those are the
 ones with relfilenode = 0 in pg_class).  Do you want to expand your
 logging of the VACUUM FULL actions and see if you can confirm that idea?

 At your service, what would you like to see?

I was thinking log the before-and-after filenode values each time you do
a VACUUM FULL, and then go through the logs to see if all the
file-related complaints refer to recently obsoleted filenodes.

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread Tom Lane
Ahh ... you know what, never mind about stack traces, let's just see if
the attached patch doesn't fix it.

I still haven't reproduced the behavior here, but I think I see what
must be happening: we are getting an sinval reset while attempting to
open pg_class_oid_index.  The latter condition causes its refcount to
be above 1, which will cause RelationClearRelation to directly call
RelationReloadIndexInfo, which enables the following sequence of calls:
RelationCacheInvalidate -> RelationClearRelation -> RelationReloadIndexInfo.
And the problem is that RelationCacheInvalidate intentionally forces
pg_class_oid_index to be updated first.  That was okay when the code was
written, because the relcache entry for pg_class itself never really
needed any updates.  Now it does, so we have to make sure pg_class gets
updated first, *then* pg_class_oid_index (which might result in a
seqscan of pg_class), then everything else (for which we'll try to use
pg_class_oid_index to locate their pg_class tuples).

regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 81cea8b..0e4b17c 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*** RelationCacheInvalidate(void)
*** 2185,2204 
  		{
  			/*
  			 * Add this entry to list of stuff to rebuild in second pass.
! 			 * pg_class_oid_index goes on the front of rebuildFirstList, other
! 			 * nailed indexes on the back, and everything else into
! 			 * rebuildList (in no particular order).
  			 */
! 			if (relation->rd_isnailed &&
! 				relation->rd_rel->relkind == RELKIND_INDEX)
  			{
  if (RelationGetRelid(relation) == ClassOidIndexId)
- 	rebuildFirstList = lcons(relation, rebuildFirstList);
- else
  	rebuildFirstList = lappend(rebuildFirstList, relation);
  			}
  			else
! rebuildList = lcons(relation, rebuildList);
  		}
  	}
  
--- 2185,2207 
  		{
  			/*
  			 * Add this entry to list of stuff to rebuild in second pass.
! 			 * pg_class goes on the front of rebuildFirstList,
! 			 * pg_class_oid_index goes to the back of rebuildFirstList, other
! 			 * nailed indexes go on the front of rebuildList, and everything
! 			 * else goes to the back of rebuildList.
  			 */
! 			if (RelationGetRelid(relation) == RelationRelationId)
! rebuildFirstList = lcons(relation, rebuildFirstList);
! 			else if (relation->rd_isnailed &&
! 					 relation->rd_rel->relkind == RELKIND_INDEX)
  			{
  if (RelationGetRelid(relation) == ClassOidIndexId)
  	rebuildFirstList = lappend(rebuildFirstList, relation);
+ else
+ 	rebuildList = lcons(relation, rebuildList);
  			}
  			else
! rebuildList = lappend(rebuildList, relation);
  		}
  	}
  



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-03 Thread daveg
On Mon, Aug 01, 2011 at 01:23:49PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
  I think we need to start adding some instrumentation so we can get a
  better handle on what's going on in your database.  If I were to send
  you a source-code patch for the server that adds some more logging
  printout when this happens, would you be willing/able to run a patched
  build on your machine?
 
  Yes we can run an instrumented server so long as the instrumentation does
  not interfere with normal operation. However, scheduling downtime to switch
  binaries is difficult, and generally needs to be happen on a weekend, but
  sometimes can be expedited. I'll look into that.
 
 OK, attached is a patch against 9.0 branch that will re-scan pg_class
 after a failure of this sort occurs, and log what it sees in the tuple
 header fields for each tuple for the target index.  This should give us
 some useful information.  It might be worthwhile for you to also log the
 results of
 
 select relname,pg_relation_filenode(oid) from pg_class
 where relname like 'pg_class%';
 
 in your script that does VACUUM FULL, just before and after each time it
 vacuums pg_class.  That will help in interpreting the relfilenodes in
 the log output.

We have installed the patch and have encountered the error as usual.
However there is no additional output from the patch. I'm speculating
that the pg_class scan in ScanPgRelationDetailed() fails to return
tuples somehow.


I have also been trying to trace it further by reading the code, but have not
arrived at a solid hypothesis yet. In the absence of any debugging output I've
been trying to deduce the call tree leading to the original failure. So far
it looks like this:

RelationReloadIndexInfo(Relation)                    // Relation is 2662 and !rd_isvalid
  pg_class_tuple = ScanPgRelation(2662, indexOK=false)     // returns NULL
    pg_class_desc = heap_open(1259, ACC_SHARE)
      r = relation_open(1259, ACC_SHARE)             // locks oid, ensures RelationIsValid(r)
        r = RelationIdGetRelation(1259)
          r = RelationIdCacheLookup(1259)            // assume success
          if !rd_isvalid:
            RelationClearRelation(r, true)
              RelationInitPhysicalAddr(r)            // r is pg_class relcache

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-03 Thread Tom Lane
daveg da...@sonic.net writes:
 We have installed the patch and have encountered the error as usual.
 However there is no additional output from the patch. I'm speculating
 that the pg_class scan in ScanPgRelationDetailed() fails to return
 tuples somehow.

Evidently not, if it's not logging anything, but now the question is
why.  One possibility is that for some reason RelationGetNumberOfBlocks
is persistently lying about the file size.  (We've seen kernel bugs
before that resulted in transiently wrong values, so this isn't totally
beyond the realm of possibility.)  Please try the attached patch, which
extends the previous one to add a summary line including the number of
blocks physically scanned by the seqscan.

regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 81cea8b60406dffa7b5d278697ab5ad6cef6a3d8..589513a2d35987b2202ce1162b326db2d358b6fa 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
***
*** 32,37 
--- 32,38 
  
  #include "access/genam.h"
  #include "access/reloptions.h"
+ #include "access/relscan.h"
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
***
*** 64,69 
--- 65,71 
  #include "optimizer/var.h"
  #include "rewrite/rewriteDefine.h"
  #include "storage/fd.h"
+ #include "storage/bufmgr.h"
  #include "storage/lmgr.h"
  #include "storage/smgr.h"
  #include "utils/array.h"
*** ScanPgRelation(Oid targetRelId, bool ind
*** 310,315 
--- 312,388 
  }
  
  /*
+  *		ScanPgRelationDetailed
+  *
+  *		Try to figure out why we failed to locate row for relation.
+  */
+ static HeapTuple
+ ScanPgRelationDetailed(Oid targetRelId)
+ {
+ 	HeapTuple	pg_class_tuple;
+ 	Relation	pg_class_desc;
+ 	SysScanDesc pg_class_scan;
+ 	ScanKeyData key[1];
+ 	int			count = 0;
+ 
+ 	/*
+ 	 * form a scan key
+ 	 */
+ 	ScanKeyInit(&key[0],
+ 				ObjectIdAttributeNumber,
+ 				BTEqualStrategyNumber, F_OIDEQ,
+ 				ObjectIdGetDatum(targetRelId));
+ 
+ 	/*
+ 	 * Open pg_class and fetch tuples, forcing heap scan and disabling
+ 	 * visibility checks.
+ 	 */
+ 	pg_class_desc = heap_open(RelationRelationId, AccessShareLock);
+ 	pg_class_scan = systable_beginscan(pg_class_desc, ClassOidIndexId,
+ 									   false,
+ 									   SnapshotAny,
+ 									   1, key);
+ 
+ 	while (HeapTupleIsValid((pg_class_tuple = systable_getnext(pg_class_scan))))
+ 	{
+ 		Buffer		buf = pg_class_scan->scan->rs_cbuf;
+ 		bool		valid;
+ 
+ 		count++;
+ 
+ 		/* need buffer lock to call HeapTupleSatisfiesVisibility */
+ 		LockBuffer(buf, BUFFER_LOCK_SHARE);
+ 		valid = HeapTupleSatisfiesVisibility(pg_class_tuple,
+ 			 SnapshotNow,
+ 			 buf);
+ 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
+ 
+ 		elog(LOG, "searching %u for pg_class tuple for index %u: found ctid (%u,%u), xmin %u, xmax %u, flags 0x%4x 0x%4x, valid %d",
+ 			 pg_class_desc->rd_node.relNode,
+ 			 targetRelId,
+ 			 ItemPointerGetBlockNumber(&(pg_class_tuple->t_self)),
+ 			 ItemPointerGetOffsetNumber(&(pg_class_tuple->t_self)),
+ 			 HeapTupleHeaderGetXmin(pg_class_tuple->t_data),
+ 			 HeapTupleHeaderGetXmax(pg_class_tuple->t_data),
+ 			 pg_class_tuple->t_data->t_infomask,
+ 			 pg_class_tuple->t_data->t_infomask2,
+ 			 valid);
+ 	}
+ 
+ 	elog(LOG, "ScanPgRelationDetailed: found %d tuples with OID %u in %u blocks of filenode %u",
+ 		 count,
+ 		 targetRelId,
+ 		 pg_class_scan->scan->rs_nblocks,
+ 		 pg_class_desc->rd_node.relNode);
+ 
+ 	/* all done */
+ 	systable_endscan(pg_class_scan);
+ 	heap_close(pg_class_desc, AccessShareLock);
+ 
+ 	return NULL;
+ }
+ 
+ /*
   *		AllocateRelationDesc
   *
   *		This is used to allocate memory for a new relation descriptor
*** RelationReloadIndexInfo(Relation relatio
*** 1737,1744 
  	indexOK = (RelationGetRelid(relation) != ClassOidIndexId);
  	pg_class_tuple = ScanPgRelation(RelationGetRelid(relation), indexOK);
  	if (!HeapTupleIsValid(pg_class_tuple))
! 		elog(ERROR, "could not find pg_class tuple for index %u",
! 			 RelationGetRelid(relation));
  	relp = (Form_pg_class) GETSTRUCT(pg_class_tuple);
  	memcpy(relation-rd_rel, relp, CLASS_TUPLE_SIZE);
  	/* Reload reloptions in case they changed */
--- 1810,1824 
  	indexOK = (RelationGetRelid(relation) != ClassOidIndexId);
  	pg_class_tuple = ScanPgRelation(RelationGetRelid(relation), indexOK);
  	if (!HeapTupleIsValid(pg_class_tuple))
! 	{
! 		pg_class_tuple = ScanPgRelationDetailed(RelationGetRelid(relation));
! 		if (!HeapTupleIsValid(pg_class_tuple))
! 			elog(ERROR, "could not find pg_class tuple for index %u",
! 				 RelationGetRelid(relation));
! 		else
! 			elog(LOG, "could not find pg_class tuple for index %u, but succeeded on second try",
! 				 RelationGetRelid(relation));
! 	}
  	relp = (Form_pg_class) GETSTRUCT(pg_class_tuple);
  	memcpy(relation-rd_rel, relp, CLASS_TUPLE_SIZE);
  	/* Reload reloptions in case they changed */


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-01 Thread Tom Lane
daveg da...@sonic.net writes:
 On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
 I think we need to start adding some instrumentation so we can get a
 better handle on what's going on in your database.  If I were to send
 you a source-code patch for the server that adds some more logging
 printout when this happens, would you be willing/able to run a patched
 build on your machine?

 Yes we can run an instrumented server so long as the instrumentation does
 not interfere with normal operation. However, scheduling downtime to switch
 binaries is difficult, and generally needs to happen on a weekend, but
 sometimes can be expedited. I'll look into that.

OK, attached is a patch against 9.0 branch that will re-scan pg_class
after a failure of this sort occurs, and log what it sees in the tuple
header fields for each tuple for the target index.  This should give us
some useful information.  It might be worthwhile for you to also log the
results of

select relname,pg_relation_filenode(oid) from pg_class
where relname like 'pg_class%';

in your script that does VACUUM FULL, just before and after each time it
vacuums pg_class.  That will help in interpreting the relfilenodes in
the log output.

 My observations so far are:

  - the error occurs at commit of vacuum full of pg_class
  - in these cases error hits autovacuum after it waited for a lock on pg_class
  - in these two cases there was a new process startup while the vacuum was
running. Don't know if this is relevant.

Interesting.  We'll want to know whether that happens every time.

 - while these hit autovacuum, the error does hit other processes (just not in
these sessions).  Unknown if autovacuum is a required component.

Good question.  Please consider setting log_autovacuum_min_duration = 0
so that the log also traces all autovacuum activity.

regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 81cea8b60406dffa7b5d278697ab5ad6cef6a3d8..d7957ba4e0055a4f9f7721e06b72cbeb0ddb350e 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
***
*** 32,37 
--- 32,38 
  
  #include "access/genam.h"
  #include "access/reloptions.h"
+ #include "access/relscan.h"
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
***
*** 64,69 
--- 65,71 
  #include "optimizer/var.h"
  #include "rewrite/rewriteDefine.h"
  #include "storage/fd.h"
+ #include "storage/bufmgr.h"
  #include "storage/lmgr.h"
  #include "storage/smgr.h"
  #include "utils/array.h"
*** ScanPgRelation(Oid targetRelId, bool ind
*** 310,315 
--- 312,379 
  }
  
  /*
+  *		ScanPgRelationDetailed
+  *
+  *		Try to figure out why we failed to locate row for relation.
+  */
+ static HeapTuple
+ ScanPgRelationDetailed(Oid targetRelId)
+ {
+ 	HeapTuple	pg_class_tuple;
+ 	Relation	pg_class_desc;
+ 	SysScanDesc pg_class_scan;
+ 	ScanKeyData key[1];
+ 
+ 	/*
+ 	 * form a scan key
+ 	 */
+ 	ScanKeyInit(&key[0],
+ 				ObjectIdAttributeNumber,
+ 				BTEqualStrategyNumber, F_OIDEQ,
+ 				ObjectIdGetDatum(targetRelId));
+ 
+ 	/*
+ 	 * Open pg_class and fetch tuples, forcing heap scan and disabling
+ 	 * visibility checks.
+ 	 */
+ 	pg_class_desc = heap_open(RelationRelationId, AccessShareLock);
+ 	pg_class_scan = systable_beginscan(pg_class_desc, ClassOidIndexId,
+ 									   false,
+ 									   SnapshotAny,
+ 									   1, key);
+ 
+ 	while (HeapTupleIsValid((pg_class_tuple = systable_getnext(pg_class_scan))))
+ 	{
+ 		Buffer		buf = pg_class_scan->scan->rs_cbuf;
+ 		bool		valid;
+ 
+ 		/* need buffer lock to call HeapTupleSatisfiesVisibility */
+ 		LockBuffer(buf, BUFFER_LOCK_SHARE);
+ 		valid = HeapTupleSatisfiesVisibility(pg_class_tuple,
+ 			 SnapshotNow,
+ 			 buf);
+ 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
+ 
+ 		elog(LOG, "searching %u for pg_class tuple for index %u: found ctid (%u,%u), xmin %u, xmax %u, flags 0x%4x 0x%4x, valid %d",
+ 			 pg_class_desc->rd_node.relNode,
+ 			 targetRelId,
+ 			 ItemPointerGetBlockNumber(&(pg_class_tuple->t_self)),
+ 			 ItemPointerGetOffsetNumber(&(pg_class_tuple->t_self)),
+ 			 HeapTupleHeaderGetXmin(pg_class_tuple->t_data),
+ 			 HeapTupleHeaderGetXmax(pg_class_tuple->t_data),
+ 			 pg_class_tuple->t_data->t_infomask,
+ 			 pg_class_tuple->t_data->t_infomask2,
+ 			 valid);
+ 	}
+ 
+ 	/* all done */
+ 	systable_endscan(pg_class_scan);
+ 	heap_close(pg_class_desc, AccessShareLock);
+ 
+ 	return NULL;
+ }
+ 
+ /*
   *		AllocateRelationDesc
   *
   *		This is used to allocate memory for a new relation descriptor
*** RelationReloadIndexInfo(Relation relatio
*** 1737,1744 
  	indexOK = (RelationGetRelid(relation) != ClassOidIndexId);
  	pg_class_tuple = ScanPgRelation(RelationGetRelid(relation), indexOK);
  	if (!HeapTupleIsValid(pg_class_tuple))
! 		elog(ERROR, "could not find pg_class tuple for index %u",
! 			 RelationGetRelid(relation));
  	relp = 

Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-31 Thread daveg
On Thu, Jul 28, 2011 at 11:31:31PM -0700, daveg wrote:
 On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
  REINDEX.  My guess is that this is happening either right around the
  time the VACUUM FULL commits or right around the time the REINDEX
  commits.  It'd be helpful to know which, if you can figure it out.
 
 I'll update my vacuum script to skip reindexes after vacuum full for 9.0
 servers and see if that makes the problem go away. Thanks for reminding
 me that they are not needed. However, I suspect it is the vacuum, not the
 reindex causing the problem. I'll update when I know.

Here is the update: the problem happens with vacuum full alone, no reindex
is needed to trigger it. I updated the script to avoid reindexing after
vacuum. Over the past two days there are still many occurrences of this
error coincident with the vacuum.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-31 Thread Tom Lane
daveg da...@sonic.net writes:
 Here is the update: the problem happens with vacuum full alone, no reindex
 is needed to trigger it. I updated the script to avoid reindexing after
 vacuum. Over the past two days there are still many ocurrances of this
 error coincident with the vacuum.

Well, that jibes with the assumption that the one case we saw in
the buildfarm was the same thing, because the regression tests were
certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
But it doesn't get us much closer to understanding what's happening.
In particular, it seems to knock out most ideas associated with race
conditions, because the VAC FULL should hold exclusive lock on pg_class
until it's completely done (including index rebuilds).

I think we need to start adding some instrumentation so we can get a
better handle on what's going on in your database.  If I were to send
you a source-code patch for the server that adds some more logging
printout when this happens, would you be willing/able to run a patched
build on your machine?

(BTW, just to be perfectly clear ... the "could not find pg_class tuple"
errors always mention index 2662, right, never any other number?)

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-31 Thread daveg
On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Here is the update: the problem happens with vacuum full alone, no reindex
  is needed to trigger it. I updated the script to avoid reindexing after
  vacuum. Over the past two days there are still many ocurrances of this
  error coincident with the vacuum.
 
 Well, that jibes with the assumption that the one case we saw in
 the buildfarm was the same thing, because the regression tests were
 certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
 But it doesn't get us much closer to understanding what's happening.
 In particular, it seems to knock out most ideas associated with race
 conditions, because the VAC FULL should hold exclusive lock on pg_class
 until it's completely done (including index rebuilds).
 
 I think we need to start adding some instrumentation so we can get a
 better handle on what's going on in your database.  If I were to send
 you a source-code patch for the server that adds some more logging
 printout when this happens, would you be willing/able to run a patched
 build on your machine?

Yes we can run an instrumented server so long as the instrumentation does
not interfere with normal operation. However, scheduling downtime to switch
binaries is difficult, and generally needs to happen on a weekend, but
sometimes can be expedited. I'll look into that.

 (BTW, just to be perfectly clear ... the "could not find pg_class tuple"
 errors always mention index 2662, right, never any other number?)

Yes, only index 2662, never any other.

I'm attaching a somewhat redacted log for two different databases on the same
instance around the time of vacuum full of pg_class in each database.
My observations so far are:

 - the error occurs at commit of vacuum full of pg_class
 - in these cases error hits autovacuum after it waited for a lock on pg_class
 - in these two cases there was a new process startup while the vacuum was
   running. Don't know if this is relevant.
 - while these hit autovacuum, the error does hit other processes (just not in
   these sessions).  Unknown if autovacuum is a required component.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.
NOTES:
-- most statements for db c01 in one catalog vacuum session
-- database c01 is oid 16404
-- applications run as user 'app'
-- vacuums run as user postgres
-- somewhat redacted

MM:SS.ms   pid    db   user      log text
---------  -----  ---  --------  --------------------------------------------

03:00.350  2716   c01  postgres  vacuum full pg_catalog.pg_class;
03:00.507  2866  LOG: process 2866 acquired ShareUpdateExclusiveLock on 
relation 2604 of database 16416 after 10143.750 ms
03:01.222  21685  c01  app   11100.670 ms
03:01.223  21685  c01  app  SELECT DISTINCT
03:01.224  21685  c01  app   1.524 ms
03:02.031  3001   c01  app  connection authorized: user=app database=c01
03:04.952  2787  LOG: process 2787 still waiting for AccessShareLock on 
relation 1259 of database 16412 after 5001.035 ms
03:05.065  2787  LOG: process 2787 acquired AccessShareLock on relation 1259 of 
database 16412 after 5114.253 ms
03:05.564  2977  LOG: process 2977 still waiting for AccessShareLock on 
relation 1259 of database 16404 after 5000.970 ms
03:05.640  2731  LOG: process 2731 still waiting for RowExclusiveLock on 
relation 1259 of database 16404 after 5000.186 ms
03:06.045  2977  LOG: process 2977 acquired AccessShareLock on relation 1259 of 
database 16404 after 5482.389 ms
03:06.045  2731  LOG: process 2731 acquired RowExclusiveLock on relation 1259 
of database 16404 after 5405.652 ms
03:06.046  2731  ERROR: could not find pg_class tuple for index 2662
03:06.046  2731  CONTEXT: automatic vacuum of table c01.pg_catalog.pg_index
03:06.046  2731  ERROR: could not find pg_class tuple for index 2662
03:06.046  2716   c01  postgres   5696.537 ms
03:06.056  3001   c01  app  SET SESSION TIME ZONE 'UTC'; ...
03:06.057  2716   c01  postgres  vacuum full pg_catalog.pg_rewrite;
03:06.066  3001   c01  app   10.459 ms
03:06.091  3001   c01  app  disconnection: session time: 0:00:04.085 user=app 
database=c01 host=xxxl01
03:08.908  3006  LOG: process 3006 still waiting for AccessShareLock on 
relation 2659 of database 16407 after 5000.778 ms
03:11.777  21685  c01  app  SELECT
03:11.779  21685  c01  app   2.296 ms
03:11.779  21685  c01  app  SELECT
03:11.780  21685  c01  app   0.328 ms
03:11.798  21685  c01  app  SELECT
03:11.799  21685  c01  app   0.348 ms
03:11.800  21685  c01  app  SELECT
03:11.800  21685  c01  app   0.205 ms
03:11.804  21685  c01  app  SELECT
03:11.805  21685  c01  app   0.589 ms
03:11.806  21685  c01  app  SELECT DISTINCT
03:11.809  21685  c01  app   3.552 ms
03:11.810  21685  c01  app  SELECT
03:12.554  21685  c01  app   744.233 ms
03:12.555  21685  c01  app  SELECT DISTINCT
03:12.556  21685  c01  app   1.136 ms
03:13.153  2716   c01  postgres   

Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread daveg
On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
 On Thu, Jul 28, 2011 at 5:46 PM, daveg da...@sonic.net wrote:
  On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
  On Wed, Jul 27, 2011 at 8:28 PM, daveg da...@sonic.net wrote:
   My client has been seeing regular instances of the following sort of 
   problem:
  On what version of PostgreSQL?
 
  9.0.4.
 
  I previously said:
   This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
   identical systems still running 8.4.8 that do not have this issue, so I'm
   assuming it is related to the vacuum full work done for 9.0. Oddly, we 
   don't
   see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
   so it may be timing related.
 
 Ah, OK, sorry.  Well, in 9.0, VACUUM FULL is basically CLUSTER, which
 means that a REINDEX is happening as part of the same operation.  In
 9.0, there's no point in doing VACUUM FULL immediately followed by
 REINDEX.  My guess is that this is happening either right around the
 time the VACUUM FULL commits or right around the time the REINDEX
 commits.  It'd be helpful to know which, if you can figure it out.

I'll update my vacuum script to skip reindexes after vacuum full for 9.0
servers and see if that makes the problem go away. Thanks for reminding
me that they are not needed. However, I suspect it is the vacuum, not the
reindex causing the problem. I'll update when I know.

 If there's not a hardware problem causing those read errors, maybe a
 backend is somehow ending up with a stale or invalid relcache entry.
 I'm not sure exactly how that could be happening, though...

It does not appear to be a hardware problem. I also suspect it is a stale
relcache.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread Tom Lane
daveg da...@sonic.net writes:
 On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
 Ah, OK, sorry.  Well, in 9.0, VACUUM FULL is basically CLUSTER, which
 means that a REINDEX is happening as part of the same operation.  In
 9.0, there's no point in doing VACUUM FULL immediately followed by
 REINDEX.  My guess is that this is happening either right around the
 time the VACUUM FULL commits or right around the time the REINDEX
 commits.  It'd be helpful to know which, if you can figure it out.

 I'll update my vacuum script to skip reindexes after vacuum full for 9.0
 servers and see if that makes the problem go away.

The thing that was bizarre about the one instance in the buildfarm was
that the error was persistent, ie, once a session had failed all its
subsequent attempts to access pg_class failed too.  I gather from Dave's
description that it's working that way for him too.  I can think of ways
that there might be a transient race condition against a REINDEX, but
it's very unclear why the failure would persist across multiple
attempts.  The best idea I can come up with is that the session has
somehow cached a wrong commit status for the reindexing transaction,
causing it to believe that both old and new copies of the index's
pg_class row are dead ... but how could that happen?  The underlying
state in the catalog is not wrong, because no concurrent sessions are
upset (at least not in the buildfarm case ... Dave, do you see more than
one session doing this at a time?).

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread Robert Haas
On Fri, Jul 29, 2011 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 daveg da...@sonic.net writes:
 On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
 Ah, OK, sorry.  Well, in 9.0, VACUUM FULL is basically CLUSTER, which
 means that a REINDEX is happening as part of the same operation.  In
 9.0, there's no point in doing VACUUM FULL immediately followed by
 REINDEX.  My guess is that this is happening either right around the
 time the VACUUM FULL commits or right around the time the REINDEX
 commits.  It'd be helpful to know which, if you can figure it out.

 I'll update my vacuum script to skip reindexes after vacuum full for 9.0
 servers and see if that makes the problem go away.

 The thing that was bizarre about the one instance in the buildfarm was
 that the error was persistent, ie, once a session had failed all its
 subsequent attempts to access pg_class failed too.  I gather from Dave's
 description that it's working that way for him too.  I can think of ways
 that there might be a transient race condition against a REINDEX, but
 it's very unclear why the failure would persist across multiple
 attempts.  The best idea I can come up with is that the session has
 somehow cached a wrong commit status for the reindexing transaction,
 causing it to believe that both old and new copies of the index's
 pg_class row are dead ... but how could that happen?  The underlying
 state in the catalog is not wrong, because no concurrent sessions are
 upset (at least not in the buildfarm case ... Dave, do you see more than
 one session doing this at a time?).

I was thinking more along the lines of a failure while processing a
sinval message emitted by the REINDEX.  The sinval message doesn't get
fully processed and therefore we get confused about what the
relfilenode is for pg_class.  If that happened for any other relation,
we could recover by scanning pg_class.  But if it happens for pg_class
or pg_class_oid_index, we're toast, because we can't scan them without
knowing what relfilenode to open.

Now that can't be quite right, because of course those are mapped
relations... and I'm pretty sure there are some other holes in my line
of reasoning too.  Just thinking out loud...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 29, 2011 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The thing that was bizarre about the one instance in the buildfarm was
 that the error was persistent, ie, once a session had failed all its
 subsequent attempts to access pg_class failed too.

 I was thinking more along the lines of a failure while processing a
 sinval message emitted by the REINDEX.  The sinval message doesn't get
 fully processed and therefore we get confused about what the
 relfilenode is for pg_class.  If that happened for any other relation,
 we could recover by scanning pg_class.  But if it happens for pg_class
 or pg_class_oid_index, we're toast, because we can't scan them without
 knowing what relfilenode to open.

Well, no, because the ScanPgRelation call is not failing internally.
It's performing a seqscan of pg_class and not finding a matching tuple.

You could hypothesize about maybe an sinval message got missed leading
us to scan the old (pre-VAC-FULL) copy of pg_class, but that still
wouldn't explain how come we can't find a valid-looking entry for
pg_class_oid_index in it.

Tis a puzzlement.

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread Robert Haas
On Fri, Jul 29, 2011 at 11:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 29, 2011 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The thing that was bizarre about the one instance in the buildfarm was
 that the error was persistent, ie, once a session had failed all its
 subsequent attempts to access pg_class failed too.

 I was thinking more along the lines of a failure while processing a
 sinval message emitted by the REINDEX.  The sinval message doesn't get
 fully processed and therefore we get confused about what the
 relfilenode is for pg_class.  If that happened for any other relation,
 we could recover by scanning pg_class.  But if it happens for pg_class
 or pg_class_oid_index, we're toast, because we can't scan them without
 knowing what relfilenode to open.

 Well, no, because the ScanPgRelation call is not failing internally.
 It's performing a seqscan of pg_class and not finding a matching tuple.

SnapshotNow race?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 29, 2011 at 11:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, no, because the ScanPgRelation call is not failing internally.
 It's performing a seqscan of pg_class and not finding a matching tuple.

 SnapshotNow race?

That's what I would have guessed to start with, except that then it
wouldn't be a persistent failure.  You'd get one, but then the next
query would succeed.

regards, tom lane



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread daveg
On Fri, Jul 29, 2011 at 09:55:46AM -0400, Tom Lane wrote:
 The thing that was bizarre about the one instance in the buildfarm was
 that the error was persistent, ie, once a session had failed all its
 subsequent attempts to access pg_class failed too.  I gather from Dave's
 description that it's working that way for him too.  I can think of ways
 that there might be a transient race condition against a REINDEX, but
 it's very unclear why the failure would persist across multiple
 attempts.  The best idea I can come up with is that the session has
 somehow cached a wrong commit status for the reindexing transaction,
 causing it to believe that both old and new copies of the index's
 pg_class row are dead ... but how could that happen?  The underlying

It is definitely persistent. Once triggered, the error occurs for any new
statement until the session exits.

 state in the catalog is not wrong, because no concurrent sessions are
 upset (at least not in the buildfarm case ... Dave, do you see more than
 one session doing this at a time?).

It looks like it happens to multiple sessions so far as one can tell from
the timestamps of the errors:

 timestamp     sessionid      error
 ------------  -------------  ------------------------------------------
 03:05:37.434 4e26a861.4a6d could not find pg_class tuple for index 2662
 03:05:37.434 4e26a861.4a6f could not find pg_class tuple for index 2662

 03:06:12.041 4e26a731.438e could not find pg_class tuple for index 2662

 03:06:12.042 4e21b6a3.629b could not find pg_class tuple for index 2662
 03:06:12.042 4e26a723.42ec could not find pg_class tuple for index 2662 at 
character 13


-dg

-- 
David Gould   da...@sonic.net  510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-28 Thread Robert Haas
On Wed, Jul 27, 2011 at 8:28 PM, daveg da...@sonic.net wrote:
 My client has been seeing regular instances of the following sort of problem:

On what version of PostgreSQL?

 If simplicity worked, the world would be overrun with insects.

I thought it was...  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-28 Thread daveg
On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
 On Wed, Jul 27, 2011 at 8:28 PM, daveg da...@sonic.net wrote:
  My client has been seeing regular instances of the following sort of 
  problem:
 On what version of PostgreSQL?

9.0.4.

I previously said:
  This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
  identical systems still running 8.4.8 that do not have this issue, so I'm
  assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
  see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
  so it may be timing related.

-dg

-- 
David Gould   da...@sonic.net   510 536 1443   510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 5:46 PM, daveg da...@sonic.net wrote:
 On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
 On Wed, Jul 27, 2011 at 8:28 PM, daveg da...@sonic.net wrote:
  My client has been seeing regular instances of the following sort of 
  problem:
 On what version of PostgreSQL?

 9.0.4.

 I previously said:
  This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
  identical systems still running 8.4.8 that do not have this issue, so I'm
  assuming it is related to the vacuum full work done for 9.0. Oddly, we 
  don't
  see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
  so it may be timing related.

Ah, OK, sorry.  Well, in 9.0, VACUUM FULL is basically CLUSTER, which
means that a REINDEX is happening as part of the same operation.  In
9.0, there's no point in doing VACUUM FULL immediately followed by
REINDEX.  My guess is that this is happening either right around the
time the VACUUM FULL commits or right around the time the REINDEX
commits.  It'd be helpful to know which, if you can figure it out.
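If I follow, the maintenance job could drop the REINDEX step on 9.0. A
sketch (the table name is just an example; behavior as described above, not
verified here):

```sql
-- On 8.4, VACUUM FULL compacts in place, so a separate REINDEX is useful:
VACUUM FULL pg_catalog.pg_class;
REINDEX TABLE pg_catalog.pg_class;  -- worthwhile on 8.4

-- On 9.0, VACUUM FULL rewrites the table and rebuilds its indexes,
-- much like CLUSTER, so the first statement alone suffices:
VACUUM FULL pg_catalog.pg_class;
```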

If there's not a hardware problem causing those read errors, maybe a
backend is somehow ending up with a stale or invalid relcache entry.
I'm not sure exactly how that could be happening, though...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] error: could not find pg_class tuple for index 2662

2011-07-27 Thread daveg

My client has been seeing regular instances of the following sort of problem:

...
 03:06:09.453 exec_simple_query, postgres.c:900
 03:06:12.042 XX000: could not find pg_class tuple for index 2662 at character 
13
 03:06:12.042 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.042 INSERT INTO zzz_k(k) SELECT ...
 03:06:12.045 0: statement: ABORT
 03:06:12.045 exec_simple_query, postgres.c:900
 03:06:12.045 0: duration: 0.100 ms
 03:06:12.045 exec_simple_query, postgres.c:1128
 03:06:12.046 0: statement: INSERT INTO temp_807
  VALUES (...)
 03:06:12.046 exec_simple_query, postgres.c:900
 03:06:12.046 XX000: could not find pg_class tuple for index 2662 at character 
13
 03:06:12.046 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.046 INSERT INTO temp_807
  VALUES (...)
 03:06:12.096 08P01: unexpected EOF on client connection
 03:06:12.096 SocketBackend, postgres.c:348
 03:06:12.096 XX000: could not find pg_class tuple for index 2662
 03:06:12.096 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.121 0: disconnection: session time: 0:06:08.537 user=ZZZ 
database=ZZZ_01
 03:06:12.121 log_disconnections, postgres.c:4339


The above happens regularly (but not completely predictably), coinciding
with a daily cronjob that checks the catalogs for bloat and does VACUUM FULL
and/or REINDEX as needed. Since some of the applications make very heavy
use of temp tables, this usually means pg_class and pg_index get a VACUUM
FULL and REINDEX.
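For reference, the bloat check is roughly along these lines (a simplified,
hypothetical sketch, not the actual script; only pg_class statistics are
consulted here, where a real check would use finer-grained measurements):

```sql
-- Rough bloat heuristic from pg_class statistics alone: catalog tables
-- and indexes occupying the most pages are candidates for VACUUM FULL
-- and/or REINDEX.
SELECT c.oid::regclass AS relation,
       c.relkind,
       c.relpages,
       c.reltuples
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog'
  AND c.relkind IN ('r', 'i')
ORDER BY c.relpages DESC
LIMIT 10;
```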

Sometimes queries will fail due to being unable to open a table's underlying
file. On investigation the file is absent from both the catalogs and the
filesystem, so I don't know what table it refers to:

 20:41:19.063  ERROR:  could not open file 
pg_tblspc/16401/PG_9.0_201008051/16413/1049145092: No such file or directory
 20:41:19.063  STATEMENT:  insert into r_ar__30
   select aid, mid, pid, sum(wdata) as wdata, ...
--
 20:41:19.430  ERROR:  could not open file 
pg_tblspc/16401/PG_9.0_201008051/16413/1049145092: No such file or directory
 20:41:19.430  STATEMENT: SELECT nextval('j_id_seq')


Finally, I have seen several instances of a failure to read data by
VACUUM FULL itself:

 03:05:45.699 0: statement: vacuum full pg_catalog.pg_index;
 03:05:45.699 exec_simple_query, postgres.c:900
 03:05:46.142 XX001: could not read block 65 in file 
pg_tblspc/16401/PG_9.0_201008051/16416/1049146489: read only 0 of 8192 bytes
 03:05:46.142 mdread, md.c:656
 03:05:46.142 vacuum full pg_catalog.pg_index;

This occurs on PostgreSQL 9.0.4 on 32-core 512GB Dell boxes. We have
identical systems still running 8.4.8 that do not have this issue, so I'm
assuming it is related to the VACUUM FULL work done for 9.0. Oddly, we don't
see this on the smaller hosts (8-core, 64GB, slower CPUs) running 9.0.4,
so it may be timing related.

This seems possibly related to the issues in:

  Bizarre buildfarm failure on baiji: can't find pg_class_oid_index
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02038.php
  Broken HOT chains in system catalogs
http://archives.postgresql.org/pgsql-hackers/2011-04/msg00777.php

As far as I can tell from the logs I have, once a session sees one of these
errors, any subsequent query will hit it again until the session exits.
However, it does not seem to harm other sessions or leave any persistent
damage (crossing fingers and hoping here).

I'm ready to do any testing/investigation/instrumented builds etc. that may
be helpful in resolving this.

Regards

-dg

-- 
David Gould   da...@sonic.net   510 536 1443   510 282 0869
If simplicity worked, the world would be overrun with insects.
