I dug through the logs and found some other occurrences of 
the "could not read block" errors. Some on dirt-simple 
SELECT queries. Nothing else has generated the btree 
error yet. About 35 found in the last month.

> This theory would be more plausible if you're wrong about the second-case 
> tables being temp, though, because if they are temp then their indexes 
> would be kept in local buffers not shared buffers, making it much harder 
> to believe in a single bug causing both symptoms.

I grepped the last month of logs and found about 20 instances of 
that error: none of the relfilenodes given shows up in pg_class, 
even for that dirt simple SELECT.
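
For anyone who wants to repeat that check, here is a rough sketch of 
pulling the relfilenodes out of the log lines so they can be compared 
against pg_class.relfilenode. The message pattern is an assumption 
about the 8.3-style wording ("could not read block N of relation 
spc/db/relfilenode"), and the sample lines and numbers are invented 
for illustration, not taken from the real logs.

```python
import re
from collections import Counter

# Assumed 8.3-style message wording; adjust the pattern if your logs differ.
PATTERN = re.compile(
    r"could not read block (\d+) of relation (\d+)/(\d+)/(\d+)"
)


def relfilenodes(lines):
    """Count "could not read block" errors per relfilenode."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            # Group 4 is the relfilenode (third part of spc/db/relfilenode).
            counts[int(m.group(4))] += 1
    return counts


# Invented sample lines, for illustration only.
sample = [
    "ERROR:  could not read block 3 of relation 1663/16385/99001: read only 0 of 8192 bytes",
    "LOG:  checkpoint starting",
    "ERROR:  could not read block 7 of relation 1663/16385/99001: read only 0 of 8192 bytes",
    "ERROR:  could not read block 0 of relation 1663/16385/99002: read only 0 of 8192 bytes",
]
counts = relfilenodes(sample)
```

Each key in counts can then be looked up in pg_class.relfilenode in 
the affected database; a key with no matching row is the situation 
described above.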

> One possible mechanism for confusion of that sort would be if the
> spinlock code wasn't quite right, or the compiler was incorrectly
> moving loads/stores into or out of locked sections.  So it might be
> time to ask exactly what kind of hardware this is, which compiler
> PG was built with, etc.

Quad-core AMD Opteron. RHEL. Compiled with gcc using the usual options 
(basically the standard compilation, e.g. --build=x86_64-redhat-linux-gnu).
I can give you more details offlist if it will help.

> On the other hand, if the issue were of that sort then it ought to
> affect all buffers more or less at random; so if you're consistently
> seeing exactly these symptoms (in particular, if it's always
> pg_class_oid_index that's complained of), then I'm not sure I believe
> this theory either.

I've never seen any other index for the btree error, but it has only 
happened a grand total of 3 times ever. The other error appears to 
be fairly random, except that the one particular query that gives 
the btree error always seems to give one version or the other.

> Which PG version again?  Are you in the habit of doing VACUUM FULLs
> on system catalogs, and if so do these glitches correlate at all
> with such activities?

Heh. 8.3.18. Yes, very heavy vac fulls (and reindexes) of the system 
catalogs. Cron-driven, and depends on the time of day and if any 
DDL is running (if so, it does not run), but probably on average 
pg_class is vacfulled and reindexed twice an hour during the times 
this happens (which is during normal business hours). There is a lot 
of churn in pg_class, pg_attribute, and pg_depend in particular from 
all the temp stuff being created and torn down all day, as well as 
some Bucardo pg_class updating.

-- 
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
