Excerpts from Tom Lane's message of mar nov 22 01:14:33 -0300 2011: > Alvaro Herrera <alvhe...@alvh.no-ip.org> writes: > > We got a very strange nbtree corruption report some time ago. This was > > a btree index on a vey high churn table -- entries are updated and > > deleted very quickly, so the index grows very large and also shrinks > > quickly (AFAICT this is a work queue of sorts). > > > The most strange thing of all is that there was this error: > > > ERROR: left link changed unexpectedly in block 3378 of index "index_name" > > CONTEXT: automatic vacuum of table "table_name" > > > This was reported not once, but several dozens of times, by each new > > autovacuum worker that tried to vacuum the table. > > > As far as I can see, there is just no way for this to happen ... much > > less happen repeatedly. > > It's not hard to believe that that would happen repeatedly given a > corrupted set of sibling links, eg deletable page A links left to page > B, which links right to C, which links right to A. The question is how > the index got into such a state. A dropped update during a page split > would explain it (ie, B used to be A's left sibling, then at some point > B got split into B and C, but A's left-link never got updated on disk). > I wonder how reliable their disk+filesystem is ...
While summarizing this, a (relatively) frequent problem with unique indexes came to my mind: there would be a UNIQUE index but when the admin tries to rebuild it for whatever reason, duplicate values are found. We've seen dozens of reports of this kind of problem (in the pgsql lists I mean -- I don't think we've seen this problem in this customer's servers). I wonder if it's related, because it seems pretty much the same mechanism: sometimes, a btree index insert would be randomly forgotten (its page write lost in vacuum, so to speak), and thus when the second heap item comes along, there's no entry in the index and the insert completes, and there you have your duplicate value. I wonder if it would be worthwhile to build some sort of tool to scan the heap and ensure that there are index entries for all heap items, just to test the hypothesis. Not that this would enlighten on the source of the actual problem. -- Álvaro Herrera <alvhe...@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers