Re: [HACKERS] Hanging backends and possible index corruption

2013-01-26 Thread Bernd Helmle



--On 25. Januar 2013 20:37:32 -0500 Tom Lane t...@sss.pgh.pa.us wrote:



Don't know how careful pgbtreecheck is.  The pg_filedump output isn't
very helpful because you filtered away the flags, so we can't tell if
any of these pages are deleted.  (If they are, the duplicate-looking
links might not be errors, since we intentionally don't reset a deleted
page's left/right links when deleting it.)



Ah, wasn't aware of this.


Could we see the whole special-space dump for each of the pages you're
worried about?



Attached


One thought that occurs to me is that POWER is a weak-memory-ordering
architecture, so that it's a tenable idea that this has something to do
with changing page links while not holding sufficient lock on the page.
I don't see btree doing that anywhere, but ...

BTW, how long has this installation been around, and when did you start
seeing funny behavior?  Can you say with reasonable confidence that the
bug was *not* present in any older PG versions?



This machine started in production around august last year with (afair) 
9.1.5. There also were performance and stress tests on this machine before 
it went into production, with no noticable problems.


However, what i missed before is that there were some trouble with the 
storage multipaths. Seems early january the machine lost some of it's paths 
to the SAN, but they were recovered a few seconds later...so i cannot 
exclude this as the cause anymore, though the paths are redundant. What 
strikes me is that the index was recreated in the meantime after this 
issue...


We will watch this machine the next couple of weeks more closely, if the 
issue comes back again.


--
Thanks

Bernd

dump.txt.bz2
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hanging backends and possible index corruption

2013-01-25 Thread Bernd Helmle
We are currently analyzing an issue at one of our customers PostgreSQL 
database.


The current version is 9.1.6 (update to 9.1.7 is scheduled for next monday, 
no downtime possible before). It runs on POWER7 (pSeries 740) on an RHEL6.3 
64-bit LPAR. The packages are built from PGDG SVN sources, no special 
tweaks added. We saw no hardware related errors on this machine, nor any 
crashes.


What currently happens on this machine are hanging statements (SELECTs and 
INSERTs occasionally) with 100% CPU. After some investigation it turned out 
that the corresponding backends are seeking within an index file over and 
over again in a loop. Looking into the hanging queries i've recognized 
certain keys which seems to have the problem, other keys used in the WHERE 
condition run smoothly. Turning off index and bitmap index scans caused the 
suspicious keys to return results, too.


So i've saved the index file (normal BTree index with a single bigint 
column), did a REINDEX and the problem was gone. Looking at the index file 
with pg_filedump and pgbtreecheck from Alvaro gave me the following output:


pgbtreecheck gives warnings about pages' parents and then loops visiting 
the same pages over and over again:


NOTICE: fast root: block 290 at level 2
NOTICE: setting page parents
WARNING: block 12709 already had a parent (8840); new parent is 12177
WARNING: block 12710 already had a parent (12439); new parent is 10835
NOTICE: done setting parents
NOTICE: Checking forward scan of level 0, starting at block 1

-- loop starts

WARNING: right sibling 12710 does not point left to me (11680); points to 
10924 instead


Looking into the relevant pages and their prev and next pointers give me 
the following:


pg_filedump -i -R 11680 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (12651)  Next (12710)  Level (0)  CycleId (0)

pg_filedump -i -R 10924 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (10923)  Next (12710)  Level (0)  CycleId (0)

pg_filedump -i -R 12710 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (10924)  Next (10925)  Level (0)  CycleId (0)

pg_filedump -i -R 10925 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (12710)  Next (10926)  Level (0)  CycleId (0)

pg_filedump -i -R 12709 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (8849)  Next (8850)  Level (0)  CycleId (0)

pg_filedump -i -R 8840 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (8555)  Next (9125)  Level (1)  CycleId (0)

pg_filedump -i -R 12439 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (11405)  Next (11690)  Level (1)  CycleId (0)

$ pg_filedump -i -R 12177 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (11690)  Next (0)  Level (1)  CycleId (0)

$ pg_filedump -i -R 10835 ~/tmp/100252789 | grep Blocks
 Blocks: Previous (10550)  Next (11120)  Level (1)  CycleId (0)

This symptom happened three times in the last couple of weeks now. Looking 
at the numbers doesn't give me the impression that some flaky hardware 
could be involved. What else can we do to track down this problem, any 
suggestions?


--
Thanks

Bernd


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hanging backends and possible index corruption

2013-01-25 Thread Andres Freund
On 2013-01-25 16:24:52 +0100, Bernd Helmle wrote:
 We are currently analyzing an issue at one of our customers PostgreSQL
 database.
 
 The current version is 9.1.6 (update to 9.1.7 is scheduled for next monday,
 no downtime possible before). It runs on POWER7 (pSeries 740) on an RHEL6.3
 64-bit LPAR. The packages are built from PGDG SVN sources, no special tweaks
 added. We saw no hardware related errors on this machine, nor any crashes.
 
 What currently happens on this machine are hanging statements (SELECTs and
 INSERTs occasionally) with 100% CPU. After some investigation it turned out
 that the corresponding backends are seeking within an index file over and
 over again in a loop. Looking into the hanging queries i've recognized
 certain keys which seems to have the problem, other keys used in the WHERE
 condition run smoothly. Turning off index and bitmap index scans caused the
 suspicious keys to return results, too.
 
 So i've saved the index file (normal BTree index with a single bigint
 column), did a REINDEX and the problem was gone. Looking at the index file
 with pg_filedump and pgbtreecheck from Alvaro gave me the following output:
 
 pgbtreecheck gives warnings about pages' parents and then loops visiting the
 same pages over and over again:
 
 NOTICE: fast root: block 290 at level 2
 NOTICE: setting page parents
 WARNING: block 12709 already had a parent (8840); new parent is 12177
 WARNING: block 12710 already had a parent (12439); new parent is 10835
 NOTICE: done setting parents
 NOTICE: Checking forward scan of level 0, starting at block 1
 
 -- loop starts
 
 WARNING: right sibling 12710 does not point left to me (11680); points to
 10924 instead
 
 Looking into the relevant pages and their prev and next pointers give me the
 following:
 
 pg_filedump -i -R 11680 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (12651)  Next (12710)  Level (0)  CycleId (0)
 
 pg_filedump -i -R 10924 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (10923)  Next (12710)  Level (0)  CycleId (0)
 
 pg_filedump -i -R 12710 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (10924)  Next (10925)  Level (0)  CycleId (0)
 
 pg_filedump -i -R 10925 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (12710)  Next (10926)  Level (0)  CycleId (0)
 
 pg_filedump -i -R 12709 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (8849)  Next (8850)  Level (0)  CycleId (0)
 
 pg_filedump -i -R 8840 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (8555)  Next (9125)  Level (1)  CycleId (0)
 
 pg_filedump -i -R 12439 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (11405)  Next (11690)  Level (1)  CycleId (0)
 
 $ pg_filedump -i -R 12177 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (11690)  Next (0)  Level (1)  CycleId (0)
 
 $ pg_filedump -i -R 10835 ~/tmp/100252789 | grep Blocks
  Blocks: Previous (10550)  Next (11120)  Level (1)  CycleId (0)
 
 This symptom happened three times in the last couple of weeks now. Looking
 at the numbers doesn't give me the impression that some flaky hardware could
 be involved. What else can we do to track down this problem, any
 suggestions?

Did you reindex after upgrading to 9.1.6? Did you ever have any crashes
or failovers before upgrading to 9.1.6?
I have seen pretty similar symptoms caused by Fix persistence marking
of shared buffers during WAL replay in 9.1.6.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hanging backends and possible index corruption

2013-01-25 Thread Bernd Helmle



--On 25. Januar 2013 16:28:16 +0100 Andres Freund and...@2ndquadrant.com 
wrote:



Did you reindex after upgrading to 9.1.6? Did you ever have any crashes
or failovers before upgrading to 9.1.6?
I have seen pretty similar symptoms caused by Fix persistence marking
of shared buffers during WAL replay in 9.1.6.


Hmm it happened only on a single heavily used table for now and this table 
was REINDEXed twice after updating to 9.1.6 (every time the issue occured).


--
Thanks

Bernd


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hanging backends and possible index corruption

2013-01-25 Thread Tom Lane
Bernd Helmle maili...@oopsware.de writes:
 So i've saved the index file (normal BTree index with a single bigint 
 column), did a REINDEX and the problem was gone. Looking at the index file 
 with pg_filedump and pgbtreecheck from Alvaro gave me the following output:
 ...

Don't know how careful pgbtreecheck is.  The pg_filedump output isn't
very helpful because you filtered away the flags, so we can't tell if
any of these pages are deleted.  (If they are, the duplicate-looking
links might not be errors, since we intentionally don't reset a deleted
page's left/right links when deleting it.)

Could we see the whole special-space dump for each of the pages you're
worried about?

One thought that occurs to me is that POWER is a weak-memory-ordering
architecture, so that it's a tenable idea that this has something to do
with changing page links while not holding sufficient lock on the page.
I don't see btree doing that anywhere, but ...

BTW, how long has this installation been around, and when did you start
seeing funny behavior?  Can you say with reasonable confidence that the
bug was *not* present in any older PG versions?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers