This is a continuation of the discussion at http://www.postgresql.org/message-id/CAMkU=1zUc=h0oczntajaqqw7gxxvxcwsyq8dd2t7ohgsgve...@mail.gmail.com. I'm starting a new thread, as this is a separate issue from the original LWLock bug.

On Thu, Jul 16, 2015 at 12:03 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:

On Wed, Jul 15, 2015 at 8:44 AM, Heikki Linnakangas <hlinn...@iki.fi>
wrote:

I don't see how this is related to the LWLock issue, but I didn't see it
without your patch.  Perhaps the system just didn't survive long enough to
uncover it without the patch (although it shows up pretty quickly).  It
could just be an overzealous Assert, since a build with casserts off didn't
show problems.

bt and bt full are shown below.

Cheers,

Jeff

#0  0x0000003dcb632625 in raise () from /lib64/libc.so.6
#1  0x0000003dcb633e05 in abort () from /lib64/libc.so.6
#2  0x0000000000930b7a in ExceptionalCondition (
    conditionName=0x9a1440 "!(((PageHeader) (page))->pd_special >= (__builtin_offsetof (PageHeaderData, pd_linp)))",
    errorType=0x9a12bc "FailedAssertion",
    fileName=0x9a12b0 "ginvacuum.c", lineNumber=713) at assert.c:54
#3  0x00000000004947cf in ginvacuumcleanup (fcinfo=0x7fffee073a90) at ginvacuum.c:713


It now looks like this *is* unrelated to the LWLock issue.  The assert that
it is tripping over was added just recently (302ac7f27197855afa8c) and so I
had not been testing under its presence until now.  It looks like it is
finding all-zero pages (index extended but then a crash before initializing
the page?) and it doesn't like them.

(gdb) f 3
(gdb) p *(char[8192]*)(page)
$11 = '\000' <repeats 8191 times>

Presumably before this assert, such pages would just be permanently
orphaned.

Yeah, so it seems. It's normal to have all-zero pages in the index, if you crash immediately after the relation has been extended, but before the new page has been WAL-logged. What is your test case like; did you do crash-testing?

ISTM ginvacuumcleanup should check for PageIsNew, and put the page into the FSM. That's what btvacuumpage() and gistvacuumcleanup() do. spgvacuumpage() also seems to check for PageIsNew(), but it seems broken in a different way: it initializes the page and marks it dirty, but the change is not WAL-logged. That is a problem at least if checksums are enabled: if you crash, you might have a torn page on disk with an invalid checksum.
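
To illustrate the idea, here is a rough standalone sketch of the "check for a new page before looking at the header" pattern. It is a toy model, not PostgreSQL code: ToyPage, ToyPageHeader, ToyVacuumStats and every toy_* function are hypothetical names made up for this example, and counting a page in free_pages merely stands in for recording it in the FSM. The one thing borrowed from the real code is that PageIsNew() tests whether pd_upper is zero, which is true of an all-zero page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLCKSZ 8192

/* Hypothetical, simplified header; NOT PostgreSQL's PageHeaderData. */
typedef struct ToyPageHeader
{
    uint16_t lower;     /* offset to start of free space */
    uint16_t upper;     /* offset to end of free space; 0 = never initialized */
    uint16_t special;   /* offset to start of special space */
} ToyPageHeader;

/* A page is a fixed-size block whose first bytes are the header. */
typedef union ToyPage
{
    ToyPageHeader hdr;
    unsigned char bytes[TOY_BLCKSZ];
} ToyPage;

/* Hypothetical counters; free_pages stands in for "recorded in the FSM". */
typedef struct ToyVacuumStats
{
    size_t free_pages;
    size_t used_pages;
} ToyVacuumStats;

/* Toy analogue of PageIsNew(): an all-zero page has upper == 0. */
static int
toy_page_is_new(const ToyPage *page)
{
    return page->hdr.upper == 0;
}

/* Toy analogue of the failing assertion: special space must lie past the header. */
static int
toy_special_is_sane(const ToyPage *page)
{
    return page->hdr.special >= sizeof(ToyPageHeader);
}

/* Initialize a page, reserving special_size bytes at the end of the block. */
static void
toy_page_init(ToyPage *page, uint16_t special_size)
{
    memset(page, 0, sizeof(*page));
    page->hdr.lower = (uint16_t) sizeof(ToyPageHeader);
    page->hdr.upper = (uint16_t) (TOY_BLCKSZ - special_size);
    page->hdr.special = (uint16_t) (TOY_BLCKSZ - special_size);
}

/* Scan all pages; treat never-initialized ones as reusable instead of asserting. */
static ToyVacuumStats
toy_vacuum_cleanup(const ToyPage *pages, size_t npages)
{
    ToyVacuumStats stats = {0, 0};

    for (size_t i = 0; i < npages; i++)
    {
        if (toy_page_is_new(&pages[i]))
        {
            /* All-zero page left behind by a crash: reusable, skip header checks. */
            stats.free_pages++;
            continue;
        }

        /* Only initialized pages are expected to pass the sanity check. */
        assert(toy_special_is_sane(&pages[i]));
        stats.used_pages++;
    }
    return stats;
}
```

In this toy, a zeroed page fails toy_special_is_sane() for the same reason the real assertion fires (its special offset is 0), so the new-page test has to come first. The sketch deliberately leaves out WAL-logging and checksums, which are the parts spgvacuumpage() appears to get wrong.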

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
