We have seen a couple of instances lately of WAL recovery failing because
of the recently added code that validates a page header as soon as the
page is read in; for example, Olivier Prenant's crash report here:

http://archives.postgresql.org/pgsql-hackers/2003-10/msg01505.php

This failure is actually entirely pointless, because (AFAIK) any page
that is brought in during WAL recovery is going to be overwritten in
toto from the WAL.  So it would be safe to run WAL recovery with
zero_damaged_pages enabled.  Rather than expecting DBAs to think of that
under the stress of a crashed-database situation, I propose that we do
it for them:

*** src/backend/storage/buffer/bufmgr.c.orig    Fri Nov 21 12:41:31 2003
--- src/backend/storage/buffer/bufmgr.c Sat Nov 29 13:35:14 2003
***************
*** 231,237 ****
                if (status == SM_SUCCESS &&
                        !PageHeaderIsValid((PageHeader) MAKE_PTR(bufHdr->data)))
                {
!                       if (zero_damaged_pages)
                        {
                                ereport(WARNING,
                                                (errcode(ERRCODE_DATA_CORRUPTED),
--- 231,237 ----
                if (status == SM_SUCCESS &&
                        !PageHeaderIsValid((PageHeader) MAKE_PTR(bufHdr->data)))
                {
!                       if (zero_damaged_pages || InRecovery)
                        {
                                ereport(WARNING,
                                                (errcode(ERRCODE_DATA_CORRUPTED),
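
To see the effect of the one-line change in isolation, here is a minimal
standalone sketch of the decision, not the actual bufmgr.c code.  The
helper name accept_page_after_read() and its header_is_valid argument
are hypothetical; zero_damaged_pages and InRecovery are modeled as plain
flags rather than the real backend globals, and the sketch assumes the
WARNING branch goes on to zero the page, which the hunk above does not
show.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192             /* assumed page size for this sketch */

/* Stand-ins for the backend globals, modeled as plain flags. */
static bool zero_damaged_pages = false;
static bool InRecovery = false;

/*
 * Hypothetical helper: decide what to do with a page just read from disk.
 * Returns true if the caller may use the page (possibly after zeroing it),
 * false if the caller should report an error instead.
 */
static bool
accept_page_after_read(char *page, bool header_is_valid)
{
    if (header_is_valid)
        return true;

    /* The proposed change: recovery behaves as if zero_damaged_pages is on. */
    if (zero_damaged_pages || InRecovery)
    {
        fprintf(stderr, "WARNING: invalid page header; zeroing out page\n");
        memset(page, 0, BLCKSZ);
        return true;
    }

    return false;
}
```

With both flags off, a bad header is rejected and the page is left
untouched; setting InRecovery alone is enough to get the page zeroed and
accepted, which is the behavior the patch is after.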


Comments?

                        regards, tom lane
