We have recently seen a couple of instances of WAL recovery failing because of the newly added code that validates a page header as soon as the page is read in; for example, Olivier Prenant's crash report here:
http://archives.postgresql.org/pgsql-hackers/2003-10/msg01505.php

This failure is actually entirely pointless, because (AFAIK) any page that is brought in during WAL recovery is going to be overwritten in toto from the WAL log. So it would be safe to run WAL recovery with zero_damaged_pages enabled. Rather than expecting DBAs to think of that under the stress of a crashed-database situation, I propose that we do it for them:

*** src/backend/storage/buffer/bufmgr.c.orig	Fri Nov 21 12:41:31 2003
--- src/backend/storage/buffer/bufmgr.c	Sat Nov 29 13:35:14 2003
***************
*** 231,237 ****
          if (status == SM_SUCCESS &&
              !PageHeaderIsValid((PageHeader) MAKE_PTR(bufHdr->data)))
          {
!             if (zero_damaged_pages)
              {
                  ereport(WARNING,
                          (errcode(ERRCODE_DATA_CORRUPTED),
--- 231,237 ----
          if (status == SM_SUCCESS &&
              !PageHeaderIsValid((PageHeader) MAKE_PTR(bufHdr->data)))
          {
!             if (zero_damaged_pages || InRecovery)
              {
                  ereport(WARNING,
                          (errcode(ERRCODE_DATA_CORRUPTED),

Comments?

			regards, tom lane
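
For anyone who wants to see the effect of the changed condition outside the buffer manager, here is a rough standalone sketch. It is not the bufmgr.c code: the page layout, the validity test, and every sketch_/Sketch/SKETCH_ name are made up for illustration, and the in_recovery argument merely stands in for the InRecovery flag used in the patch. The sketch also assumes that an all-zeroes page counts as valid, so that a zeroed page passes a later check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192   /* hypothetical page size for this sketch */

/* Hypothetical, simplified header; NOT PostgreSQL's real page header. */
typedef struct SketchPageHeader
{
    uint16_t lower;     /* offset to start of free space */
    uint16_t upper;     /* offset to end of free space */
    uint16_t special;   /* offset to start of special space */
} SketchPageHeader;

typedef enum
{
    PAGE_OK,            /* header looked sane, page left alone */
    PAGE_ZEROED,        /* header was bad, page was zeroed (WARNING case) */
    PAGE_ERROR          /* header was bad, caller should fail (ERROR case) */
} SketchReadResult;

/* Simplified sanity check on the header offsets (an assumption of this sketch). */
static bool
sketch_header_is_valid(const unsigned char *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof hdr);

    /* Assumption: an all-zeroes header is accepted as a fresh page. */
    if (hdr.lower == 0 && hdr.upper == 0 && hdr.special == 0)
        return true;

    return hdr.lower >= sizeof hdr &&
           hdr.lower <= hdr.upper &&
           hdr.upper <= hdr.special &&
           hdr.special <= SKETCH_PAGE_SIZE;
}

/*
 * Mirrors the shape of the patched test: a bad header is zeroed when either
 * zero_damaged_pages is set or we are in recovery; otherwise it is an error.
 */
static SketchReadResult
sketch_check_page(unsigned char *page, bool zero_damaged_pages, bool in_recovery)
{
    assert(page != NULL);

    if (sketch_header_is_valid(page))
        return PAGE_OK;

    if (zero_damaged_pages || in_recovery)
    {
        memset(page, 0, SKETCH_PAGE_SIZE);
        return PAGE_ZEROED;
    }

    return PAGE_ERROR;
}
```

With a buffer filled with garbage, sketch_check_page(page, false, false) reports PAGE_ERROR and leaves the buffer untouched, while the same call with in_recovery set to true zeroes the buffer and reports PAGE_ZEROED. That is the behavior the proposal relies on, under the assumption stated above that recovery rewrites the page from the WAL anyway.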