On Thu, Nov 10, 2016 at 10:25 AM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Wed, Nov 9, 2016 at 11:32 PM, Kuntal Ghosh >> Thanks a lot for reviewing the patch. Based on your review, I've attached the
>> I've done a fair amount of testing which includes regression tests >> and manual creation of inconsistencies in the page. I've also found the >> reason behind inconsistency in brin revmap page. >> Brin revmap page doesn't have standard page layout. Besides, It doesn't >> update >> pd_upper and pd_lower in its operations as well. But, during WAL >> insertions, it uses >> REGBUF_STANDARD to register a reference in the WAL record. Hence, when we >> restore image before consistency check, RestoreBlockImage fills the space >> between pd_upper and pd_lower(page hole) with zero. I've posted this as a >> separate thread. >> https://www.postgresql.org/message-id/flat/CAGz5QCJ%3D00UQjScSEFbV%3D0qO5ShTZB9WWz_Fm7%2BWd83zPs9Geg%40mail.gmail.com#CAGz5QCJ=00UQjScSEFbV=0qo5shtzb9wwz_fm7+wd83zps9...@mail.gmail.com > > Nice to have spotted the inconsistency. This tool is going to be useful.. > > I have spent a couple of hours playing with the patch, and worked on > it a bit more with a couple of minor changes: > - In gindesc.c, the if blocks are more readable with brackets. > - Addition of a comment when info is set, to mention that this is done > at the beginning of the routine so as callers of XLogInsert() can pass > the flag for consistency checks. > - apply_image should be reset in ResetDecoder(). > - The BRIN problem is here: > 2016-11-10 12:24:10 JST [65776]: [23-1] db=,user=,app=,client= FATAL: > Inconsistent page found, rel 1663/16385/30625, forknum 0, blkno 1 > 2016-11-10 12:24:10 JST [65776]: [24-1] db=,user=,app=,client= > CONTEXT: xlog redo at 0/9BD31148 for BRIN/UPDATE+INIT: heapBlk 100 > pagesPerRange 1 old offnum 11, new offnum 1 > 2016-11-10 12:24:10 JST [65776]: [25-1] db=,user=,app=,client= > WARNING: buffer refcount leak: [4540] (rel=base/16385/30625, > blockNum=1, flags=0x93800000, refcount=1 1) > TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 2506) > Now the buffer refcount leak is not normal! The safest thing to do > here is to release the buffer once a copy of it has been taken, and > the leaks goes away when calling FATAL to report the inconsistency. > - When checking for XLR_CHECK_CONSISTENCY, better to add a != 0 to get > a boolean out of it. > - guc_malloc() should be done as late as possible, this simplifies the > code and prevents any memory leaks which is what your patch is doing > when there is an error. (I have finally put my finger on what was > itching me here). > - In btree_mask, the lookup of BTP_DELETED can be deadly simplified. > - I wondered also about putting assign_wal_consistency and > check_wal_consistency in a separate file, say xlogparams.c, concluding > that the current code does nothing bad either even if it adds rmgr.h > in the list of headers in guc.c. > - Some comment blocks are longer than 72~80 characters. > - Typos! All the changes make perfect sense to me. > With the patch for BRIN applied, I am able to get installcheck-world > working with wal_consistency = all and a standby doing the consistency > checks behind. Adding wal_consistency = all in PostgresNode.pm, the > recovery tests are passing. This patch is switched as "Ready for > Committer". Thanks for completing this effort begun 3 years ago! Thanks to you for reviewing all the patches in so much detail. Amit, Robert and Dilip also helped me a lot in developing the feature. Thanks to them as well. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers