On Thu, Feb 9, 2017 at 5:56 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Feb 8, 2017 at 1:25 AM, Kuntal Ghosh <kuntalghosh.2...@gmail.com> 
> wrote:
>> Thank you Robert for the review. Please find the updated patch in the
>> attachment.
> I have committed this patch after fairly extensive revisions:

Cool. I had finally a look at what has been committed in a507b869.
Running regression tests with all RMGRs enabled, a single installcheck
generates 7GB of WAL. Woah.

> * Rewrote the documentation to give some idea what the underlying
> mechanism of operation of the feature is, so that users who choose to
> enable this will hopefully have some understanding of what they've
> turned on.

Thanks, those look good to me.

> * Renamed "char *page" arguments to "char *pagedata" and "Page page",
> because tempPage doesn't seem to be to be any better a name than
> page_norm.

> * Removed assertion in checkXLogConsistency that consistency checking
> is enabled for the RM; that's trivially false if
> wal_consistency_checking is not the same on the master and the
> standby.  (Note that quite apart from the issue of whether this
> function should exist at all, adding it to a header file after the
> closing #endif guard is certainly not right.)

I recall doing those two things the same way as in the commit. Not
sure at which point they have been re-introduced.

> * Changed checkXLogConsistency to skip pages whose LSN is newer than
> that of the record.  Without this, if you shut down recovery and
> restart it, it complains of inconsistent pages and dies.  (I'm not
> sure this is the only scenario that needs to be covered; it would be
> smart to do more testing of restarting the standby.)

Good point.

> -- a WAL consistency failure causes your standby to die a hard death.
> (Maybe there should be a way to suppress consistency checking on the
> standby -- but I think not just by requiring wal_consistency_checking
> on both ends.  Or maybe we should just downgrade the FATAL to WARNING;
> blowing up the standby irrevocably seems like poor behavior.)

Having a FATAL is useful for buildfarm members, that would show up in
red. Having a switch to generate a warning would be useful for live
deployments I agree. Now I think that we need as well two things:
- A recovery test to run regression tests with a standby behind.
- Extend the TAP tests so as it is possible to fill in postgresql.conf
with custom variables.
- have the buildfarm client run recovery tests!
I am fine to write those patches.

> I also bumped XLOG_PAGE_MAGIC (which is normally done by the
> committer, not the patch author, so I wasn't expecting that to be in
> the patch as submitted).

Here are a couple of things I have noticed while looking at the code.

+ * Portions Copyright (c) 2016, PostgreSQL Global Development Group
s/2016/2017/ in bufmask.c and bufmask.h.

+       if (ItemIdIsNormal(iid))
+       {
+           HeapTupleHeader page_htup = (HeapTupleHeader) page_item;
Unnecessary newline here.

+        * Read the contents from the backup copy, stored in WAL record and
+        * store it in a temporary page. There is not need to allocate a new
+        * page here, a local buffer is fine to hold its contents and a mask
+        * can be directly applied on it.
s/not need/no need/.

In checkXLogConsistency(), FPWs that have the flag BKPIMAGE_APPLY set
will still be checked, resulting in a FPW being compared to itself. I
think that those had better be bypassed.

Please find attached a patch with those fixes.

Attachment: consistency-checks-fix.patch
Description: Binary data

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to