Re: [HACKERS] WAL consistency check facility

2016-08-31 Thread Simon Riggs
On 27 August 2016 at 12:09, Kuntal Ghosh wrote: >>> * wal_consistency_mask = 511 /* Enable consistency check mask bit*/ >> >> What does this mean? (No docs) > > I was using this parameter as a masking integer to indicate the > operations (rmgr list) for which we need
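
A minimal sketch of how an integer such as 511 can act as a per-rmgr mask, assuming one bit per resource manager. The bit-to-rmgr assignment is not given in the thread, and the helper names `enabled_bits` and `is_checked` are hypothetical, not taken from the patch.

```python
def enabled_bits(mask):
    """Return the positions of the bits set in mask, lowest bit first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def is_checked(mask, rmgr_id):
    """Assumed convention: bit N of the mask enables checking for rmgr N."""
    return bool(mask & (1 << rmgr_id))


# 511 is 0b111111111, so the nine lowest bits are set.
print(enabled_bits(511))
print(is_checked(511, 8), is_checked(511, 9))
```

Under this assumed convention, 511 would enable the check for the first nine resource managers and leave every higher-numbered one unchecked.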

Re: [HACKERS] WAL consistency check facility

2016-08-28 Thread Kuntal Ghosh
Thank you. I've updated it accordingly. On Sun, Aug 28, 2016 at 11:20 AM, Peter Geoghegan wrote: > On Sat, Aug 27, 2016 at 9:47 PM, Amit Kapila wrote: >> Right, I think there is no need to mask all the flags. However apart >> from BTP_HAS_GARBAGE, it

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Peter Geoghegan
On Sat, Aug 27, 2016 at 9:47 PM, Amit Kapila wrote: > Right, I think there is no need to mask all the flags. However apart > from BTP_HAS_GARBAGE, it seems we should mask BTP_SPLIT_END as that is > just used to save some processing for vacuum and won't be set after >

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Amit Kapila
On Sun, Aug 28, 2016 at 6:26 AM, Peter Geoghegan wrote: > On Thu, Aug 25, 2016 at 9:41 AM, Kuntal Ghosh > wrote: >> 2. For Btree pages, I've masked BTP_HALF_DEAD, BTP_SPLIT_END, >> BTP_HAS_GARBAGE and BTP_INCOMPLETE_SPLIT flags. > > Why? I think that

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 9:41 AM, Kuntal Ghosh wrote: > 2. For Btree pages, I've masked BTP_HALF_DEAD, BTP_SPLIT_END, > BTP_HAS_GARBAGE and BTP_INCOMPLETE_SPLIT flags. Why? I think that you should only perform this kind of masking where it's clearly strictly necessary.
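
A rough sketch of why masking more flags than necessary weakens the check. The flag values below follow my reading of nbtree.h and should be treated as illustrative; `mask_flags` and `flags_consistent` are hypothetical helpers, not functions from the patch. Which flags are actually safe to mask is the open question in this thread; the sketch only assumes that BTP_HAS_GARBAGE is a hint.

```python
# Btree page flag bits (illustrative values, assumed to match nbtree.h).
BTP_HALF_DEAD = 1 << 4
BTP_SPLIT_END = 1 << 5
BTP_HAS_GARBAGE = 1 << 6
BTP_INCOMPLETE_SPLIT = 1 << 7


def mask_flags(flags, maskable):
    """Clear the bits listed in maskable so they are ignored in a comparison."""
    return flags & ~maskable


def flags_consistent(primary, replayed, maskable):
    """Compare two flag words after masking the ignorable bits."""
    return mask_flags(primary, maskable) == mask_flags(replayed, maskable)


primary = BTP_INCOMPLETE_SPLIT | BTP_HAS_GARBAGE
replayed = 0  # hypothetical replay result that lost both flags

narrow = BTP_HAS_GARBAGE
broad = BTP_HALF_DEAD | BTP_SPLIT_END | BTP_HAS_GARBAGE | BTP_INCOMPLETE_SPLIT

print(flags_consistent(primary, replayed, narrow))  # difference is reported
print(flags_consistent(primary, replayed, broad))   # difference is hidden
```

With the narrow mask the lost BTP_INCOMPLETE_SPLIT bit shows up as an inconsistency; with the broad mask the same difference passes silently.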

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Peter Geoghegan
On Fri, Aug 26, 2016 at 7:24 AM, Alvaro Herrera wrote: >> As the block numbers are different, I was getting the following warning: >> WARNING: Inconsistent page (at byte 8166) found for record >> 0/127F4A48, rel 1663/16384/16946, forknum 0, blkno 0, Backup Page >>

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Michael Paquier
On Sat, Aug 27, 2016 at 6:16 PM, Simon Riggs wrote: > On 27 August 2016 at 07:36, Amit Kapila wrote: >> On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggs wrote: >>> >>> I think you should add this as part of the default testing

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Kuntal Ghosh
Hello Simon, I'm really sorry for the inconveniences. Next time, I'll attach the patch with proper documentation, test and comments. > I think you should add this as part of the default testing for both > check and installcheck. I can't imagine why we'd have it and not use > it during testing.

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Simon Riggs
On 27 August 2016 at 07:36, Amit Kapila wrote: > On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggs wrote: >> >> I think you should add this as part of the default testing for both >> check and installcheck. I can't imagine why we'd have it and not use

Re: [HACKERS] WAL consistency check facility

2016-08-26 Thread Amit Kapila
On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggs wrote: > > I think you should add this as part of the default testing for both > check and installcheck. I can't imagine why we'd have it and not use > it during testing. > The actual consistency checks are done during redo

Re: [HACKERS] WAL consistency check facility

2016-08-26 Thread Simon Riggs
Hi Kuntal, Thanks for the patch. The current patch has no docs, no tests and no explanatory comments, which makes review quite hard. The good news is you might discover a few bugs with it, so it's worth pursuing actively in this CF, though it's not near to being committable. I think you should add this

Re: [HACKERS] WAL consistency check facility

2016-08-26 Thread Alvaro Herrera
Kuntal Ghosh wrote: > Thanks a lot. > > I just want to mention the situation where I was getting the > speculative token related inconsistency. > > ItemPointer in backup page from master: > LOG: ItemPointer BlockNumber: 1 OffsetNumber:65534 Speculative: true > CONTEXT: xlog redo at 0/127F4A48

Re: [HACKERS] WAL consistency check facility

2016-08-26 Thread Kuntal Ghosh
Thanks a lot. I just want to mention the situation where I was getting the speculative token related inconsistency. ItemPointer in backup page from master: LOG: ItemPointer BlockNumber: 1 OffsetNumber:65534 Speculative: true CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

Re: [HACKERS] WAL consistency check facility

2016-08-25 Thread Alvaro Herrera
Kuntal Ghosh wrote: > 4. For Speculative Heap tuple insert operation, there was > inconsistency in t_ctid value. So, I've modified the t_ctid value (in > backup page) to current block number and offset number. Need > suggestions!! In speculative insertions, t_ctid is used to store the

Re: [HACKERS] WAL consistency check facility

2016-08-25 Thread Kuntal Ghosh
Hi, I've added the feature in CP app. Following are the testing details: 1. In master, I've enabled following configurations: * wal_level = replica * max_wal_senders = 3 * wal_keep_segments = 4000 * hot_standby = on * wal_consistency_mask = 511 /* Enable consistency check mask bit*/ 2. In

Re: [HACKERS] WAL consistency check facility

2016-08-24 Thread Simon Riggs
On 22 August 2016 at 16:56, Simon Riggs wrote: > On 22 August 2016 at 13:44, Kuntal Ghosh wrote: > >> Please let me know your thoughts on this. > > Do the regression tests pass with this option enabled? Hi, I'd like to be a reviewer on this.

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Amit Kapila
On Tue, Aug 23, 2016 at 10:57 AM, Michael Paquier wrote: > > Also, what's the use case of allowing only a certain set of rmgrs to > be checked. Wouldn't a simple on/off switch be simpler? > I think there should be a way to test WAL for one particular resource manager.

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Michael Paquier
On Tue, Aug 23, 2016 at 1:32 PM, Amit Kapila wrote: > On Mon, Aug 22, 2016 at 9:16 PM, Robert Haas wrote: >> On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier >> wrote: >>> Another pin-point is: given a certain page, how

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Kuntal Ghosh
Yes, I've verified the outputs and log contents after running gmake installcheck and gmake installcheck-world. The status of the test was marked as pass for all the testcases. On Mon, Aug 22, 2016 at 9:26 PM, Simon Riggs wrote: > On 22 August 2016 at 13:44, Kuntal Ghosh

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Amit Kapila
On Mon, Aug 22, 2016 at 9:16 PM, Robert Haas wrote: > On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier > wrote: >> Another pin-point is: given a certain page, how do we identify of >> which type it is? One possibility would be again to extend the

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Simon Riggs
On 22 August 2016 at 13:44, Kuntal Ghosh wrote: > Please let me know your thoughts on this. Do the regression tests pass with this option enabled? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training &

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Robert Haas
On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier wrote: > Another pin-point is: given a certain page, how do we identify of > which type it is? One possibility would be again to extend the AM > handler with some kind of is_self function with a prototype like that: >

Re: [HACKERS] WAL consistency check facility

2016-08-22 Thread Michael Paquier
On Mon, Aug 22, 2016 at 9:44 PM, Kuntal Ghosh wrote: > Please let me know your thoughts on this. Since custom AMs have been introduced, I have kept that in a corner of my mind and thought about it a bit. And while the goal of this patch is clearly worth it, I don't

[HACKERS] WAL consistency check facility

2016-08-22 Thread Kuntal Ghosh
Hi, I've attached a patch to check whether the current page is equal to the FPW after applying WAL to it. This is how the patch works: 1. When a WAL record is inserted, an FPW is done for that operation. But, a flag is kept to indicate whether that page needs to be restored. 2. During recovery,
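
The comparison at the heart of this idea can be sketched as a byte-wise check of the replayed page against the full-page image. This is only an illustration under stated assumptions: `first_mismatch` is a hypothetical helper, pages are modelled as plain byte strings of the default 8192-byte block size, and no masking of hint bits or unused space is attempted.

```python
BLCKSZ = 8192  # default PostgreSQL block size


def first_mismatch(replayed, backup):
    """Return the first byte offset where the two pages differ, or None."""
    if len(replayed) != len(backup):
        raise ValueError("pages must have the same length")
    for offset, (a, b) in enumerate(zip(replayed, backup)):
        if a != b:
            return offset
    return None


# Build two pages that differ in a single byte near the end of the page.
backup_page = bytes(BLCKSZ)
replayed = bytearray(backup_page)
replayed[8166] = 0x01

print(first_mismatch(bytes(replayed), backup_page))
print(first_mismatch(backup_page, backup_page))
```

Reporting the first differing offset is the kind of information a warning of the form "Inconsistent page (at byte N)" would need.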
