Re: Amcheck: do rightlink verification with lock coupling

2020-08-09 Thread Andrey M. Borodin



> 8 авг. 2020 г., в 23:14, Peter Geoghegan  написал(а):
> 
> I pushed a cleaned up version of this patch just now. I added some
> commentary about this canonical example in header comments for the new
> function.

Thanks for working on this!

Best regards, Andrey Borodin.



Re: Amcheck: do rightlink verification with lock coupling

2020-08-08 Thread Peter Geoghegan
On Thu, Aug 6, 2020 at 10:59 PM Andrey M. Borodin  wrote:
> But having complete solution with no false positives seems much better.

Agreed. I know that you didn't pursue this for no reason -- having the
check available makes bt_check_index() a lot more valuable in
practice. It detects what is actually a classic example of subtle
B-Tree corruption (left link corruption), which appears in Modern
B-Tree techniques in its discussion of corruption detection. It's
actually the canonical example of how B-Tree corruption can be very
subtle in the real world.

I pushed a cleaned up version of this patch just now. I added some
commentary about this canonical example in header comments for the new
function.

Thanks
--
Peter Geoghegan




Re: Amcheck: do rightlink verification with lock coupling

2020-08-07 Thread Andrey M. Borodin



> 6 авг. 2020 г., в 21:38, Peter Geoghegan  написал(а):
> 
> On Wed, Aug 5, 2020 at 9:50 PM Andrey M. Borodin  wrote:
>> Sounds great! Thanks!
> 
> I'm afraid that there is another problem, this time with
> btree_xlog_split(). It's possible to get false positives when running
> the new test continually on a standby. You can see this by running
> verification on a standby continually, while the primary runs with a
> workload that gets many page splits.
Yes, I see the problem...
> 
> The only thing that we can do is adjust the locking in
> btree_xlog_split() to match the primary (kind of like commit 9a9db08a,
> except with page splits instead of page deletion). Attached is a
> revised version of the patch, along with the changes that we'd need to
> REDO to make the amcheck patch really work.
> 
> I'm not sure if this change to the REDO routine is worth the overhead
> or trouble, though. I have to think about it some more.
If we want to check relations between pages we must either apply them together 
(under locks) or tolerate some fraction of false positives. I understand that 
mitigating and tolerating false positives is nonsense in mathematica sense, but 
from practical point of view it's just OK.

But having complete solution with no false positives seems much better.

> 
> BTW, the first patch in the series now has a new check for page
> deletion -- that was missing from v4.
Yes, seems like that was a bug..

Thanks!

Best regards, Andrey Borodin.





Re: Amcheck: do rightlink verification with lock coupling

2020-08-06 Thread Peter Geoghegan
On Wed, Aug 5, 2020 at 9:50 PM Andrey M. Borodin  wrote:
> Sounds great! Thanks!

I'm afraid that there is another problem, this time with
btree_xlog_split(). It's possible to get false positives when running
the new test continually on a standby. You can see this by running
verification on a standby continually, while the primary runs with a
workload that gets many page splits.

It's easy to see if you apply this patch:

--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -435,6 +435,9 @@ btree_xlog_split(bool newitemonleft,
XLogReaderState *record)
  UnlockReleaseBuffer(lbuf);
  UnlockReleaseBuffer(rbuf);

+ /* trick */
+ pg_usleep(10 * 1000L);
+

The only thing that we can do is adjust the locking in
btree_xlog_split() to match the primary (kind of like commit 9a9db08a,
except with page splits instead of page deletion). Attached is a
revised version of the patch, along with the changes that we'd need to
REDO to make the amcheck patch really work.

I'm not sure if this change to the REDO routine is worth the overhead
or trouble, though. I have to think about it some more.

BTW, the first patch in the series now has a new check for page
deletion -- that was missing from v4.

-- 
Peter Geoghegan


v5-0002-Adjust-btree_xlog_split-locking.patch
Description: Binary data


v5-0001-Add-amcheck-sibling-link-checks.patch
Description: Binary data


Re: Amcheck: do rightlink verification with lock coupling

2020-08-05 Thread Andrey M. Borodin



> 6 авг. 2020 г., в 04:25, Peter Geoghegan  написал(а):
> 
> * Added _bt_checkpage() calls for buffers, as is standard practice in nbtree.
> 
> * Added protection against locking the same page a second time in the
> event of a faulty sibling link -- we should avoid a self-deadlock in
> the event of a page that is corrupt in just the wrong way.
> 
> * Updated obsolescent comments that claimed that we never couple
> buffer locks in amcheck.
Cool, thanks!
There's mintor typo: missing space in "of_bt_check_unique".

> 
> I would like to commit something like this in the next day or two.
> 
> Thoughts?

Sounds great! Thanks!

Best regards, Andrey Borodin.





Re: Amcheck: do rightlink verification with lock coupling

2020-08-05 Thread Peter Geoghegan
On Tue, Aug 4, 2020 at 9:33 AM Andrey M. Borodin  wrote:
> BTW, reviewing this patch again I cannot understand why we verify link 
> coherence only on leaf level but not for internal pages?
> I do not see any actual problems here.

Well, I thought that it might be a good idea to limit it to the leaf
level, based on the theory that we rarely couple locks on internal
page levels in general. But yeah, that's probably not a good enough
reason to avoid lock coupling on internal pages. It's probably better
to do it everywhere than to explain why we don't do it on the internal
level -- the explanation will probably be confusing. And even if there
was a performance issue, it could only happen when there are
concurrent internal page splits -- but those are supposed to be rare.

Attached is v4, which now checks internal pages (just like leaf
pages). The main other change in this revised version is that we make
the error raised by bt_index_check() match the error used in the old
bt_index_parent_check() case -- we always want to blame the current
target page when amcheck complains (technically the page we blame when
the invariant fails isn't strictly guaranteed to be quite the same
thing as the target, but it's close enough to not really matter in
reality). Other adjustments:

* Added _bt_checkpage() calls for buffers, as is standard practice in nbtree.

* Added protection against locking the same page a second time in the
event of a faulty sibling link -- we should avoid a self-deadlock in
the event of a page that is corrupt in just the wrong way.

* Updated obsolescent comments that claimed that we never couple
buffer locks in amcheck.

I would like to commit something like this in the next day or two.

Thoughts?

--
Peter Geoghegan


v4-0001-Add-amcheck-sibling-link-checks.patch
Description: Binary data


Re: Amcheck: do rightlink verification with lock coupling

2020-08-04 Thread Andrey M. Borodin


> 4 авг. 2020 г., в 03:44, Peter Geoghegan  написал(а):
> 
> On Mon, Jul 20, 2020 at 11:46 AM Andrey M. Borodin  
> wrote:
>> In this thread [0] we decided that lock coupling is necessary for 
>> btree_xlog_unlink_page().
>> So, maybe let's revive this patch?
> 
> Yes, let's do that. Can you resubmit it, please?

PFA v3.
Changes: fixed few typos in comments.

BTW, reviewing this patch again I cannot understand why we verify link 
coherence only on leaf level but not for internal pages?
I do not see any actual problems here.

Corruption detection power of leftlink-rightlinks on internal pages is 
diminishingly small compared to leaf pages. But there seems to be no reason to 
exclude internal pages?

Thanks!

Best regards, Andrey Borodin.


v3-0001-In-amcheck-nbtree-do-rightlink-verification-with-.patch
Description: Binary data


Re: Amcheck: do rightlink verification with lock coupling

2020-08-03 Thread Peter Geoghegan
On Mon, Jul 20, 2020 at 11:46 AM Andrey M. Borodin  wrote:
> In this thread [0] we decided that lock coupling is necessary for 
> btree_xlog_unlink_page().
> So, maybe let's revive this patch?

Yes, let's do that. Can you resubmit it, please?


Peter Geoghegan




Re: Amcheck: do rightlink verification with lock coupling

2020-07-20 Thread Andrey M. Borodin
Hi!

> 23 янв. 2020 г., в 00:59, Peter Geoghegan  написал(а):
> 
> On Fri, Jan 17, 2020 at 5:43 PM Peter Geoghegan  wrote:
>> I tried to come up with a specific example of how this could be
>> unsafe, but my explanation was all over the place (this could have had
>> something to do with it being Friday evening). Even still, it's up to
>> the patch to justify why it's safe, and that seems even more
>> difficult.
> 
> I can't see a way around this problem, so I'm marking the patch rejected.

In this thread [0] we decided that lock coupling is necessary for 
btree_xlog_unlink_page().
So, maybe let's revive this patch?

Thanks!

Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/flat/CAH2-Wzm7T_O%2BVUeo7%3D_NGPncs08z3JEybEwVLZGaASnbfg5vDA%40mail.gmail.com#a4ef597251fed0eb5c2896937bdbd0cc



Re: Amcheck: do rightlink verification with lock coupling

2020-01-22 Thread Peter Geoghegan
On Fri, Jan 17, 2020 at 5:43 PM Peter Geoghegan  wrote:
> I tried to come up with a specific example of how this could be
> unsafe, but my explanation was all over the place (this could have had
> something to do with it being Friday evening). Even still, it's up to
> the patch to justify why it's safe, and that seems even more
> difficult.

I can't see a way around this problem, so I'm marking the patch rejected.

Thanks
-- 
Peter Geoghegan




Re: Amcheck: do rightlink verification with lock coupling

2020-01-17 Thread Peter Geoghegan
On Thu, Jan 16, 2020 at 5:11 PM Peter Geoghegan  wrote:
> I find this argument convincing. I'll try to get this committed soon.
>
> While you could have used bt_index_parent_check() or heapallindexed to
> detect the issue, those two options are a lot more expensive (plus the
> former option won't work on a standby). Relaxing the principle that
> says that we shouldn't hold multiple buffer locks concurrently doesn't
> seem like that much to ask for to get such a clear benefit.

Having looked into it some more, I now have my doubts about this
patch. REDO routine code like btree_xlog_split() and
btree_xlog_unlink_page() feel entitled to only lock one page at a
time, which invalidates the assumption that we can couple locks on the
leaf level to verify mutual agreement in left and right sibling links
(with only an AccessShareLock on bt_index_check()'s target index
relation). It would definitely be safe for bt_index_check() to so this
were it not running in recovery mode, but that doesn't seem very
useful on its own.

I tried to come up with a specific example of how this could be
unsafe, but my explanation was all over the place (this could have had
something to do with it being Friday evening). Even still, it's up to
the patch to justify why it's safe, and that seems even more
difficult.

--
Peter Geoghegan




Re: Amcheck: do rightlink verification with lock coupling

2020-01-16 Thread Peter Geoghegan
On Mon, Jan 13, 2020 at 8:47 PM Andrey Borodin  wrote:
> > 11 янв. 2020 г., в 7:49, Peter Geoghegan  написал(а):
> > I'm curious why Andrey's corruption problems were not detected by the
> > cross-page amcheck test, though. We compare the first non-pivot tuple
> > on the right sibling leaf page with the last one on the target page,
> > towards the end of bt_target_page_check() -- isn't that almost as good
> > as what you have here in practice? I probably would have added
> > something like this myself earlier, if I had reason to think that
> > verification would be a lot more effective that way.
>
> We were dealing with corruption caused by lost page update. Consider two 
> pages:
> A->B
> If A is split into A` and C we get:
> A`->C->B
> But if update of A is lost we still have
> A->B, but B backward pointers points to C.
> B's smallest key is bigger than hikey of A`, but this do not violate
> cross-pages invariant.
>
> Page updates may be lost due to bug in backup software with incremental
> backups, bug in storage layer of Aurora-style system, bug in page cache, 
> incorrect
> fsync error handling, bug in ssd firmware etc. And our data checksums do not
> detect this kind of corruption. BTW I think that it would be better if our
> checksums were not stored on a page itseft, they could detect this kind of 
> faults.

I find this argument convincing. I'll try to get this committed soon.

While you could have used bt_index_parent_check() or heapallindexed to
detect the issue, those two options are a lot more expensive (plus the
former option won't work on a standby). Relaxing the principle that
says that we shouldn't hold multiple buffer locks concurrently doesn't
seem like that much to ask for to get such a clear benefit.

I think that this is safe, but page deletion/half-dead pages need more
thought. In general, the target page could have become "ignorable"
when rechecked.

> We were able to timely detect all those problems on primaries in our testing
> environment. But much later found out that some standbys were corrupted,
> the problem appeared only when they were promoted.
> Also, in nearby thread Grygory Rylov (g0djan) is trying to enable one more
> invariant in standby checks.

I looked at that thread just now, but Grygory didn't say why this true
root check was particularly important, so I can't see much upside.
Plus that seems riskier than what you have in mind here.

Does it have something to do with the true root looking like a deleted
page? The details matter.

-- 
Peter Geoghegan




Re: Amcheck: do rightlink verification with lock coupling

2020-01-16 Thread Andrey Borodin



> 14 янв. 2020 г., в 9:47, Andrey Borodin  написал(а):
> 
> Page updates may be lost due to bug in backup software with incremental 
> backups, bug in storage layer of Aurora-style system, bug in page cache, 
> incorrect
> fsync error handling, bug in ssd firmware etc. And our data checksums do not
> detect this kind of corruption. BTW I think that it would be better if our
> checksums were not stored on a page itseft, they could detect this kind of 
> faults.

Observed it just now.
There is one HA cluster where a node was marked dead. This node was 
disconnected from cluster, but due to human error there was postgres running.
Node managed to install block-level incremental backup to the chain. And backup 
software did not detect that backup step was taken from part of timeline that 
was not in actual timeline's history.
Result of restoration is:

man-w%/%db R # select bt_index_check('%.pk_%');
 bt_index_check 

 
(1 row)

Time: 1411.065 ms (00:01.411)
man-w%/%db R # select patched_index_check('%.pk_%');
ERROR:  XX002: left link/right link pair in index "pk_labels" not in agreement
DETAIL:  Block=42705 left block=42707 left link from block=45495.
LOCATION:  bt_recheck_block_rightlink, verify_nbtree.c:621
Time: 671.336 ms

('%' is replacing removed chars)

I understand that this corruption was not introduced by postgres itself, but by 
combination of bug in two 3rd party tools and human error.
But I can imagine similar corruptions with different root causes.

Best regards, Andrey Borodin.



Re: Amcheck: do rightlink verification with lock coupling

2020-01-13 Thread Andrey Borodin
Hi Peter! Sorry for answering so long.

> 11 янв. 2020 г., в 7:49, Peter Geoghegan  написал(а):
> 
> I'm curious why Andrey's corruption problems were not detected by the
> cross-page amcheck test, though. We compare the first non-pivot tuple
> on the right sibling leaf page with the last one on the target page,
> towards the end of bt_target_page_check() -- isn't that almost as good
> as what you have here in practice? I probably would have added
> something like this myself earlier, if I had reason to think that
> verification would be a lot more effective that way.

We were dealing with corruption caused by lost page update. Consider two pages:
A->B
If A is split into A` and C we get:
A`->C->B
But if update of A is lost we still have
A->B, but B backward pointers points to C.
B's smallest key is bigger than hikey of A`, but this do not violate 
cross-pages invariant.

Page updates may be lost due to bug in backup software with incremental 
backups, bug in storage layer of Aurora-style system, bug in page cache, 
incorrect
fsync error handling, bug in ssd firmware etc. And our data checksums do not
detect this kind of corruption. BTW I think that it would be better if our
checksums were not stored on a page itseft, they could detect this kind of 
faults.

We were able to timely detect all those problems on primaries in our testing
environment. But much later found out that some standbys were corrupted,
the problem appeared only when they were promoted.
Also, in nearby thread Grygory Rylov (g0djan) is trying to enable one more
invariant in standby checks.


Thanks!

Best regards, Andrey Borodin.



Re: Amcheck: do rightlink verification with lock coupling

2020-01-13 Thread Tomas Vondra

On Mon, Jan 13, 2020 at 03:49:40PM -0800, Peter Geoghegan wrote:

On Sat, Jan 11, 2020 at 4:25 AM Tomas Vondra
 wrote:

Understood. Is that a reason to not commit of this patch now, though?


It could use some polishing. Are you interested in committing it?



Not really - as a CFM I was trying to revive patches that seem in good
shape but not moving.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Amcheck: do rightlink verification with lock coupling

2020-01-13 Thread Peter Geoghegan
On Sat, Jan 11, 2020 at 4:25 AM Tomas Vondra
 wrote:
> Understood. Is that a reason to not commit of this patch now, though?

It could use some polishing. Are you interested in committing it?

-- 
Peter Geoghegan




Re: Amcheck: do rightlink verification with lock coupling

2020-01-11 Thread Tomas Vondra

On Fri, Jan 10, 2020 at 06:49:33PM -0800, Peter Geoghegan wrote:

On Fri, Jan 10, 2020 at 5:45 PM Tomas Vondra
 wrote:

Peter, any opinion on this proposed amcheck patch? In the other thread
[1] you seemed to agree this is worth checking, and Alvaro's proposal to
make this check optional seems like a reasonable compromise with respect
to the locking.


It's a good idea, and it probably doesn't even need to be made
optional -- lock coupling to the right is safe on a primary, and
should also be safe on standbys (though I should triple check the REDO
routines to be sure). The patch only does lock coupling when it proves
necessary, which ought to only happen when there is a concurrent page
split, which ought to be infrequent. Maybe there is no need to
compromise.



OK, that makes sense.


I'm curious why Andrey's corruption problems were not detected by the
cross-page amcheck test, though. We compare the first non-pivot tuple
on the right sibling leaf page with the last one on the target page,
towards the end of bt_target_page_check() -- isn't that almost as good
as what you have here in practice? I probably would have added
something like this myself earlier, if I had reason to think that
verification would be a lot more effective that way.

To be clear, I believe that Andrey wrote this patch for a reason -- I
assume that it makes a noticeable and consistent difference. I would
like to gain a better understanding of why that was for my own
benefit, though. For example, it might be that page deletion was a
factor that made the test I mentioned less effective. I care about the
specifics.



Understood. Is that a reason to not commit of this patch now, though?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Amcheck: do rightlink verification with lock coupling

2020-01-10 Thread Tomas Vondra

On Thu, Sep 12, 2019 at 10:16:12AM -0300, Alvaro Herrera wrote:

On 2019-Sep-12, Andrey Borodin wrote:


This patch violates one of amcheck design principles: current code
does not ever take more than one page lock. I do not know: should we
hold this rule or should we use more deep check?


The check does seem worthwhile to me.

As far as I know, in btree you can lock the right sibling of a page
while keeping lock on the page itself, without risk of deadlock.  So I'm
not sure what's the issue with that.  This is in the README:

 In most cases we release our lock and pin on a page before attempting
 to acquire pin and lock on the page we are moving to.  In a few places
 it is necessary to lock the next page before releasing the current one.
 This is safe when moving right or up, but not when moving left or down
 (else we'd create the possibility of deadlocks).

I suppose Peter was concerned about being able to run amcheck without
causing any trouble at all for concurrent operation; maybe we can retain
that property by making this check optional.



Peter, any opinion on this proposed amcheck patch? In the other thread
[1] you seemed to agree this is worth checking, and Alvaro's proposal to
make this check optional seems like a reasonable compromise with respect
to the locking.

[1] 
https://www.postgresql.org/message-id/flat/da9b33ac-53cb-4643-96d4-7a0bbc037...@yandex-team.ru

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Amcheck: do rightlink verification with lock coupling

2019-09-12 Thread Alvaro Herrera
On 2019-Sep-12, Andrey Borodin wrote:

> This patch violates one of amcheck design principles: current code
> does not ever take more than one page lock. I do not know: should we
> hold this rule or should we use more deep check?

The check does seem worthwhile to me.

As far as I know, in btree you can lock the right sibling of a page
while keeping lock on the page itself, without risk of deadlock.  So I'm
not sure what's the issue with that.  This is in the README:

  In most cases we release our lock and pin on a page before attempting
  to acquire pin and lock on the page we are moving to.  In a few places
  it is necessary to lock the next page before releasing the current one.
  This is safe when moving right or up, but not when moving left or down
  (else we'd create the possibility of deadlocks).

I suppose Peter was concerned about being able to run amcheck without
causing any trouble at all for concurrent operation; maybe we can retain
that property by making this check optional.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Amcheck: do rightlink verification with lock coupling

2019-09-12 Thread Andrey Borodin
Hi!

This is a thread to discuss amcheck feature started in other thread[0].

Currently amcheck is scanning every B-tree level. If verification is done with 
ShareLock - amcheck will test that each page leftlink is pointing to page with 
rightlink backwards.
This is important invariant, in our experience it proved to be good at 
detecting various corruptions.
But this invariant can be detected only if both pages are not modified (e.g. 
split concurrently).

PFA patch, that in case of suspicion locks two pages and retests that 
invariant. This allows detection of several corruptions on standby.

This patch violates one of amcheck design principles: current code does not 
ever take more than one page lock. I do not know: should we hold this rule or 
should we use more deep check?

Thanks!

Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/flat/da9b33ac-53cb-4643-96d4-7a0bbc037...@yandex-team.ru


v2-0001-In-amcheck-nbtree-do-rightlink-verification-with-loc.patch
Description: Binary data