On Mon, Jan 13, 2020 at 8:47 PM Andrey Borodin <x4...@yandex-team.ru> wrote: > > 11 янв. 2020 г., в 7:49, Peter Geoghegan <p...@bowt.ie> написал(а): > > I'm curious why Andrey's corruption problems were not detected by the > > cross-page amcheck test, though. We compare the first non-pivot tuple > > on the right sibling leaf page with the last one on the target page, > > towards the end of bt_target_page_check() -- isn't that almost as good > > as what you have here in practice? I probably would have added > > something like this myself earlier, if I had reason to think that > > verification would be a lot more effective that way. > > We were dealing with corruption caused by lost page update. Consider two > pages: > A->B > If A is split into A` and C we get: > A`->C->B > But if update of A is lost we still have > A->B, but B backward pointers points to C. > B's smallest key is bigger than hikey of A`, but this do not violate > cross-pages invariant. > > Page updates may be lost due to bug in backup software with incremental > backups, bug in storage layer of Aurora-style system, bug in page cache, > incorrect > fsync error handling, bug in ssd firmware etc. And our data checksums do not > detect this kind of corruption. BTW I think that it would be better if our > checksums were not stored on a page itseft, they could detect this kind of > faults.
I find this argument convincing. I'll try to get this committed soon. While you could have used bt_index_parent_check() or heapallindexed to detect the issue, those two options are a lot more expensive (plus the former option won't work on a standby). Relaxing the principle that says that we shouldn't hold multiple buffer locks concurrently doesn't seem like that much to ask for to get such a clear benefit. I think that this is safe, but page deletion/half-dead pages need more thought. In general, the target page could have become "ignorable" when rechecked. > We were able to timely detect all those problems on primaries in our testing > environment. But much later found out that some standbys were corrupted, > the problem appeared only when they were promoted. > Also, in nearby thread Grygory Rylov (g0djan) is trying to enable one more > invariant in standby checks. I looked at that thread just now, but Grygory didn't say why this true root check was particularly important, so I can't see much upside. Plus that seems riskier than what you have in mind here. Does it have something to do with the true root looking like a deleted page? The details matter. -- Peter Geoghegan