> On 24 Dec 2024, at 01:46, Peter Geoghegan <[email protected]> wrote:
>
> On Wed, Nov 20, 2024 at 4:41 AM Andrey M. Borodin <[email protected]>
> wrote:
>>> On 15 Nov 2024, at 21:33, Peter Geoghegan <[email protected]> wrote:
>>> I propose this for the master branch only.
>>
>> The change seems correct to me: anyway cycle must be less than cycle of any
>> future vacuum after promotion.
>
> The cycles set in the page special area during page splits that happen
> to run while a VACUUM also runs must use that same VACUUM's cycle ID
> (which is stored in shared memory for the currently running VACUUM).
> That way the VACUUM will know when it must backtrack later on, to
> avoid missing index tuples that it is expected to remove.
>
> It doesn't matter if the cycle_id that VACUUM sees is less than or
> greater than its own one -- only that it matches its own one when it
> needs to match to get correct behavior from VACUUM. (Though it's also
> possible to get a false positive, in rare cases where we get unlucky
> and there's a collision. This might waste cycles within VACUUM, but
> it shouldn't lead to truly incorrect behavior.)

I've been thinking more about this. We always reset btpo_cycleid, even in redo
of a split.
This "btpo_cycleid = 0;" reset can break two scenarios that we do not currently
support, but might support in the future.
The reset relies on the idea that crash recovery will interrupt VACUUM. That is
not true in these cases:
1. We are dealing with a compute-storage separation system (e.g. Aurora or
Neon). We have no local filesystem; when we need to read a page, we get it from
a storage service that rebuilds pages from WAL. If we split a page during
VACUUM, evict it, and then read it back from the service, we will miss the
needed backtrack to the left page...
2. There's a tool for repairing pages with checksum violations: page repair.
AFAIK it can request a page from a standby, and if it does so in the middle of
a VACUUM, VACUUM's backtracking logic can get a false negative.
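
To make the concern concrete, here is a toy model of both scenarios. It is only
a sketch, not PostgreSQL source: ToyPage, toy_split(), toy_redo_split() and
toy_needs_backtrack() are hypothetical names, and the backtrack test is a
simplified version of the kind of check VACUUM does (it ignores page flags, for
example). It assumes a split whose new right sibling lands on a block VACUUM
has already scanned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: fields loosely mirror btpo_next / btpo_cycleid. */
typedef uint16_t CycleId;
typedef uint32_t BlockNo;

typedef struct ToyPage
{
    BlockNo next;       /* right sibling block number */
    CycleId cycleid;    /* cycle ID stamped by the last split, 0 if none */
    bool    rightmost;  /* true if the page has no right sibling */
} ToyPage;

/* Split on the primary: stamp the page with the running VACUUM's cycle ID. */
void
toy_split(ToyPage *page, BlockNo new_right, CycleId running_vacuum_cycleid)
{
    page->next = new_right;
    page->rightmost = false;
    page->cycleid = running_vacuum_cycleid;
}

/* Page image as rebuilt from WAL: redo resets the cycle ID to 0. */
ToyPage
toy_redo_split(ToyPage primary_image)
{
    primary_image.cycleid = 0;
    return primary_image;
}

/*
 * Simplified backtrack test: the page was split during this VACUUM and its
 * right sibling sits at a block that the scan has already passed.
 */
bool
toy_needs_backtrack(const ToyPage *page, BlockNo scanblkno, CycleId my_cycleid)
{
    return my_cycleid != 0 &&
           page->cycleid == my_cycleid &&
           !page->rightmost &&
           page->next < scanblkno;
}
```

With the page image produced by toy_split(), toy_needs_backtrack() returns true
for the VACUUM that was running during the split. With the image returned by
toy_redo_split(), it returns false for the same VACUUM, which is the false
negative I am worried about.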
Thanks!
Best regards, Andrey Borodin.