On Fri, 2022-11-11 at 12:43 +0200, Konstantin Knizhnik wrote:
> Yes, you are right: my original concerns that it may cause problems
> with 
> recovery at replica are not correct.

Great, thank you for following up.

> I also not sure that it can cause problems with checksums - page is 
> marked as dirty in any case.
> Yes, content and checksum of the page will be different at master and
> replica. It may be a problem for recovery pages from replica.

It could cause a theoretical problem: during recovery, the page could
be flushed out before minRecoveryPoint is updated, and while that's
happening, a crash could tear it. Then, a subsequent partial recovery
(e.g. PITR) could recover without fixing the torn page.

That would not be a practical problem, because it would require a tear
between two fields of the page header, which I don't think is possible.

> When it really be critical - is incremental backup (like
> pg_probackup) 
> whch looks at page LSN to determine moment of page modification.
> 
> And definitely it is critical for Neon, because LSN of page 
> reconstructed by pageserver is different from one expected by compute
> node.
> 
> May be I missing something, but I do not see any penalty from setting
> page LSN here.
> So even if there is not obvious scenario of failure, I still thing
> that 
> it should be fixed. Breaking invariants is always bad thing
> and there are should be very strong arguments for doing it. And I do
> not 
> see them here.

Agreed, thank you for the report!

Committed d6a3dbe14f and backpatched through version 11.

Also committed an unrelated cleanup patch in the same area (3eb8eeccbe)
and a README patch (97c61f70d1) to master. The README patch doesn't
guarantee that things won't change in the future, but the behavior it
describes has been true for quite a while now.


-- 
Jeff Davis
PostgreSQL Contributor Team - AWS




Reply via email to