On Fri, Jul 26, 2024 at 10:01 PM Robert Haas <[email protected]> wrote:
> Wait for WAL summarization to catch up before creating .partial file.
>
> When a standby is promoted, CleanupAfterArchiveRecovery() may decide
> to rename the final WAL file from the old timeline by adding ".partial"
> to the name. If WAL summarization is enabled and this file is renamed
> before its partial contents are summarized, WAL summarization breaks:
> the summarizer gets stuck at that point in the WAL stream and just
> errors out.
>
> To fix that, first make the startup process wait for WAL summarization
> to catch up before renaming the file. Generally, this should be quick,
> and if it's not, the user can shut off summarize_wal and try again.
> To make this fix work, also teach the WAL summarizer that after a
> promotion has occurred, no more WAL can appear on the previous
> timeline: previously, the WAL summarizer wouldn't switch to the new
> timeline until we actually started writing WAL there, but that meant
> that when the startup process was waiting for the WAL summarizer, it
> was waiting for an action that the summarizer wasn't yet prepared to
> take.
>
> In the process of fixing these bugs, I realized that the logic to wait
> for WAL summarization to catch up was spread out in a way that made
> it difficult to reuse properly, so this code refactors things to make
> it easier.
>
> Finally, add a test case that would have caught this bug and the
> previously-fixed bug that WAL summarization sometimes needs to back up
> when the timeline changes.

It appears that I was late with my review [1]. But the new tap test
could still use pgperltidy.

Links.
1. https://www.postgresql.org/message-id/CAPpHfduW3du0W%3D3noztdaJ6evGP9gqT1AGk_rwXrqDyus1zZoQ%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase
