Hi,

I noticed an assumption [1] at WALRead() call sites that the flushed WAL page is zero-padded after the flush LSN. I think this can't always be true, because more WAL can get flushed between the time the caller determines the flush LSN and the time it reads the page from the WAL file using WALRead(). I've hacked the code up a bit to check whether that's the case - https://github.com/BRupireddy2/postgres/tree/ensure_extra_read_WAL_page_is_zero_padded_at_the_end_WIP - and the tests hit the Assert(false) I added. That means the zero-padding comment around the WALRead() call sites isn't quite right.
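To make the race concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: SimWalFile, sim_flush(), sim_read() and SIM_BLCKSZ are hypothetical stand-ins for a page in a WAL file, a concurrent flush, WALRead() and XLOG_BLCKSZ. It only models the ordering of events described above, under the assumption that a flush simply appends non-zero bytes to the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_BLCKSZ 8192			/* hypothetical stand-in for XLOG_BLCKSZ */

/* Hypothetical model of one WAL page as it exists in the WAL file. */
typedef struct SimWalFile
{
	unsigned char page[SIM_BLCKSZ];
	size_t		flushed;		/* bytes flushed so far within this page */
} SimWalFile;

void
sim_init(SimWalFile *f)
{
	memset(f->page, 0, SIM_BLCKSZ);
	f->flushed = 0;
}

/* Models some backend flushing 'len' more bytes of (non-zero) WAL. */
void
sim_flush(SimWalFile *f, size_t len)
{
	assert(f->flushed + len <= SIM_BLCKSZ);
	memset(f->page + f->flushed, 0xAB, len);
	f->flushed += len;
}

/* Models a read of 'count' bytes from the start of the page. */
void
sim_read(const SimWalFile *f, unsigned char *buf, size_t count)
{
	assert(count <= SIM_BLCKSZ);
	memcpy(buf, f->page, count);
}

/* True if buf[from..to) contains only zero bytes. */
bool
range_is_zero(const unsigned char *buf, size_t from, size_t to)
{
	for (size_t i = from; i < to; i++)
	{
		if (buf[i] != 0)
			return false;
	}
	return true;
}

/* Reader determines 'count', a flush sneaks in, reader reads a whole page. */
bool
whole_page_read_sees_nonzero_tail(void)
{
	SimWalFile	f;
	unsigned char buf[SIM_BLCKSZ];

	sim_init(&f);
	sim_flush(&f, 100);
	size_t		count = f.flushed;	/* reader determines the flush LSN */

	sim_flush(&f, 50);				/* more WAL is flushed before the read */
	sim_read(&f, buf, SIM_BLCKSZ);	/* reader reads the whole page anyway */

	return !range_is_zero(buf, count, SIM_BLCKSZ);
}

/* Same ordering, but the reader asks only for the 'count' bytes it needs. */
bool
exact_read_touches_only_valid_bytes(void)
{
	SimWalFile	f;
	unsigned char buf[SIM_BLCKSZ];

	memset(buf, 0, SIM_BLCKSZ);
	sim_init(&f);
	sim_flush(&f, 100);
	size_t		count = f.flushed;

	sim_flush(&f, 50);
	sim_read(&f, buf, count);		/* read exactly what was determined valid */

	return range_is_zero(buf, count, SIM_BLCKSZ);
}
```

In this model the whole-page read picks up the 50 bytes flushed after 'count' was determined, so the tail is not all zeros, while the exact-length read never looks past 'count' and has nothing to assume about the tail.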
I'm wondering why the WALRead() callers always read XLOG_BLCKSZ bytes despite knowing exactly how much they need to read. Is it to tell the OS explicitly to fetch the whole page from disk? If so, the OS will do that anyway, because transfers from disk to the OS page cache always happen in units of disk blocks, no?

Although there's no immediate problem with this right now, the assumption is going to be incorrect when reading WAL from WAL buffers using WALReadFromBuffers - https://www.postgresql.org/message-id/CALj2ACV=C1GZT9XQRm4iN1NV1T=hla_hsgwnx2y5-g+mswd...@mail.gmail.com.

If there's no reason to read the whole page, can the WALRead() callers just read as much as they need, like walsender does for physical replication? Attached is a patch for the change.

Thoughts?

[1]
    /*
     * Even though we just determined how much of the page can be validly read
     * as 'count', read the whole page anyway. It's guaranteed to be
     * zero-padded up to the page boundary if it's incomplete.
     */
    if (!WALRead(state, cur_page, targetPagePtr, XLOG_BLCKSZ, tli, &errinfo))

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
v1-0001-Do-away-with-zero-padding-assumption-before-WALRe.patch