On Thu, Jun 26, 2025 at 6:22 AM Michael Paquier <mich...@paquier.xyz> wrote:
>
> On Wed, Jun 25, 2025 at 10:19:55PM +0530, vignesh C wrote:
> > Currently, the logic attempts to read the complete WAL record based on
> > the size obtained before the crash, even though only a partial record
> > was written. It then checks the page header to determine whether the
> > XLP_FIRST_IS_OVERWRITE_CONTRECORD flag is set only after reading the
> > complete WAL record in the XLogDecodeNextRecord function, but since that
> > much WAL data was not available in the system we never get a chance to
> > check the header after this. To address this issue, a more robust
> > approach would be to first read the page header, check if the
> > XLP_FIRST_IS_OVERWRITE_CONTRECORD flag is set, and only then proceed
> > to read the WAL record size if the record is not marked as a partial
> > overwrite. This would prevent the system from waiting for WAL data
> > that will never arrive. Attached partial_wal_record_fix.patch
> > for this.

Yeah, this is a problem. At the moment I can't think of anything
better than reading the header first and checking the
XLP_FIRST_IS_OVERWRITE_CONTRECORD flag.

>
> So you are suggesting the addition of an extra ReadPageInternal() that
> forces a read of only the header, perform the checks on the header, then
> read the rest.  After reading SizeOfXLogShortPHD worth of data,
> shouldn't the checks on xlp_rem_len be done a bit earlier than what
> you are proposing in this patch?

I did not get this point. IMHO it has to be validated after the record
on the next page has been read.

--
Regards,
Dilip Kumar
Google
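
For illustration only, here is a minimal sketch of the header-first ordering discussed in this thread. It is not the attached patch and not PostgreSQL's xlogreader code: DemoPageHeader, DemoReadResult and demo_check_next_page() are hypothetical names, and the struct is a simplified stand-in for the real page header. Only the flag name and its value (0x0008, as in xlog_internal.h) are taken from PostgreSQL. The sketch does not settle where xlp_rem_len should be validated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag bit as defined in PostgreSQL's xlog_internal.h. */
#define XLP_FIRST_IS_OVERWRITE_CONTRECORD 0x0008

/* Hypothetical, simplified stand-in for the real WAL page header. */
typedef struct DemoPageHeader
{
    uint16_t xlp_info;      /* flag bits */
    uint32_t xlp_rem_len;   /* remaining bytes of a continued record */
} DemoPageHeader;

typedef enum DemoReadResult
{
    DEMO_READ_RECORD,        /* continuation looks usable, keep reading */
    DEMO_RECORD_OVERWRITTEN, /* partial record was abandoned, stop waiting */
    DEMO_NEED_MORE_DATA      /* not even the header is available yet */
} DemoReadResult;

/*
 * Inspect only the header of the next page before trying to read the
 * rest of a record that crosses the page boundary.  If the overwrite
 * flag is set, the caller should give up on the old record length
 * instead of waiting for WAL that will never arrive.
 */
static DemoReadResult
demo_check_next_page(const unsigned char *page, size_t avail)
{
    DemoPageHeader hdr;

    /* Step 1: make sure the header itself has been read. */
    if (avail < sizeof(DemoPageHeader))
        return DEMO_NEED_MORE_DATA;

    /* Copy out of the raw buffer to avoid alignment assumptions. */
    memcpy(&hdr, page, sizeof(hdr));

    /* Step 2: check the flag before trusting any record length. */
    if (hdr.xlp_info & XLP_FIRST_IS_OVERWRITE_CONTRECORD)
        return DEMO_RECORD_OVERWRITTEN;

    /* Step 3: only now would the caller read the remaining record data. */
    return DEMO_READ_RECORD;
}
```

A caller would invoke demo_check_next_page() right after fetching the first bytes of the page and request the remaining record bytes only when it returns DEMO_READ_RECORD.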