On Wed, Oct 13, 2021 at 2:01 AM Amul Sul <sula...@gmail.com> wrote:
> Instead of abortedRecPtr point, isn't enough to write
> overwrite-contrecord at XLogCtl->lastReplayedEndRecPtr? I think both
> are pointing to the same location then can't we use
> lastReplayedEndRecPtr instead of abortedRecPtr to write
> overwrite-contrecord and remove need of extra global variable, like
> attached?

I think you mean missingContrecPtr, not abortedRecPtr. If I understand
correctly, abortedRecPtr is going to be the location in some WAL
segment which we replayed where a long record began, but
missingContrecPtr seems like it would have to point to the beginning
of the first segment we were unable to find to continue replay; and
thus it ought to be the same as lastReplayedEndRecPtr. But the
committed code doesn't seem to check that these are the same or verify
the relationship between them in any way, so I'm worried there is some
other case here. The comments in XLogReadRecord also suggest this:

     * We get here when a record that spans multiple pages needs to be
     * assembled, but something went wrong -- perhaps a contrecord piece
     * was lost.  If caller is WAL replay, it will know where the aborted

Saying that "perhaps" a contrecord piece was lost seems to imply that
other explanations are possible as well, but I'm not sure what.

-- 
Robert Haas
EDB: http://www.enterprisedb.com