Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

Kyotaro Horiguchi Tue, 08 Feb 2022 00:40:52 -0800

Mmm.. checkpoint and checkpointer are quite confusing in this context..

At Tue, 08 Feb 2022 16:58:22 +0900 (JST), Kyotaro Horiguchi 
<[email protected]> wrote in 
> At Mon, 7 Feb 2022 13:51:31 +0900, Fujii Masao <[email protected]> 
> wrote in 
> > 
> > 
> > On 2022/02/07 12:02, Kyotaro Horiguchi wrote:
> > > - If any later checkpoint/restartpoint has been established, just skip
> > >    remaining task then return false. (!chkpt_was_latest)
> > >    (I'm not sure this can happen, though.)
> > > - we update control file only when archive recovery is still ongoing.
> > 
> > This comment seems to conflict with what your PoC patch does. Because
> > with the patch, ControlFile->checkPoint and
> > ControlFile->checkPointCopy seem to be updated even when
> > ControlFile->state != DB_IN_ARCHIVE_RECOVERY.
> 
> Ah, yeah, by "update" I meant that "move forward". Sorry for confusing
> wording.


Please ignore the "that".

> > I agree with what your PoC patch does for now. When we're not in
> > archive recovery state, checkpoint and REDO locations in pg_control
> > should be updated but min recovery point should be reset to invalid
> > one (which instruments that subsequent crash recovery should replay
> > all available WAL files).
> 
> Yes.  All buffers before the last recovery point's end have been
> flushed out so the recovery point is valid as a checkpoint.  On the
> other hand minRecoveryPoint is no longer needed and actually is
> ignored at the next crash recovery.  We can leave it alone but it is
> consistent that it is cleared.
> 
> > > - Otherwise reset minRecoveryPoint then continue.
> > > Do you have any thoughts or opinions?
> > 
> > Regarding chkpt_was_latest, whether the state is
> > DB_IN_ARCHIVE_RECOVERY or not, if checkpoint and redo locations in
> > pg_control are updated, IMO we don't need to skip the "remaining
> > tasks". Since those locations are updated and subsequent crash
> > recovery will start from that redo location, for example, ISTM that we
> > can safely delete old WAL files prior to the redo location as the
> > "remaining tasks". Thought?
> 
> If I read you correctly, the PoC works that way. It updates pg_control
> if the restart point is latest then performs the remaining cleanup
> tasks in that case. Recovery state doesn't affect this process.
> 
> I reexamined about the possibility of concurrent checkpoints.
> 
> Both CreateCheckPoint and CreateRestartPoint are called from
> checkpointer loop, shutdown handler of checkpointer and standalone
> process.  So I can't see a possibility of concurrent checkpoints.
> 
> In the past we had a time when startup process called CreateCheckPoint

- directly in the crash recovery case where checkpoint is not running
- but since 7ff23c6d27 checkpoint is started before startup process
+ directly in the crash recovery case where checkpointer is not running
+ but since 7ff23c6d27 checkpointer is launched before startup process

> starts.  So I conclude that that cannot happen.
> 
> So the attached takes away the path for the case where the restart
> point is overtaken by a concurrent checkpoint.
> 
> Thus.. the attached removes the ambiguity of of the proposed patch
> about the LSNs in the restartpoint-ending log message.

Thoughts?

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

Reply via email to