On Tue, Feb 1, 2022 at 9:39 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
> I found that CreateRestartPoint() already reported the redo lsn as follows 
> after emitting the restartpoint log message. To avoid duplicated logging of 
> the same information, we should update this code?
>
>         ereport((log_checkpoints ? LOG : DEBUG2),
>                         (errmsg("recovery restart point at %X/%X",
>                                         LSN_FORMAT_ARGS(lastCheckPoint.redo)),
>                          xtime ? errdetail("Last completed transaction was at 
> log time %s.",
>                                                            
> timestamptz_to_str(xtime)) : 0));
>
> This code reports lastCheckPoint.redo as redo lsn. OTOH, with the patch, 
> LogCheckpointEnd() reports ControlFile->checkPointCopy.redo. They may be 
> different, for example, when the current DB state is not 
> DB_IN_ARCHIVE_RECOVERY. In this case, which lsn should we report as redo lsn?

Do we ever reach CreateRestartPoint when ControlFile->stat !=
DB_IN_ARCHIVE_RECOVERY? Assert(ControlFile->state ==
DB_IN_ARCHIVE_RECOVERY); in CreateRestartPoint doesn't fail any
regression tests.

Here's what can happen:
lastCheckPoint.redo is 100 and ControlFile->checkPointCopy.redo is
105, so, "skipping restartpoint, already performed at %X/%X"
LogCheckpointEnd isn't reached
lastCheckPoint.redo is 105 and ControlFile->checkPointCopy.redo is 100
and ControlFile->state == DB_IN_ARCHIVE_RECOVERY, then the control
file gets updated and LogCheckpointEnd prints the right redo lsn.
lastCheckPoint.redo is 105 and ControlFile->checkPointCopy.redo is 100
and ControlFile->state != DB_IN_ARCHIVE_RECOVERY, the the control file
doesn't get updated and LogCheckpointEnd just prints the control redo
lsn. Looks like this case is rare given Assert(ControlFile->state ==
DB_IN_ARCHIVE_RECOVERY); doesn't fail any tests.

I think we should just let LogCheckpointEnd print the redo lsn from
the control file. We can just remove the above errmsg("recovery
restart point at %X/%X" message altogether or just print it only in
the rare scenario, something like below:

if (ControlFile->state != DB_IN_ARCHIVE_RECOVERY)
{
    ereport((log_checkpoints ? LOG : DEBUG2),
            (errmsg("performed recovery restart point at %X/%X while
the database state is %s",
                    LSN_FORMAT_ARGS(lastCheckPoint.redo)),
getDBState(ControlFile->state)));
}

And the last commit/abort records's timestamp will always get logged
even before we reach here in the main redo loop (errmsg("last
completed transaction was at log time %s").

Or another way is to just pass the redo lsn to LogCheckpointEnd and
pass the lastCheckPoint.redo in if (ControlFile->state !=
DB_IN_ARCHIVE_RECOVERY) cases or when control file isn't updated but
restart point happened.

Thoughts?

Regards,
Bharath Rupireddy.


Reply via email to