Re: Add missing error codes to PANIC/FATAL error reports in xlogrecovery

Andres Freund Thu, 30 Nov 2023 11:48:01 -0800

Hi,

On 2023-11-30 10:54:12 -0800, Krishnakumar R wrote:
> diff --git a/src/backend/access/transam/xlogrecovery.c 
> b/src/backend/access/transam/xlogrecovery.c
> index c61566666a..2f50928e7e 100644
> --- a/src/backend/access/transam/xlogrecovery.c
> +++ b/src/backend/access/transam/xlogrecovery.c
> @@ -630,7 +630,8 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
> *wasShutdown_ptr,
>                               if (!ReadRecord(xlogprefetcher, LOG, false,
>                                                               
> checkPoint.ThisTimeLineID))
>                                       ereport(FATAL,
> -                                                     (errmsg("could not find 
> redo location referenced by checkpoint record"),
> +                                                     
> (errcode(ERRCODE_DATA_CORRUPTED),
> +                                                      errmsg("could not find 
> redo location referenced by checkpoint record"),
>                                                        errhint("If you are 
> restoring from a backup, touch \"%s/recovery.signal\" or 
> \"%s/standby.signal\" and add required recovery options.\n"
>                                                                        "If 
> you are not restoring from a backup, try removing the file 
> \"%s/backup_label\".\n"
>                                                                        "Be 
> careful: removing \"%s/backup_label\" will result in a corrupt cluster if 
> restoring from a backup.",


Wondering if we should add a ERRCODE_CLUSTER_CORRUPTED for cases like this. We
have ERRCODE_DATA_CORRUPTED and ERRCODE_INDEX_CORRUPTED, which make
ERRCODE_DATA_CORRUPTED feel a bit too specific in this kind of situation?

OTOH, just having anything other than ERRCODE_INTERNAL_ERROR is better.


> @@ -640,7 +641,8 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
> *wasShutdown_ptr,
>               else
>               {
>                       ereport(FATAL,
> -                                     (errmsg("could not locate required 
> checkpoint record"),
> +                                     (errcode(ERRCODE_DATA_CORRUPTED),
> +                                      errmsg("could not locate required 
> checkpoint record"),
>                                        errhint("If you are restoring from a 
> backup, touch \"%s/recovery.signal\" or \"%s/standby.signal\" and add 
> required recovery options.\n"
>                                                        "If you are not 
> restoring from a backup, try removing the file \"%s/backup_label\".\n"
>                                                        "Be careful: removing 
> \"%s/backup_label\" will result in a corrupt cluster if restoring from a 
> backup.",

Another aside: Isn't the hint here obsolete since we've removed exclusive
backups? I can't think of any scenario now where removing backup_label would
be correct in a non-exclusive backup.


> @@ -817,7 +820,8 @@ InitWalRecovery(ControlFileData *ControlFile, bool 
> *wasShutdown_ptr,
>                */
>               switchpoint = 
> tliSwitchPoint(ControlFile->checkPointCopy.ThisTimeLineID, expectedTLEs, 
> NULL);
>               ereport(FATAL,
> -                             (errmsg("requested timeline %u is not a child 
> of this server's history",
> +                             (errcode(ERRCODE_DATA_CORRUPTED),
> +                              errmsg("requested timeline %u is not a child 
> of this server's history",
>                                               recoveryTargetTLI),
>                                errdetail("Latest checkpoint is at %X/%X on 
> timeline %u, but in the history of the requested timeline, the server forked 
> off from that timeline at %X/%X.",
>                                                  
> LSN_FORMAT_ARGS(ControlFile->checkPoint),

Hm, this one arguably is not corruption, but we still cannot
continue. ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE or maybe a new error code?

Greetings,

Andres Freund

Re: Add missing error codes to PANIC/FATAL error reports in xlogrecovery

Reply via email to