subject:"Make mesage at end\-of\-recovery less scary."

Re: Make mesage at end-of-recovery less scary.

2024-01-22 Thread Kyotaro Horiguchi

At Wed, 17 Jan 2024 14:32:00 +0900, Michael Paquier  wrote 
in 
> On Tue, Jan 16, 2024 at 02:46:02PM +0300, Aleksander Alekseev wrote:
> >> For now, let me explain the basis for this patch. The fundamental
> >> issue is that these warnings that always appear are, in practice, not
> >> a problem in almost all cases. Some of those who encounter them for
> >> the first time may feel uneasy and reach out with inquiries. On the
> >> other hand, those familiar with these warnings tend to ignore them and
> >> only pay attention to details when actual issues arise. Therefore, the
> >> intention of this patch is to label them as "no issue" unless a
> >> problem is blatantly evident, in order to prevent unnecessary concern.
> > 
> > I agree and don't mind affecting the error message per se.
> > 
> > However I see that the actual logic of how WAL is processed is being
> > changed. If we do this, at very least it requires thorough thinking. I
> > strongly suspect that the proposed code is wrong and/or not safe
> > and/or less safe than it is now for the reasons named above.
> 
> FWIW, that pretty much sums up my feeling regarding this patch,
> because an error, basically any error, would hurt back very badly.
> Sure, the error messages we generate now when reaching the end of WAL
> can sound scary, and they are (I suspect that's not really the case
> for anybody who has history doing support with PostgreSQL because a
> bunch of these messages are old enough to vote, but I can understand
> that anybody would freak out the first time they see that).
> 
> However, per the recent issues we've had in this area, like
> cd7f19da3468 but I'm more thinking about 6b18b3fe2c2f and
> bae868caf222, I am of the opinion that the header validation, the
> empty page case in XLogReaderValidatePageHeader() and the record read
> changes are risky enough that I am not convinced that the gains are
> worth the risks taken.
> 
> The error stack in the WAL reader is complicated enough that making it
> more complicated as the patch proposes does not sound like not a good
> tradeoff to me to make the reports related to the end of WAL cleaner
> for the end-user.  I agree that we should do something, but the patch
> does not seem like a good step towards this goal.  Perhaps somebody
> would be more excited about this proposal than I am, of course.

Thank you both for the comments. The criticism seems valid. The
approach to identifying the end-of-WAL state in this patch is quite
heuristic, and its validity or safety can certainly be contested. On
the other hand, if we seek perfection in this area of judgment, we may
need to have the WAL format itself more robust. In any case, since the
majority of the feedback on this patch seems to be negative, I am
going to withdraw it if no supportive opinions emerge during this
commit-fest.

The attached patch addresses the errors reported by CF-bot.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 933d10fa6c7b71e4684f5ba38e85177afaa56f58 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 7 Mar 2023 14:55:58 +0900
Subject: [PATCH v29] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.  To make
sure that the detection is correct, this patch checks if all trailing
bytes in the same page are zeroed in that case.
---
 src/backend/access/transam/xlogreader.c   | 147 ++
 src/backend/access/transam/xlogrecovery.c |  96 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/bin/pg_waldump/t/001_basic.pl |   5 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/035_recovery.pl   | 130 +++
 src/test/recovery/t/039_end_of_wal.pl |  47 +--
 8 files changed, 383 insertions(+), 63 deletions(-)
 create mode 100644 src/test/recovery/t/035_recovery.pl

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 7190156f2f..94861969eb 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool

Re: Make mesage at end-of-recovery less scary.

2024-01-22 Thread Kyotaro Horiguchi

At Mon, 22 Jan 2024 16:09:28 +1100, Peter Smith  wrote 
in 
> 2024-01 Commitfest.
> 
> Hi, This patch has a CF status of "Needs Review" [1], but it seems
> there were CFbot test failures last time it was run [2]. Please have a
> look and post an updated version if necessary.
> 
> ==
> [1] https://commitfest.postgresql.org/46/2490/
> [2] 
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/2490

Thanks for noticing of that. Will repost a new version.
regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Make mesage at end-of-recovery less scary.

2024-01-21 Thread Peter Smith

2024-01 Commitfest.

Hi, This patch has a CF status of "Needs Review" [1], but it seems
there were CFbot test failures last time it was run [2]. Please have a
look and post an updated version if necessary.

==
[1] https://commitfest.postgresql.org/46/2490/
[2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/2490

Kind Regards,
Peter Smith.

Re: Make mesage at end-of-recovery less scary.

2024-01-16 Thread Michael Paquier

On Tue, Jan 16, 2024 at 02:46:02PM +0300, Aleksander Alekseev wrote:
>> For now, let me explain the basis for this patch. The fundamental
>> issue is that these warnings that always appear are, in practice, not
>> a problem in almost all cases. Some of those who encounter them for
>> the first time may feel uneasy and reach out with inquiries. On the
>> other hand, those familiar with these warnings tend to ignore them and
>> only pay attention to details when actual issues arise. Therefore, the
>> intention of this patch is to label them as "no issue" unless a
>> problem is blatantly evident, in order to prevent unnecessary concern.
> 
> I agree and don't mind affecting the error message per se.
> 
> However I see that the actual logic of how WAL is processed is being
> changed. If we do this, at very least it requires thorough thinking. I
> strongly suspect that the proposed code is wrong and/or not safe
> and/or less safe than it is now for the reasons named above.

FWIW, that pretty much sums up my feeling regarding this patch,
because an error, basically any error, would hurt back very badly.
Sure, the error messages we generate now when reaching the end of WAL
can sound scary, and they are (I suspect that's not really the case
for anybody who has history doing support with PostgreSQL because a
bunch of these messages are old enough to vote, but I can understand
that anybody would freak out the first time they see that).

However, per the recent issues we've had in this area, like
cd7f19da3468 but I'm more thinking about 6b18b3fe2c2f and
bae868caf222, I am of the opinion that the header validation, the
empty page case in XLogReaderValidatePageHeader() and the record read
changes are risky enough that I am not convinced that the gains are
worth the risks taken.

The error stack in the WAL reader is complicated enough that making it
more complicated as the patch proposes does not sound like not a good
tradeoff to me to make the reports related to the end of WAL cleaner
for the end-user.  I agree that we should do something, but the patch
does not seem like a good step towards this goal.  Perhaps somebody
would be more excited about this proposal than I am, of course.
--
Michael

signature.asc
Description: PGP signature

Re: Make mesage at end-of-recovery less scary.

2024-01-16 Thread Aleksander Alekseev

Hi,

> > If I understand correctly, if somehow several FS blocks end up being
> > zeroed (due to OS bug, bit rot, restoring from a corrupted for
> > whatever reason backup, hardware failures, ...) there is non-zero
> > chance that PG will interpret this as a normal situation. To my
> > knowledge this is not what we typically do - typically PG would report
> > an error and ask a human to figure out what happened. Of course the
> > possibility of such a scenario is small, but I don't think that as
> > DBMS developers we can ignore it.
>
> For now, let me explain the basis for this patch. The fundamental
> issue is that these warnings that always appear are, in practice, not
> a problem in almost all cases. Some of those who encounter them for
> the first time may feel uneasy and reach out with inquiries. On the
> other hand, those familiar with these warnings tend to ignore them and
> only pay attention to details when actual issues arise. Therefore, the
> intention of this patch is to label them as "no issue" unless a
> problem is blatantly evident, in order to prevent unnecessary concern.

I agree and don't mind affecting the error message per se.

However I see that the actual logic of how WAL is processed is being
changed. If we do this, at very least it requires thorough thinking. I
strongly suspect that the proposed code is wrong and/or not safe
and/or less safe than it is now for the reasons named above.

-- 
Best regards,
Aleksander Alekseev

Re: Make mesage at end-of-recovery less scary.

2024-01-15 Thread Kyotaro Horiguchi

Thank you for the comments.

At Fri, 12 Jan 2024 15:03:26 +0300, Aleksander Alekseev 
 wrote in 
> ```
> +p = (char *) record;
> +pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> +
> +while (p < pe && *p == 0)
> +p++;
> +
> +if (p == pe)
> ```
> 
> Just as a random thought: perhaps we should make this a separate
> function, as a part of src/port/. It seems to me that this code could
> benefit from using vector instructions some day, similarly to
> memcmp(), memset() etc. Surprisingly there doesn't seem to be a
> standard C function for this. Alternatively one could argue that one
> cycle doesn't make much code to reuse and that the C compiler will
> place SIMD instructions for us. However a counter-counter argument
> would be that we could use a macro or even better an inline function
> and have the same effect except getting a slightly more readable code.

Creating a function with a name like memcmp_byte() should be
straightforward, but implementing it with SIMD right away seems a bit
challenging. Similar operations are already being performed elsewhere
in the code, probably within the stats collector, where memcmp is used
with a statically allocated area that's filled with zeros. If we can
achieve a performance equivalent to memcmp with this new function,
then it definitely seems worth pursuing.

> ```
> - * This is just a convenience subroutine to avoid duplicated code in
> + * This is just a convenience subroutine to avoid duplicate code in
> ```
> 
> This change doesn't seem to be related to the patch. Personally I
> don't mind it though.

Ah, I'm sorry. That was something I mistakenly thought I had written
at the last moment and made modifications to.

> All in all I find v28 somewhat scary. It does much more than "making
> one message less scary" as it was initially intended and what bugged
> me personally, and accordingly touches many more places including
> xlogreader.c, xlogrecovery.c, etc.
> 
> Particularly I have mixed feeling about this:
> 
> ```
> +/*
> + * Consider it as end-of-WAL if all subsequent bytes of this page
> + * are zero. We don't bother checking the subsequent pages since
> + * they are not zeroed in the case of recycled segments.
> + */
> ```
> 
> If I understand correctly, if somehow several FS blocks end up being
> zeroed (due to OS bug, bit rot, restoring from a corrupted for
> whatever reason backup, hardware failures, ...) there is non-zero
> chance that PG will interpret this as a normal situation. To my
> knowledge this is not what we typically do - typically PG would report
> an error and ask a human to figure out what happened. Of course the
> possibility of such a scenario is small, but I don't think that as
> DBMS developers we can ignore it.

For now, let me explain the basis for this patch. The fundamental
issue is that these warnings that always appear are, in practice, not
a problem in almost all cases. Some of those who encounter them for
the first time may feel uneasy and reach out with inquiries. On the
other hand, those familiar with these warnings tend to ignore them and
only pay attention to details when actual issues arise. Therefore, the
intention of this patch is to label them as "no issue" unless a
problem is blatantly evident, in order to prevent unnecessary concern.

> Does anyone agree or maybe I'm making things up?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Make mesage at end-of-recovery less scary.

2024-01-12 Thread Aleksander Alekseev

Hi,

> The errors occurred in a part of the tests for end-of-WAL detection
> added in the master branch. These failures were primarily due to
> changes in the message contents introduced by this patch. During the
> revision, I discovered an issue with the handling of empty pages that
> appear in the middle of reading continuation records. In the previous
> version, such empty pages were mistakenly identified as indicating a
> clean end-of-WAL (that is a LOG). However, they should actually be
> handled as a WARNING, since the record curently being read is broken
> at the empty pages. The following changes have been made in this
> version:
>
> 1. Adjusting the test to align with the error message changes
>   introduced by this patch.
>
> 2. Adding tests for the newly added messages.
>
> 3. Correcting the handling of empty pages encountered during the
>   reading of continuation records. (XLogReaderValidatePageHeader)
>
> 4. Revising code comments.
>
> 5. Changing the term "log segment" to "WAL
>   segment". (XLogReaderValidatePageHeader)
>
> regards.

Thanks for the updated patch.

```
+p = (char *) record;
+pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
+
+while (p < pe && *p == 0)
+p++;
+
+if (p == pe)
```

Just as a random thought: perhaps we should make this a separate
function, as a part of src/port/. It seems to me that this code could
benefit from using vector instructions some day, similarly to
memcmp(), memset() etc. Surprisingly there doesn't seem to be a
standard C function for this. Alternatively one could argue that one
cycle doesn't make much code to reuse and that the C compiler will
place SIMD instructions for us. However a counter-counter argument
would be that we could use a macro or even better an inline function
and have the same effect except getting a slightly more readable code.

```
- * This is just a convenience subroutine to avoid duplicated code in
+ * This is just a convenience subroutine to avoid duplicate code in
```

This change doesn't seem to be related to the patch. Personally I
don't mind it though.

All in all I find v28 somewhat scary. It does much more than "making
one message less scary" as it was initially intended and what bugged
me personally, and accordingly touches many more places including
xlogreader.c, xlogrecovery.c, etc.

Particularly I have mixed feeling about this:

```
+/*
+ * Consider it as end-of-WAL if all subsequent bytes of this page
+ * are zero. We don't bother checking the subsequent pages since
+ * they are not zeroed in the case of recycled segments.
+ */
```

If I understand correctly, if somehow several FS blocks end up being
zeroed (due to OS bug, bit rot, restoring from a corrupted for
whatever reason backup, hardware failures, ...) there is non-zero
chance that PG will interpret this as a normal situation. To my
knowledge this is not what we typically do - typically PG would report
an error and ask a human to figure out what happened. Of course the
possibility of such a scenario is small, but I don't think that as
DBMS developers we can ignore it.

Does anyone agree or maybe I'm making things up?

-- 
Best regards,
Aleksander Alekseev

Re: Make mesage at end-of-recovery less scary.

2024-01-10 Thread Kyotaro Horiguchi

At Fri, 5 Jan 2024 16:02:24 +0530, vignesh C  wrote in 
> On Wed, 22 Nov 2023 at 13:01, Kyotaro Horiguchi  
> wrote:
> >
> > Anyway, this requires rebsaing, and done.
> 
> Few tests are failing at [1], kindly post an updated patch:

Thanks!

The errors occurred in a part of the tests for end-of-WAL detection
added in the master branch. These failures were primarily due to
changes in the message contents introduced by this patch. During the
revision, I discovered an issue with the handling of empty pages that
appear in the middle of reading continuation records. In the previous
version, such empty pages were mistakenly identified as indicating a
clean end-of-WAL (that is a LOG). However, they should actually be
handled as a WARNING, since the record curently being read is broken
at the empty pages. The following changes have been made in this
version:

1. Adjusting the test to align with the error message changes
  introduced by this patch.

2. Adding tests for the newly added messages.

3. Correcting the handling of empty pages encountered during the
  reading of continuation records. (XLogReaderValidatePageHeader)

4. Revising code comments.

5. Changing the term "log segment" to "WAL
  segment". (XLogReaderValidatePageHeader)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 47cc39a212d7fd6857f30c35c76bcdd0d26bbc3f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 7 Mar 2023 14:55:58 +0900
Subject: [PATCH v28] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.  To make
sure that the detection is correct, this patch checks if all trailing
bytes in the same page are zeroed in that case.
---
 src/backend/access/transam/xlogreader.c   | 147 ++
 src/backend/access/transam/xlogrecovery.c |  96 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/035_recovery.pl   | 130 +++
 src/test/recovery/t/039_end_of_wal.pl |  47 +--
 7 files changed, 380 insertions(+), 61 deletions(-)
 create mode 100644 src/test/recovery/t/035_recovery.pl

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 7190156f2f..94861969eb 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -553,6 +556,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -636,16 +640,12 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Verify the record header.
 	 *
 	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
+	 * header might not fit on this page.
 	 */
 	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
 
 	/*
 	 * If the whole record header is on this page, validate it immediately.
@@ -664,19 +664,19 @@ restart:
 	}
 	else
 	{
-		/* There may be no next page if it's too small. */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: expected at least %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		/*
+		 * xl_tot_len is the first field of the struct, so it must be on this
+		 * page (the records are MAXALIGNed), but we cannot

Re: Make mesage at end-of-recovery less scary.

2024-01-05 Thread vignesh C

On Wed, 22 Nov 2023 at 13:01, Kyotaro Horiguchi  wrote:
>
> Anyway, this requires rebsaing, and done.

Few tests are failing at [1], kindly post an updated patch:
/tmp/cirrus-ci-build/src/test/recovery --testgroup recovery --testname
039_end_of_wal -- /usr/local/bin/perl -I
/tmp/cirrus-ci-build/src/test/perl -I
/tmp/cirrus-ci-build/src/test/recovery
/tmp/cirrus-ci-build/src/test/recovery/t/039_end_of_wal.pl
[23:53:10.370] ― ✀
―
[23:53:10.370] stderr:
[23:53:10.370] # Failed test 'xl_tot_len zero'
[23:53:10.370] # at
/tmp/cirrus-ci-build/src/test/recovery/t/039_end_of_wal.pl line 267.
[23:53:10.370] # Failed test 'xlp_magic zero'
[23:53:10.370] # at
/tmp/cirrus-ci-build/src/test/recovery/t/039_end_of_wal.pl line 340.
[23:53:10.370] # Failed test 'xlp_magic zero (split record header)'
[23:53:10.370] # at
/tmp/cirrus-ci-build/src/test/recovery/t/039_end_of_wal.pl line 445.
[23:53:10.370] # Looks like you failed 3 tests of 14.
[23:53:10.370]
[23:53:10.370] (test program exited with status code 3)
[23:53:10.370] 
――

[1] - https://cirrus-ci.com/task/5859293157654528

Regards,
Vignesh

Re: Make mesage at end-of-recovery less scary.

2023-11-21 Thread Kyotaro Horiguchi

Anyway, this requires rebsaing, and done.

Thanks for John (Naylor) for pointing this out.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From e56f1f24523e3e562a4db166dfeaadc79fd7b27a Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 7 Mar 2023 14:55:58 +0900
Subject: [PATCH v27] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.  To make
sure that the detection is correct, this patch checks if all trailing
bytes in the same page are zeroed in that case.
---
 src/backend/access/transam/xlogreader.c   | 134 ++
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/035_recovery.pl   | 130 +
 6 files changed, 327 insertions(+), 52 deletions(-)
 create mode 100644 src/test/recovery/t/035_recovery.pl

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index e0baa86bd3..ce65f99c60 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -525,6 +528,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -608,16 +612,12 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Verify the record header.
 	 *
 	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
+	 * header might not fit on this page.
 	 */
 	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
 
 	/*
 	 * If the whole record header is on this page, validate it immediately.
@@ -636,19 +636,19 @@ restart:
 	}
 	else
 	{
-		/* There may be no next page if it's too small. */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: expected at least %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		/*
+		 * xl_tot_len is the first field of the struct, so it must be on this
+		 * page (the records are MAXALIGNed), but we cannot access any other
+		 * fields until we've verified that we got the whole header.
+		 */
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
-		/* We'll validate the header once we have the next page. */
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Try to find space to decode this record, if we can do so without
 	 * calling palloc.  If we can't, we'll try again below after we've
@@ -1091,25 +1091,81 @@ XLogReaderInvalReadState(XLogReaderState *state)
 	state->readLen = 0;
 }
 
+/*
+ * Validate record length of an XLOG record header.
+ *
+ * This is substantially a part of ValidXLogRecordHeader.  But XLogReadRecord
+ * needs this separate from the function in case of a partial record header.
+ *
+ * Returns true if the xl_tot_len header field has a seemingly valid value,
+ * which means the caller can proceed reading to the following part of the
+ * record.
+ */
+static bool
+ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+	  XLogRecord *record)
+{
+	if (record->xl_tot_len == 0)
+	{
+		char	   *p;
+		char	   *pe;
+
+		/*
+		 * We are almost sure reaching the end of WAL, make sure that the
+		 * whole page after the

Re: Make mesage at end-of-recovery less scary.

2023-07-19 Thread Kyotaro Horiguchi

At Mon, 17 Jul 2023 15:20:30 +0300, Aleksander Alekseev 
 wrote in 
> Thanks for working on this, it bugged me for a while. I noticed that
> cfbot is not happy with the patch so I rebased it.
> postgresql:pg_waldump test suite didn't pass after the rebase. I fixed
> it too. Other than that the patch LGTM so I'm not changing its status
> from "Ready for Committer".

Thanks for the rebasing.

> It looks like the patch was moved between the commitfests since
> 2020... If there is anything that may help merging it into PG17 please
> let me know.

This might be just too-much or there might be some doubt in this..

This change basically makes a zero-length record be considered as the
normal end of WAL.

The most controvorsial point I think in the design is the criteria for
an error condition. The assumption is that the WAL is sound if all
bytes following a complete record, up to the next page boundary, are
zeroed out. This is slightly narrower than the original criteria,
merely checking the next record is zero-length.  Naturally, there
might be instances where that page has been blown out due to device
failure or some other reasons. Despite this, I believe it is
preferable rather than always issuing a warning (in the LOG level,
though) about a potential WAL corruption.

I've adjusted the condition for muting repeated log messages at the
same LSN, changing it from ==LOG to <=WARNING. This is simply a
consequence of following the change of "real" warnings from LOG to
WARNING. I believe this is acceptable even without considering
aforementioned change, as any single retriable (

Re: Make mesage at end-of-recovery less scary.

2023-07-17 Thread Aleksander Alekseev

Hi,

> Thanks for checking it!
>
> I think 4ac30ba4f2 is that, which changes a few error
> messages. Addition to rebasing, I rewrote some code comments of
> xlogreader.c and revised the additional test script.

Thanks for working on this, it bugged me for a while. I noticed that
cfbot is not happy with the patch so I rebased it.
postgresql:pg_waldump test suite didn't pass after the rebase. I fixed
it too. Other than that the patch LGTM so I'm not changing its status
from "Ready for Committer".

It looks like the patch was moved between the commitfests since
2020... If there is anything that may help merging it into PG17 please
let me know.

-- 
Best regards,
Aleksander Alekseev


v26-0001-Make-End-Of-Recovery-error-less-scary.patch
Description: Binary data

Re: Make mesage at end-of-recovery less scary.

2023-03-06 Thread Kyotaro Horiguchi

At Mon, 6 Mar 2023 14:58:15 -0500, "Gregory Stark (as CFM)" 
 wrote in 
> It looks like this needs a rebase and at a quick glance it looks like
> more than a trivial conflict. I'll mark it Waiting on Author. Please
> update it back when it's rebased

Thanks for checking it!

I think 4ac30ba4f2 is that, which changes a few error
messages. Addition to rebasing, I rewrote some code comments of
xlogreader.c and revised the additional test script.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From d8b7ff96f48ede26390fa2208460ee2a8ea9cd87 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 7 Mar 2023 14:55:58 +0900
Subject: [PATCH v25] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.  To make
sure that the detection is correct, this patch checks if all trailing
bytes in the same page are zeroed in that case.
---
 src/backend/access/transam/xlogreader.c   | 135 ++
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/035_recovery.pl   | 130 +
 6 files changed, 329 insertions(+), 51 deletions(-)
 create mode 100644 src/test/recovery/t/035_recovery.pl

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index cadea21b37..5a27c10bbb 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -641,16 +645,12 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Verify the record header.
 	 *
 	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
+	 * header might not fit on this page.
 	 */
 	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
 
 	/*
 	 * If the whole record header is on this page, validate it immediately.
@@ -669,18 +669,21 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: expected at least %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		/*
+		 * xl_tot_len is the first field of the struct, so it must be on this
+		 * page (the records are MAXALIGNed), but we cannot access any other
+		 * fields until we've verified that we got the whole header.
+		 *
+		 * XXX: more validation should be done here
+		 */
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1106,25 +1109,81 @@ XLogReaderInvalReadState(XLogReaderState *state)
 	state->readLen = 0;
 }
 
+/*
+ * Validate record length of an XLOG record header.
+ *
+ * This is substantially a part of ValidXLogRecordHeader.  But XLogReadRecord
+ * needs this separate from the function in case of a partial record header.
+ *
+ * Returns true if the xl_tot_len header field has a seemingly valid value,
+ * which means the caller

Re: Make mesage at end-of-recovery less scary.

2023-03-06 Thread Gregory Stark (as CFM)

It looks like this needs a rebase and at a quick glance it looks like
more than a trivial conflict. I'll mark it Waiting on Author. Please
update it back when it's rebased




--
Gregory Stark
As Commitfest Manager

Re: Make mesage at end-of-recovery less scary.

2023-02-06 Thread Kyotaro Horiguchi

Thanks!

At Fri, 3 Feb 2023 15:16:02 +0100, Alvaro Herrera  
wrote in 
> So this patch is now failing because it applies new tests to
> 011_crash_recovery.pl, which was removed recently.  Can you please move
> them elsewhere?

I don't find an appropriate file to move to. In the end I created a
new file with the name 034_recovery.pl.  I added a test for standbys,
too. (which is the first objective of this patch.)

> I think the comment for ValidXLogRecordLength should explain what the
> return value is.

Agreed.


/*
  * Validate record length of an XLOG record header.
  *
  * This is substantially a part of ValidXLogRecordHeader.  But XLogReadRecord
  * needs this separate from the function in case of a partial record header.
+ *
+ * Returns true if the xl_tot_len header field has a seemingly valid value,
+ * which means the caller can proceed reading to the following part of the
+ * record.
  */
 static bool
 ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,

I added a similar description to ValidXLogRecordHeader.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From c58ca4d5db52c75dec9882d158d5724e12617005 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 30 Nov 2022 11:51:46 +0900
Subject: [PATCH v24] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.  To make
sure that the detection is correct, this patch checks if all trailing
bytes in the same page are zeroed in that case.
---
 src/backend/access/transam/xlogreader.c   | 144 +-
 src/backend/access/transam/xlogrecovery.c |  94 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/034_recovery.pl   | 135 
 6 files changed, 335 insertions(+), 59 deletions(-)
 create mode 100644 src/test/recovery/t/034_recovery.pl

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index aa6c929477..8cb2d55333 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 *

Re: Make mesage at end-of-recovery less scary.

2023-02-03 Thread Alvaro Herrera

So this patch is now failing because it applies new tests to
011_crash_recovery.pl, which was removed recently.  Can you please move
them elsewhere?

I think the comment for ValidXLogRecordLength should explain what the
return value is.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/

Re: Make mesage at end-of-recovery less scary.

2022-11-29 Thread Kyotaro Horiguchi

At Tue, 22 Nov 2022 16:04:56 -0600, Justin Pryzby  wrote 
in 
> On Fri, Nov 18, 2022 at 05:25:37PM +0900, Kyotaro Horiguchi wrote:
> > +   while (*p == 0 && p < pe)
> > +   p++;
> 
> The bug reported by Andres/cfbot/ubsan is here.
>
> Fixed in attached.

Ur..ou..

-   while (*p == 0 && p < pe)
+   while (p < pe && *p == 0)

It was an off-by-one error. Thanks!

> I didn't try to patch the test case to output the failing stderr, but
> that might be good.

I have made use of Cluster::wait_for_log(), but still find_in_log() is
there since it is used to check if a message that should not be logged
is not logged.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 37dd82652bf028811002af9e7c0df9e9d2ddb7d7 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 30 Nov 2022 11:51:46 +0900
Subject: [PATCH v23] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 137 +-
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl |  96 +++
 6 files changed, 289 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b254..137de96795 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-

Re: Make mesage at end-of-recovery less scary.

2022-11-22 Thread Justin Pryzby

On Fri, Nov 18, 2022 at 05:25:37PM +0900, Kyotaro Horiguchi wrote:
> + while (*p == 0 && p < pe)
> + p++;

The bug reported by Andres/cfbot/ubsan is here.

Fixed in attached.

I didn't try to patch the test case to output the failing stderr, but
that might be good.
>From 9bdf59ed0d78fff3f690584fc3c49c863d53f321 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 15 Nov 2022 13:41:46 +0900
Subject: [PATCH] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 135 +-
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 298 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b2544..137de967951 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1106,16 +1102,47 @@ XLogReaderInvalReadState(XLogReaderState *state)
 }
 
 /*
- *

Re: Make mesage at end-of-recovery less scary.

2022-11-22 Thread Andres Freund

Hi,

On 2022-11-18 17:25:37 +0900, Kyotaro Horiguchi wrote:
> Just rebased.

Fails with address sanitizer:
https://cirrus-ci.com/task/5632986241564672

Unfortunately one of the failures is in pg_waldump and we don't seem to
capture its output in 011_crash_recovery. So we don't see the nice formattted
output...

[11:07:18.868] #0  0x7fcf43803ce1 in raise () from 
/lib/x86_64-linux-gnu/libc.so.6
[11:07:18.912] 
[11:07:18.912] Thread 1 (Thread 0x7fcf43662780 (LWP 39124)):
[11:07:18.912] #0  0x7fcf43803ce1 in raise () from 
/lib/x86_64-linux-gnu/libc.so.6
[11:07:18.912] No symbol table info available.
[11:07:18.912] #1  0x7fcf437ed537 in abort () from 
/lib/x86_64-linux-gnu/libc.so.6
[11:07:18.912] No symbol table info available.
[11:07:18.912] #2  0x7fcf43b8511b in __sanitizer::Abort () at 
../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:155
[11:07:18.912] No locals.
[11:07:18.912] #3  0x7fcf43b8fce8 in __sanitizer::Die () at 
../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:58
[11:07:18.912] No locals.
[11:07:18.912] #4  0x7fcf43b7244c in 
__asan::ScopedInErrorReport::~ScopedInErrorReport (this=0x7ffd4fde18e6, 
__in_chrg=) at 
../../../../src/libsanitizer/asan/asan_report.cpp:186
[11:07:18.912] buffer_copy = 
{<__sanitizer::InternalMmapVectorNoCtor> = {data_ = 0x7fcf4035 '=' 
, "\n==39124==ERROR: AddressSanitizer: heap-buffer-overflow 
on address 0x62502100 at pc 0x55c36c21e315 bp 0x7ffd4fde2550 sp 
0x7ffd4fde2"..., capacity_bytes_ = 65536, size_ = }, }
...
[11:07:18.912] #6  0x7fcf43b72788 in __asan::__asan_report_load1 
(addr=) at ../../../../src/libsanitizer/asan/asan_rtl.cpp:117
[11:07:18.912] bp = 140725943412048
[11:07:18.912] pc = 
[11:07:18.912] local_stack = 140528180793728
[11:07:18.912] sp = 140725943412040
[11:07:18.912] #7  0x55c36c21e315 in ValidXLogRecordLength 
(state=state@entry=0x61a00680, RecPtr=RecPtr@entry=33655480, 
record=record@entry=0x62500bb8) at xlogreader.c:1126
[11:07:18.912] p = 
[11:07:18.912] pe = 0x62502100 ""
[11:07:18.912] #8  0x55c36c21e3b1 in ValidXLogRecordHeader 
(state=state@entry=0x61a00680, RecPtr=RecPtr@entry=33655480, 
PrevRecPtr=33655104, record=record@entry=0x62500bb8, 
randAccess=randAccess@entry=false) at xlogreader.c:1169
[11:07:18.912] No locals.

The  most important bit is "AddressSanitizer: heap-buffer-overflow on address 
0x625\
02100 at pc 0x55c36c21e315 bp 0x7ffd4fde2550 sp 0x7ffd4fde2"

Greetings,

Andres Freund

Re: Make mesage at end-of-recovery less scary.

2022-11-18 Thread Kyotaro Horiguchi

Just rebased.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 1efe0601596807c25769370f38884c7027a00839 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 15 Nov 2022 13:41:46 +0900
Subject: [PATCH v22] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 137 +-
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 299 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b254..f891a62944 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1105,6 +1101,60 @@ XLogReaderInvalReadState(XLogReaderState *state)
 	state->readLen = 0;
 }
 
+/*
+ * Validate record length of an XLOG record header.
+ *
+ * This is substantially a part of ValidXLogRecordHeader.  But XLogReadRecord
+ * needs this separate from the function in case of a partial

Re: Make mesage at end-of-recovery less scary.

2022-10-27 Thread Justin Pryzby

rebased
>From 67ce65038ae6a7d5b023b7472df9f9ca9835d5f5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Jul 2022 11:51:45 +0900
Subject: [PATCH] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 135 +-
 src/backend/access/transam/xlogrecovery.c |  95 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 299 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b2544..f891a629443 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1106,16 +1102,47 @@ XLogReaderInvalReadState(XLogReaderState *state)
 }
 
 /*
- * Validate an XLOG record header.
+ * Validate record length of an XLOG record header.
  *
- * This is just a convenience subroutine to avoid duplicated code in
- * XLogReadRecord.  It's not intended for use from anywhere else.
+ * This is substantially a part of ValidXLogRecordHeader.  But

Re: Make mesage at end-of-recovery less scary.

2022-09-26 Thread Kyotaro Horiguchi

At Fri, 16 Sep 2022 23:21:50 -0500, Justin Pryzby  wrote 
in 
> @cfbot: rebased over adb466150, which did the same thing as one of the
> hunks in xlogreader.c.

Oops. Thanks! And then this gets a further conflict (param names
harmonization). So further rebased.  And removed an extra blank line
you pointed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From b70a33bd941e9845106bea502db30d32e0138251 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Jul 2022 11:51:45 +0900
Subject: [PATCH v21] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 136 +-
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 298 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 4d6c34e0fc..b03eeb1487 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1105,6

Re: Make mesage at end-of-recovery less scary.

2022-09-16 Thread Justin Pryzby

@cfbot: rebased over adb466150, which did the same thing as one of the
hunks in xlogreader.c.
>From c4069bb7181b68d742d2025567f859e69d24f513 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Jul 2022 11:51:45 +0900
Subject: [PATCH] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 134 +-
 src/backend/access/transam/xlogrecovery.c |  95 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 298 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 050d2f424e4..9b8f29d0ad0 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool non_blocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1106,16 +1102,47 @@ XLogReaderInvalReadState(XLogReaderState *state)
 }
 
 /*
- * Validate an XLOG record header.
+ * Validate record length of an XLOG record header.
  *
- * This is just a convenience subroutine to avoid duplicated code in
- * XLogReadRecord.  It's not intended for

Re: Make mesage at end-of-recovery less scary.

2022-07-07 Thread Kyotaro Horiguchi

At Wed, 6 Jul 2022 11:05:51 -0700, Jacob Champion  
wrote in 
> On Mon, Mar 28, 2022 at 11:07 PM Kyotaro Horiguchi
>  wrote:
> >
> > Rebased.
> 
> Unfortunately this will need another rebase over latest.

Thanks! Done. 

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From ff8a069ca587c55e06f0700492edc8fc16d15138 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Jul 2022 11:51:45 +0900
Subject: [PATCH v19] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 145 +-
 src/backend/access/transam/xlogrecovery.c |  95 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 
 6 files changed, 308 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index f3dc4b7797..a86ab2b02b 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool non_blocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -554,6 +557,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -635,25 +639,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -663,18 +663,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -906,6 +902,15 @@ err:
 		 */
 		state->abortedRecPtr = RecPtr;
 		state->missingContrecPtr =

Re: Make mesage at end-of-recovery less scary.

2022-07-06 Thread Michael Paquier

On Wed, Jul 06, 2022 at 11:05:51AM -0700, Jacob Champion wrote:
> [CFM hat] Looking through the history here, this has been bumped to
> Ready for Committer a few times and then bumped back to Needs Review
> after a required rebase. What's the best way for us to provide support
> for contributors who get stuck in this loop? Maybe we can be more
> aggressive about automated notifications when a RfC patch goes red in
> the cfbot?

Having a better integration between the CF bot and the CF app would be
great, IMO.  People tend to easily forget about what they send in my
experience, even if they manage a small pool of patches or a larger
one.
--
Michael


signature.asc
Description: PGP signature

Re: Make mesage at end-of-recovery less scary.

2022-07-06 Thread Jacob Champion

On Mon, Mar 28, 2022 at 11:07 PM Kyotaro Horiguchi
 wrote:
>
> Rebased.

Unfortunately this will need another rebase over latest.

[CFM hat] Looking through the history here, this has been bumped to
Ready for Committer a few times and then bumped back to Needs Review
after a required rebase. What's the best way for us to provide support
for contributors who get stuck in this loop? Maybe we can be more
aggressive about automated notifications when a RfC patch goes red in
the cfbot?

Thanks,
--Jacob

Re: Make mesage at end-of-recovery less scary.

2022-03-29 Thread Kyotaro Horiguchi

me> Rebased to the current HEAD.

b64c3bd62e (removal of unused "use Config") conflicted on a TAP
script.

Rebased.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From e1492816913efcb4fc25ee6a3bafd27a6c5f3f9a Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v18] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 145 +-
 src/backend/access/transam/xlogrecovery.c |  92 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 
 6 files changed, 306 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index e437c42992..0942265408 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -46,6 +46,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool non_blocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -147,6 +149,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -552,6 +555,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -633,25 +637,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -661,18 +661,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -904,6 +900,15 @@ err:
 		 */
 		state->abortedRecPtr = RecPtr;
 		state->missingContrecPtr = targetPagePtr;
+
+		/*
+		 * If the message is not set yet, that means we failed to load the
+		 * page for the

Re: Make mesage at end-of-recovery less scary.

2022-03-21 Thread Kyotaro Horiguchi

At Mon, 21 Mar 2022 17:01:19 -0700, Andres Freund  wrote in 
> Patch currently fails to apply, needs a rebase:
> http://cfbot.cputube.org/patch_37_2490.log

Thanks for noticing me of that.

Rebased to the current HEAD.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From a7c9f36e631eaba5078398598dae5d459e79add9 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v17] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 145 +-
 src/backend/access/transam/xlogrecovery.c |  92 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 
 6 files changed, 306 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index e437c42992..0942265408 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -46,6 +46,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool non_blocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -147,6 +149,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -552,6 +555,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -633,25 +637,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -661,18 +661,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -904,6 +900,15 @@ err:
 		 */
 		state->abortedRecPtr = RecPtr;
 		state->missingContrecPtr =

Re: Make mesage at end-of-recovery less scary.

2022-03-21 Thread Andres Freund

Hi,

On 2022-03-04 09:43:59 +0900, Kyotaro Horiguchi wrote:
> On second thought the two seems repeating the same things.  Thus I
> merged the two comments together.  In this verion 16 it looks like
> this.

Patch currently fails to apply, needs a rebase:
http://cfbot.cputube.org/patch_37_2490.log

Greetings,

Andres Freund

Re: Make mesage at end-of-recovery less scary.

2022-03-03 Thread Kyotaro Horiguchi

At Thu, 3 Mar 2022 15:39:44 +0530, Ashutosh Sharma  
wrote in 
> The new changes made in the patch look good. Thanks to the recent
> changes to speed WAL insertion that have helped us catch this bug.

Thanks for the quick checking.

> One small comment:
> 
> record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
> -   total_len = record->xl_tot_len;
> 
> Do you think we need to change the position of the comments written
> for above code that says:

Yeah, I didn't do that since it is about header verification.  But as
you pointed, the result still doesn't look perfect.

On second thought the two seems repeating the same things.  Thus I
merged the two comments together.  In this verion 16 it looks like
this.

>   /*
>* Validate the record header.
>*
>* Even though we use an XLogRecord pointer here, the whole record 
> header
>* might not fit on this page.  If the whole record header is on this 
> page,
>* validate it immediately.  Even otherwise xl_tot_len must be on this 
> page
>* (it is the first field of MAXALIGNed records), but we still cannot
>* access any further fields until we've verified that we got the whole
>* header, so do just a basic sanity check on record length, and 
> validate
>* the rest of the header after reading it from the next page.  The 
> length
>* check is necessary here to ensure that we enter the "Need to 
> reassemble
>* record" code path below; otherwise we might fail to apply
>* ValidXLogRecordHeader at all.
>*/
>   record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
>
>   if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 00d848df6bb8b9966dfbd39c98a388fda42a3e3c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v16] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 144 +-
 src/backend/access/transam/xlogrecovery.c |  92 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 
 6 files changed, 305 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 35029cf97d..bd0f211a23 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -42,6 +42,8 @@ static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
 static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -121,6 +123,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -292,6 +295,7 @@ XLogReadRecord(XLogReaderState *state, char **errormsg)
 	/* reset error state */
 	*errormsg = NULL;
 	state->errormsg_buf[0] = '\0';
+	state->EndOfWAL = false;
 
 	ResetDecoder(state);
 	state->abortedRecPtr = InvalidXLogRecPtr;
@@ -371,25 +375,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 *

Re: Make mesage at end-of-recovery less scary.

2022-03-03 Thread Ashutosh Sharma

On Wed, Mar 2, 2022 at 7:47 AM Kyotaro Horiguchi
 wrote:
>
> At Sat, 19 Feb 2022 09:31:33 +0530, Ashutosh Sharma  
> wrote in
> > The changes looks good. thanks.!
>
> Thanks!
>
> Some recent core change changed WAL insertion speed during the TAP
> test and revealed one forgotton case of EndOfWAL.  When a record
> header flows into the next page, XLogReadRecord does separate check
> from ValidXLogRecordHeader by itself.
>

The new changes made in the patch look good. Thanks to the recent
changes to speed WAL insertion that have helped us catch this bug.

One small comment:

record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-   total_len = record->xl_tot_len;

Do you think we need to change the position of the comments written
for above code that says:

/*
 * Read the record length.
 *
...
...

--
With Regards,
Ashutosh Sharma.

Re: Make mesage at end-of-recovery less scary.

2022-03-01 Thread Kyotaro Horiguchi

At Sat, 19 Feb 2022 09:31:33 +0530, Ashutosh Sharma  
wrote in 
> The changes looks good. thanks.!

Thanks!

Some recent core change changed WAL insertion speed during the TAP
test and revealed one forgotton case of EndOfWAL.  When a record
header flows into the next page, XLogReadRecord does separate check
from ValidXLogRecordHeader by itself.

>* If the whole record header is on this page, validate it immediately.
>* Otherwise do just a basic sanity check on xl_tot_len, and validate 
> the
>* rest of the header after reading it from the next page.  The 
> xl_tot_len
>* check is necessary here to ensure that we enter the "Need to 
> reassemble
>* record" code path below; otherwise we might fail to apply
>* ValidXLogRecordHeader at all.
>*/
>   if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
>   {
...
>}
>   else
>   {
>   /* XXX: more validation should be done here */
>   if (total_len < SizeOfXLogRecord)
>   {

I could simplly copy-in a part of ValidXLogRecordHeader there but that
results in rather large duplicate code. I could have
ValidXLogRecordHeader handle the partial header case but it seems to
me complex.

So in this version I split the xl_tot_len part of
ValidXLogRecordHeader into ValidXLogRecordLength.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 01cce076d2b3ad536398cc2b716ef64ed9b2c409 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v15] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 125 ++
 src/backend/access/transam/xlogrecovery.c |  92 
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 ++
 6 files changed, 297 insertions(+), 47 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 35029cf97d..ba1c1ece87 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -42,6 +42,8 @@ static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
 static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -121,6 +123,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -292,6 +295,7 @@ XLogReadRecord(XLogReaderState *state, char **errormsg)
 	/* reset error state */
 	*errormsg = NULL;
 	state->errormsg_buf[0] = '\0';
+	state->EndOfWAL = false;
 
 	ResetDecoder(state);
 	state->abortedRecPtr = InvalidXLogRecPtr;
@@ -380,12 +384,11 @@ restart:
 	 * whole header.
 	 */
 	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
 
 	/*
 	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Otherwise do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
@@ -399,18 +402,13 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
 	len = XLOG_BLCKSZ - RecPtr % XLOG_BLCKSZ;
 	if (total_len > len)
 	{
@@ -588,6 +586,15 @@ err:
 		 */
 		state->abortedRecPtr = RecPtr;

Re: Make mesage at end-of-recovery less scary.

2022-02-18 Thread Ashutosh Sharma

On Thu, Feb 17, 2022 at 1:20 PM Kyotaro Horiguchi
 wrote:
>
> At Tue, 15 Feb 2022 20:17:20 +0530, Ashutosh Sharma  
> wrote in
> > OK. The v13 patch looks good. I have marked it as ready to commit.
> > Thank you for working on all my review comments.
>
> Thaks! But the recent xlog.c refactoring crashes into this patch.
> And I found a silly bug while rebasing.
>
> xlog.c:12463 / xlogrecovery.c:3168
> if (!WaitForWALToBecomeAvailable(targetPagePtr + reqLen,
> ..
> {
> +   Assert(!StandbyMode);
> ...
> +   xlogreader->EndOfWAL = true;
>
> Yeah, I forgot about promotion there..

Yes, we exit WaitForWALToBecomeAvailable() even in standby mode
provided the user has requested for the promotion. So checking for the
!StandbyMode condition alone was not enough.

So what I should have done is
> setting EndOfWAL according to StandbyMode.
>
> +   Assert(!StandbyMode || CheckForStandbyTrigger());
> ...
> +   /* promotion exit is not end-of-WAL */
> +   xlogreader->EndOfWAL = !StandbyMode;
>

The changes looks good. thanks.!

--
With Regards,
Ashutosh Sharma.

Re: Make mesage at end-of-recovery less scary.

2022-02-17 Thread Ashutosh Sharma

On Thu, Feb 17, 2022 at 1:20 PM Kyotaro Horiguchi
 wrote:
>
> At Tue, 15 Feb 2022 20:17:20 +0530, Ashutosh Sharma  
> wrote in
> > OK. The v13 patch looks good. I have marked it as ready to commit.
> > Thank you for working on all my review comments.
>
> Thaks! But the recent xlog.c refactoring crashes into this patch.
> And I found a silly bug while rebasing.
>

Thanks.! I'll take a look at the new changes.

--
With Regards,
Ashutosh Sharma.

Re: Make mesage at end-of-recovery less scary.

2022-02-16 Thread Kyotaro Horiguchi

At Tue, 15 Feb 2022 20:17:20 +0530, Ashutosh Sharma  
wrote in 
> OK. The v13 patch looks good. I have marked it as ready to commit.
> Thank you for working on all my review comments.

Thaks! But the recent xlog.c refactoring crashes into this patch.
And I found a silly bug while rebasing.

xlog.c:12463 / xlogrecovery.c:3168
if (!WaitForWALToBecomeAvailable(targetPagePtr + reqLen,
..
{
+   Assert(!StandbyMode);
...
+   xlogreader->EndOfWAL = true;

Yeah, I forgot about promotion there..  So what I should have done is
setting EndOfWAL according to StandbyMode.

+   Assert(!StandbyMode || CheckForStandbyTrigger());
...
+   /* promotion exit is not end-of-WAL */
+   xlogreader->EndOfWAL = !StandbyMode;

The rebased v14 is attached.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 5613ee80a4d2a9786f5ce8421dcbb560b63a13c1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v14] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   |  78 
 src/backend/access/transam/xlogrecovery.c |  92 ++-
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 ++
 6 files changed, 268 insertions(+), 29 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 35029cf97d..22982c4de7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -121,6 +121,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -292,6 +293,7 @@ XLogReadRecord(XLogReaderState *state, char **errormsg)
 	/* reset error state */
 	*errormsg = NULL;
 	state->errormsg_buf[0] = '\0';
+	state->EndOfWAL = false;
 
 	ResetDecoder(state);
 	state->abortedRecPtr = InvalidXLogRecPtr;
@@ -588,6 +590,15 @@ err:
 		 */
 		state->abortedRecPtr = RecPtr;
 		state->missingContrecPtr = targetPagePtr;
+
+		/*
+		 * If the message is not set yet, that means we failed to load the
+		 * page for the record.  Otherwise do not hide the existing message.
+		 */
+		if (state->errormsg_buf[0] == '\0')
+			report_invalid_record(state,
+  "missing contrecord at %X/%X",
+  LSN_FORMAT_ARGS(RecPtr));
 	}
 
 	/*
@@ -730,6 +741,40 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
 	  XLogRecPtr PrevRecPtr, XLogRecord *record,
 	  bool randAccess)
 {
+	if (record->xl_tot_len == 0)
+	{
+		/*
+		 * We are almost sure reaching the end of WAL, make sure that the
+		 * whole page after the record is filled with zeroes.
+		 */
+		char	   *p;
+		char	   *pe;
+
+		/* scan from the beginning of the record to the end of block */
+		p = (char *) record;
+		pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
+
+		while (*p == 0 && p < pe)
+			p++;
+
+		if (p == pe)
+		{
+			/*
+			 * The page after the record is completely zeroed. That suggests
+			 * we don't have a record after this point. We don't bother
+			 * checking the pages after since they are not zeroed in the case
+			 * of recycled segments.
+			 */
+			report_invalid_record(state, "empty record at %X/%X",
+  LSN_FORMAT_ARGS(RecPtr));
+
+			/* notify end-of-wal to callers */
+			state->EndOfWAL = true;
+			return false;
+		}
+
+		/* The same condition will be caught as invalid record length */
+	}
 	if (record->xl_tot_len < SizeOfXLogRecord)
 	{
 		report_invalid_record(state,
@@ -836,6 +881,31 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 
 	XLogSegNoOffsetToRecPtr(segno, offset, state->segcxt.ws_segsize, recaddr);
 
+	StaticAssertStmt(XLOG_PAGE_MAGIC != 0, "XLOG_PAGE_MAGIC is zero");
+
+	if (hdr->xlp_magic == 0)
+	{
+		/* Regard an empty page as End-Of-WAL */
+		int			i;
+
+		for (i = 0; i < XLOG_BLCKSZ && phdr[i] == 0; i++);
+		if (i == XLOG_BLCKSZ)
+		{
+			char		fname[MAXFNAMELEN];
+
+			XLogFileName(fname, state->seg.ws_tli, segno,
+		 state->segcxt.ws_segsize);
+
+			report_invalid_record(state,
+  "empty page in log segment %s, offset %u",
+  fname,
+  offset);
+			state->EndOfWAL = true;
+			return false;
+		}
+
+		/* The same condition will be caught as invalid magic number */
+	}
 	if (hdr->xlp_magic !=

Re: Make mesage at end-of-recovery less scary.

2022-02-15 Thread Ashutosh Sharma

On Tue, Feb 15, 2022 at 7:52 AM Kyotaro Horiguchi
 wrote:
>
> At Mon, 14 Feb 2022 20:14:11 +0530, Ashutosh Sharma  
> wrote in
> > No, I haven't tried to compare archive recovery to PITR or vice versa,
> > instead I was trying to compare crash recovery with PITR. The message
> > you're emitting says just before entering into the archive recovery is
> > - "reached end-of-WAL on ... in pg_wal *during crash recovery*,
> > entering archive recovery".  This message is static and can be emitted
> > not only during crash recovery, but also during PITR. I think we can
>
> No. It is emitted *only* after crash recovery before starting archive
> recovery.  Another message this patch adds can be emitted after PITR
> or archive recovery.
>
> > not only during crash recovery, but also during PITR. I think we can
> > remove the "during crash recovery" part from this message, so "reached
> > the end of WAL at %X/%X on timeline %u in %s, entering archive
>
> What makes you think it can be emitted after other than crash recovery?
> (Please look at the code comment just above.)
>

Yep that's right. We won't be coming here in case of pitr.

> > recovery". Also I don't think we need format specifier %s here, it can
> > be hard-coded with pg_wal as in this case we can only enter archive
> > recovery after reading wal from pg_wal, so current WAL source has to
> > be pg_wal, isn't it?
>
> You're right that it can't be other than pg_wal.  It was changed just
> in accordance woth another message this patch adds and it would be a
> matter of taste.  I replaced to "pg_wal" in this version.
>

OK. I have verified the changes.

> > Thanks for the changes. Please note that I am not able to apply the
> > latest patch on HEAD. Could you please rebase it on HEAD and share the
> > new version. Thank you.
>
> A change on TAP script hit this.  The v13 attached is:
>

OK. The v13 patch looks good. I have marked it as ready to commit.
Thank you for working on all my review comments.

--
With Regards,
Ashutosh Sharma.

Re: Make mesage at end-of-recovery less scary.

2022-02-14 Thread Kyotaro Horiguchi

At Mon, 14 Feb 2022 20:14:11 +0530, Ashutosh Sharma  
wrote in 
> No, I haven't tried to compare archive recovery to PITR or vice versa,
> instead I was trying to compare crash recovery with PITR. The message
> you're emitting says just before entering into the archive recovery is
> - "reached end-of-WAL on ... in pg_wal *during crash recovery*,
> entering archive recovery".  This message is static and can be emitted
> not only during crash recovery, but also during PITR. I think we can

No. It is emitted *only* after crash recovery before starting archive
recovery.  Another message this patch adds can be emitted after PITR
or archive recovery.

> not only during crash recovery, but also during PITR. I think we can
> remove the "during crash recovery" part from this message, so "reached
> the end of WAL at %X/%X on timeline %u in %s, entering archive

What makes you think it can be emitted after other than crash recovery?
(Please look at the code comment just above.)

> recovery". Also I don't think we need format specifier %s here, it can
> be hard-coded with pg_wal as in this case we can only enter archive
> recovery after reading wal from pg_wal, so current WAL source has to
> be pg_wal, isn't it?

You're right that it can't be other than pg_wal.  It was changed just
in accordance woth another message this patch adds and it would be a
matter of taste.  I replaced to "pg_wal" in this version.

> Thanks for the changes. Please note that I am not able to apply the
> latest patch on HEAD. Could you please rebase it on HEAD and share the
> new version. Thank you.

A change on TAP script hit this.  The v13 attached is:

- Rebased.

- Replaced "%s" in the debug transition message from crash recovery to
  archive recovery.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 311e862e87dbdeb6348c6fc17063308342359c02 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v13] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c |  91 ++-
 src/backend/access/transam/xlogreader.c   |  78 
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 ++
 6 files changed, 266 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 958220c495..bb7026ac77 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * EndRecPtr is the LSN we tried to read but failed. In the
+ * case of decoding error, it is at the end of the failed
+ * record but we don't have a means for now to know EndRecPtr
+ * is pointing to which of the beginning or ending of the
+ * failed record.
+ */
+ErrRecPtr = xlogreader->EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4504,13 +4517,16 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * We only end up here without a message when XLogPageRead() failed
+			 * in that case we already logged something, or just met end-of-WAL
+			 * conditions. In StandbyMode that only happens if we have been
+			 * triggered, so we shouldn't loop anymore in that case. When
+			 * EndOfWAL is true, we don't emit that error if any immediately
+			 * and instead will show it as a part of a decent end-of-wal
+			 * message later.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4541,11 +4557,14 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return

Re: Make mesage at end-of-recovery less scary.

2022-02-14 Thread Ashutosh Sharma

Hi,

On Thu, Feb 10, 2022 at 11:47 AM Kyotaro Horiguchi
 wrote:
>
> At Wed, 9 Feb 2022 17:31:02 +0530, Ashutosh Sharma  
> wrote in
> > On Wed, Feb 9, 2022 at 1:14 PM Kyotaro Horiguchi
> >  wrote:
> > > This means archive-recovery is requested but not started yet. That is,
> > > we've just finished crash recovery.  The existing comment cited
> > > together is mentioning that.
> > >
> > > At the end of PITR (or archive recovery), the other code works.
> > >
> >
> > This is quite understandable, the point here is that the message that
> > we are emitting says, we have just finished reading the wal files in
> > the pg_wal directory during crash recovery and are now entering
> > archive recovery when we are actually doing point-in-time recovery
> > which seems a bit misleading.
>
> Here is the messages.
>
> > 2022-02-08 18:00:44.367 IST [86185] LOG:  starting point-in-time
> > recovery to WAL location (LSN) "0/5227790"
> > 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not
> > properly shut down; automatic recovery in progress
> > 2022-02-08 18:00:44.369 IST [86185] LOG:  redo starts at 0/14DC8D8
> > 2022-02-08 18:00:44.978 IST [86185] DEBUG1:  reached end of WAL at
> > 0/3D0 on timeline 1 in pg_wal during crash recovery, entering
> > archive recovery
>
> In the first place the last DEBUG1 is not on my part, but one of the
> messages added by this patch says the same thing.  Is your point that
> archive recovery is different thing from PITR?  In regard to the
> difference, I think PITR is a form of archive recovery.
>

No, I haven't tried to compare archive recovery to PITR or vice versa,
instead I was trying to compare crash recovery with PITR. The message
you're emitting says just before entering into the archive recovery is
- "reached end-of-WAL on ... in pg_wal *during crash recovery*,
entering archive recovery".  This message is static and can be emitted
not only during crash recovery, but also during PITR. I think we can
remove the "during crash recovery" part from this message, so "reached
the end of WAL at %X/%X on timeline %u in %s, entering archive
recovery". Also I don't think we need format specifier %s here, it can
be hard-coded with pg_wal as in this case we can only enter archive
recovery after reading wal from pg_wal, so current WAL source has to
be pg_wal, isn't it?

> That being said, after some thoughts on this, I changed my mind that
> we don't need to say what operation was being performed at the
> end-of-WAL.  So in the attached the end-of-WAL message is not
> accompanied by the kind of recovery.
>
> > LOG:  reached end of WAL at 0/300 on timeline 1
>
> I removed the archive-source part along with the operation mode.
> Because it make the message untranslatable.  It is now very simple but
> seems enough.
>
> While working on this, I noticed that we need to set EndOfWAL when
> WaitForWALToBecomeAvailable returned with failure.  That means the
> file does not exist at all so it is a kind of end-of-WAL.  In that
> sense the following existing comment in ReadRecord is a bit wrong.
>
> >* We only end up here without a message when XLogPageRead()
> >* failed - in that case we already logged something. In
> >* StandbyMode that only happens if we have been triggered, so we
> >* shouldn't loop anymore in that case.
>
> Actually there's a case we get there without a message and without
> logged something when a segment file is not found unless we're in
> standby mode.
>
> > > Well. I guess that the "automatic recovery" is ambiguous.  Does it
> > > make sense if the second line were like the follows instead?
> > >
> > > + 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not 
> > > properly shut down; crash recovery in progress
> > >
> >
> > Well, according to me the current message looks fine.
>
> Good to hear. (In the previos version I modified the message by accident..)
>
> > $chkptfile is declared twice in the same scope. We can probably remove
> > the first one.
>
> Ugh.. Fixed.  (I wonder why Perl doesn't complain on this..)
>
>
> In this version 12 I made the following changes.
>
> - Rewrote (halfly reverted) a comment in ReadRecord
>
> - Simplified the "reached end of WAL" message by removing recovery
>   mode and WAL source in ReadRecord.
>
> - XLogPageRead sets EndOfWAL flag in the ENOENT case.
>
> - Removed redundant declaration of the same variable in TAP script.
>

Thanks for the changes. Please note that I am not able to apply the
latest patch on HEAD. Could you please rebase it on HEAD and share the
new version. Thank you.

--
With Regards,
Ashutosh Sharma.

Re: Make mesage at end-of-recovery less scary.

2022-02-09 Thread Kyotaro Horiguchi

At Wed, 9 Feb 2022 17:31:02 +0530, Ashutosh Sharma  
wrote in 
> On Wed, Feb 9, 2022 at 1:14 PM Kyotaro Horiguchi
>  wrote:
> > This means archive-recovery is requested but not started yet. That is,
> > we've just finished crash recovery.  The existing comment cited
> > together is mentioning that.
> >
> > At the end of PITR (or archive recovery), the other code works.
> >
> 
> This is quite understandable, the point here is that the message that
> we are emitting says, we have just finished reading the wal files in
> the pg_wal directory during crash recovery and are now entering
> archive recovery when we are actually doing point-in-time recovery
> which seems a bit misleading.

Here is the messages.

> 2022-02-08 18:00:44.367 IST [86185] LOG:  starting point-in-time
> recovery to WAL location (LSN) "0/5227790"
> 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not
> properly shut down; automatic recovery in progress
> 2022-02-08 18:00:44.369 IST [86185] LOG:  redo starts at 0/14DC8D8
> 2022-02-08 18:00:44.978 IST [86185] DEBUG1:  reached end of WAL at
> 0/3D0 on timeline 1 in pg_wal during crash recovery, entering
> archive recovery

In the first place the last DEBUG1 is not on my part, but one of the
messages added by this patch says the same thing.  Is your point that
archive recovery is different thing from PITR?  In regard to the
difference, I think PITR is a form of archive recovery.

That being said, after some thoughts on this, I changed my mind that
we don't need to say what operation was being performed at the
end-of-WAL.  So in the attached the end-of-WAL message is not
accompanied by the kind of recovery.

> LOG:  reached end of WAL at 0/300 on timeline 1

I removed the archive-source part along with the operation mode.
Because it make the message untranslatable.  It is now very simple but
seems enough.

While working on this, I noticed that we need to set EndOfWAL when
WaitForWALToBecomeAvailable returned with failure.  That means the
file does not exist at all so it is a kind of end-of-WAL.  In that
sense the following existing comment in ReadRecord is a bit wrong.

>* We only end up here without a message when XLogPageRead()
>* failed - in that case we already logged something. In
>* StandbyMode that only happens if we have been triggered, so we
>* shouldn't loop anymore in that case.

Actually there's a case we get there without a message and without
logged something when a segment file is not found unless we're in
standby mode.

> > Well. I guess that the "automatic recovery" is ambiguous.  Does it
> > make sense if the second line were like the follows instead?
> >
> > + 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not 
> > properly shut down; crash recovery in progress
> >
> 
> Well, according to me the current message looks fine.

Good to hear. (In the previos version I modified the message by accident..)

> $chkptfile is declared twice in the same scope. We can probably remove
> the first one.

Ugh.. Fixed.  (I wonder why Perl doesn't complain on this..)

In this version 12 I made the following changes.

- Rewrote (halfly reverted) a comment in ReadRecord

- Simplified the "reached end of WAL" message by removing recovery
  mode and WAL source in ReadRecord.

- XLogPageRead sets EndOfWAL flag in the ENOENT case.

- Removed redundant declaration of the same variable in TAP script.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From e553164dbca709389d92b05cf8ae7a8b427e83a6 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v12] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c |  92 +-
 src/backend/access/transam/xlogreader.c   |  78 
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 108 +-
 6 files changed, 268 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 958220c495..bf1d40e7cb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;

 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,

Re: Make mesage at end-of-recovery less scary.

2022-02-09 Thread Ashutosh Sharma

On Wed, Feb 9, 2022 at 1:14 PM Kyotaro Horiguchi
 wrote:
>
> Hi, Ashutosh.
>
> At Tue, 8 Feb 2022 18:35:34 +0530, Ashutosh Sharma  
> wrote in
> > Here are some of my review comments on the v11 patch:
>
> Thank you for taking a look on this.
>
> > -   (errmsg_internal("reached end of WAL in
> > pg_wal, entering archive recovery")));
> > +   (errmsg_internal("reached end of WAL at %X/%X
> > on timeline %u in %s during crash recovery, entering archive
> > recovery",
> > +LSN_FORMAT_ARGS(ErrRecPtr),
> > +replayTLI,
> > +xlogSourceNames[currentSource])));
> >
> > Why crash recovery? Won't this message get printed even during PITR?
>
> It is in the if-block with the following condition.
>
> >* If archive recovery was requested, but we were still doing
> >* crash recovery, switch to archive recovery and retry using the
> >* offline archive. We have now replayed all the valid WAL in
> >* pg_wal, so we are presumably now consistent.
> ...
> >if (!InArchiveRecovery && ArchiveRecoveryRequested)
>
> This means archive-recovery is requested but not started yet. That is,
> we've just finished crash recovery.  The existing comment cited
> together is mentioning that.
>
> At the end of PITR (or archive recovery), the other code works.
>

This is quite understandable, the point here is that the message that
we are emitting says, we have just finished reading the wal files in
the pg_wal directory during crash recovery and are now entering
archive recovery when we are actually doing point-in-time recovery
which seems a bit misleading.

> > /*
> >  * If we haven't emit an error message, we have safely reached the
> >  * end-of-WAL.
> >  */
> > if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
> > {
> >   char   *fmt;
> >
> >   if (StandbyMode)
> >   fmt = gettext_noop("reached end of WAL at %X/%X on timeline 
> > %u in %s during standby mode");
> >   else if (InArchiveRecovery)
> >   fmt = gettext_noop("reached end of WAL at %X/%X on timeline 
> > %u in %s during archive recovery");
> >   else
> >   fmt = gettext_noop("reached end of WAL at %X/%X on timeline 
> > %u in %s during crash recovery");
>
> The last among the above messages is choosed when archive-recovery is
> not requested at all.
>
> > I just did a PITR and could see these messages in the logfile.
>
> Yeah, the log lines are describing that the server starting with crash
> recovery to run PITR.
>
> > 2022-02-08 18:00:44.367 IST [86185] LOG:  starting point-in-time
> > recovery to WAL location (LSN) "0/5227790"
> > 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not
> > properly shut down; automatic recovery in progress
>
> Well. I guess that the "automatic recovery" is ambiguous.  Does it
> make sense if the second line were like the follows instead?
>
> + 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not properly 
> shut down; crash recovery in progress
>

Well, according to me the current message looks fine.

> > Lastly, are we also planning to backport this patch?
>
> This is apparent a behavioral change, not a bug fix, which I think we
> regard as not appropriate for back-patching.
>
>
> As the result, I made the following chages in the version 11.
>
> 1. Changed the condition for the "end-of-WAL" message from
>emode_for_corrupt_record to the EndOfWAL flag.
>
> 2. Corrected the wording of end-of-wal to end-of-WAL.
>
> 3. In the "reached end of WAL" message, report the LSN of the
>   beginning of failed record instead of the beginning of the
>   last-succeeded record.
>
> 4. In the changed message in walreceiver.c, I swapped LSN and timeline
>   so that they are in the same order with other similar messages.
>

Thanks for sharing this information.

==

Here is one more comment:

One more comment:

+# identify REDO WAL file
+my $cmd = "pg_controldata -D " . $node->data_dir();
+my $chkptfile;
+$cmd = ['pg_controldata', '-D', $node->data_dir()];
+$stdout = '';
+$stderr = '';
+IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
+ok($stdout =~ /^Latest checkpoint's REDO WAL file:[ \t] *(.+)$/m,
+   "checkpoint file is identified");
+my $chkptfile = $1;

$chkptfile is declared twice in the same scope. We can probably remove
the first one.

--
With Regards,
Ashutosh Sharma.

Re: Make mesage at end-of-recovery less scary.

2022-02-08 Thread Kyotaro Horiguchi

Hi, Ashutosh.

At Tue, 8 Feb 2022 18:35:34 +0530, Ashutosh Sharma  
wrote in 
> Here are some of my review comments on the v11 patch:

Thank you for taking a look on this.

> -   (errmsg_internal("reached end of WAL in
> pg_wal, entering archive recovery")));
> +   (errmsg_internal("reached end of WAL at %X/%X
> on timeline %u in %s during crash recovery, entering archive
> recovery",
> +LSN_FORMAT_ARGS(ErrRecPtr),
> +replayTLI,
> +xlogSourceNames[currentSource])));
> 
> Why crash recovery? Won't this message get printed even during PITR?

It is in the if-block with the following condition.

>* If archive recovery was requested, but we were still doing
>* crash recovery, switch to archive recovery and retry using the
>* offline archive. We have now replayed all the valid WAL in
>* pg_wal, so we are presumably now consistent.
...
>if (!InArchiveRecovery && ArchiveRecoveryRequested)

This means archive-recovery is requested but not started yet. That is,
we've just finished crash recovery.  The existing comment cited
together is mentioning that.

At the end of PITR (or archive recovery), the other code works.

> /*
>  * If we haven't emit an error message, we have safely reached the
>  * end-of-WAL.
>  */
> if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
> {
>   char   *fmt;
> 
>   if (StandbyMode)
>   fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u 
> in %s during standby mode");
>   else if (InArchiveRecovery)
>   fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u 
> in %s during archive recovery");
>   else
>   fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u 
> in %s during crash recovery");

The last among the above messages is choosed when archive-recovery is
not requested at all.

> I just did a PITR and could see these messages in the logfile.

Yeah, the log lines are describing that the server starting with crash
recovery to run PITR.

> 2022-02-08 18:00:44.367 IST [86185] LOG:  starting point-in-time
> recovery to WAL location (LSN) "0/5227790"
> 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not
> properly shut down; automatic recovery in progress

Well. I guess that the "automatic recovery" is ambiguous.  Does it
make sense if the second line were like the follows instead?

+ 2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not properly 
shut down; crash recovery in progress

> 2022-02-08 18:00:44.369 IST [86185] LOG:  redo starts at 0/14DC8D8
> 2022-02-08 18:00:44.978 IST [86185] DEBUG1:  reached end of WAL at
> 0/3D0 on timeline 1 in pg_wal during crash recovery, entering
> archive recovery

(I don't include this change in this patch since it would be another
issue.)

> ==
> 
> +   /*
> +* If we haven't emit an error message, we have safely reached the
> +* end-of-WAL.
> +*/
> +   if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
> +   {
> +   char   *fmt;
> +
> +   if (StandbyMode)
> +   fmt = gettext_noop("reached end of WAL at %X/%X on
> timeline %u in %s during standby mode");
> +   else if (InArchiveRecovery)
> +   fmt = gettext_noop("reached end of WAL at %X/%X on
> timeline %u in %s during archive recovery");
> +   else
> +   fmt = gettext_noop("reached end of WAL at %X/%X on
> timeline %u in %s during crash recovery");
> +
> +   ereport(LOG,
> +   (errmsg(fmt, LSN_FORMAT_ARGS(ErrRecPtr), replayTLI,
> +   xlogSourceNames[currentSource]),
> +(errormsg ? errdetail_internal("%s", errormsg) : 
> 0)));
> +   }
> 
> Doesn't it make sense to add an assert statement inside this if-block
> that will check for xlogreader->EndOfWAL?

Good point.  On second thought, the condition there is flat wrong.
The message is "reached end of WAL" so the condition should be
EndOfWAL.  On the other hand we didn't make sure that the error
message for the stop is emitted anywhere.  Thus I don't particularly
want to be strict on that point.

I made the following change for this.

-   if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
+   if (xlogreader->EndOfWAL)



> ==
> 
> -* We only end up here without a message when XLogPageRead()
> -* failed - in that case we already logged something. In
> -* StandbyMode that only happens if we have been triggered, so we
> -* shouldn't loop anymore in that case.
> +* If we get here for other than end-of-wal, emit the error
> +* message right now. Otherwise the message if any is shown as a
>

Re: Make mesage at end-of-recovery less scary.

2022-02-08 Thread Ashutosh Sharma

Hi,

Here are some of my review comments on the v11 patch:

-   (errmsg_internal("reached end of WAL in
pg_wal, entering archive recovery")));
+   (errmsg_internal("reached end of WAL at %X/%X
on timeline %u in %s during crash recovery, entering archive
recovery",
+LSN_FORMAT_ARGS(ErrRecPtr),
+replayTLI,
+xlogSourceNames[currentSource])));

Why crash recovery? Won't this message get printed even during PITR?

I just did a PITR and could see these messages in the logfile.

2022-02-08 18:00:44.367 IST [86185] LOG:  starting point-in-time
recovery to WAL location (LSN) "0/5227790"
2022-02-08 18:00:44.368 IST [86185] LOG:  database system was not
properly shut down; automatic recovery in progress
2022-02-08 18:00:44.369 IST [86185] LOG:  redo starts at 0/14DC8D8
2022-02-08 18:00:44.978 IST [86185] DEBUG1:  reached end of WAL at
0/3D0 on timeline 1 in pg_wal during crash recovery, entering
archive recovery

==

+   /*
+* If we haven't emit an error message, we have safely reached the
+* end-of-WAL.
+*/
+   if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
+   {
+   char   *fmt;
+
+   if (StandbyMode)
+   fmt = gettext_noop("reached end of WAL at %X/%X on
timeline %u in %s during standby mode");
+   else if (InArchiveRecovery)
+   fmt = gettext_noop("reached end of WAL at %X/%X on
timeline %u in %s during archive recovery");
+   else
+   fmt = gettext_noop("reached end of WAL at %X/%X on
timeline %u in %s during crash recovery");
+
+   ereport(LOG,
+   (errmsg(fmt, LSN_FORMAT_ARGS(ErrRecPtr), replayTLI,
+   xlogSourceNames[currentSource]),
+(errormsg ? errdetail_internal("%s", errormsg) : 0)));
+   }

Doesn't it make sense to add an assert statement inside this if-block
that will check for xlogreader->EndOfWAL?

==

-* We only end up here without a message when XLogPageRead()
-* failed - in that case we already logged something. In
-* StandbyMode that only happens if we have been triggered, so we
-* shouldn't loop anymore in that case.
+* If we get here for other than end-of-wal, emit the error
+* message right now. Otherwise the message if any is shown as a
+* part of the end-of-WAL message below.
 */

For consistency, I think we can replace "end-of-wal" with
"end-of-WAL". Please note that everywhere else in the comments you
have used "end-of-WAL". So why not the same here?

==

ereport(LOG,
-   (errmsg("replication terminated by
primary server"),
-errdetail("End of WAL reached on
timeline %u at %X/%X.",
-  startpointTLI,
-
LSN_FORMAT_ARGS(LogstreamResult.Write;
+   (errmsg("replication terminated by
primary server on timeline %u at %X/%X.",
+   startpointTLI,
+
LSN_FORMAT_ARGS(LogstreamResult.Write;

Is this change really required? I don't see any issue with the
existing error message.

==

Lastly, are we also planning to backport this patch?

--
With Regards,
Ashutosh Sharma.

On Wed, Feb 2, 2022 at 11:05 AM Kyotaro Horiguchi
 wrote:
>
> At Tue, 1 Feb 2022 12:38:01 +0400, Pavel Borisov  
> wrote in
> > Maybe it can be written little bit shorter:
> > pe = (char *) record + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> > as
> > pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> > ?
>
> That difference would be a matter of taste, but I found it looks
> cleaner that definition and assignment is separated for both p and pe.
> Now it is like the following.
>
> >   char   *p;
> >   char   *pe;
> >
> >   /* scan from the beginning of the record to the end of block */
> >   p  = (char *) record;
> >   pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
>
>
> > The problem that pgindent sometimes reflow formatting of unrelated blocks
> > is indeed existing. But I think it's right to manually leave pgindent-ed
> > code only on what is related to the patch. The leftover is pgindent-ed in a
> > scheduled manner sometimes, so don't need to bother.
>
> Yeah, I meant that it is a bit annoying to unpginden-ting unrelated
> edits:p
>
> > I'd like to set v10 as RfC.
>
> Thanks!  The suggested change is done in the attached v11.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center

Re: Make mesage at end-of-recovery less scary.

2022-02-01 Thread Pavel Borisov

>
> Thanks!  The suggested change is done in the attached v11.
>

Thanks! v11 is a small refactoring of v10 that doesn't change behavior, so
it is RfC as well.

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Make mesage at end-of-recovery less scary.

2022-02-01 Thread Kyotaro Horiguchi

At Tue, 1 Feb 2022 12:38:01 +0400, Pavel Borisov  wrote 
in 
> Maybe it can be written little bit shorter:
> pe = (char *) record + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> as
> pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> ?

That difference would be a matter of taste, but I found it looks
cleaner that definition and assignment is separated for both p and pe.
Now it is like the following.

>   char   *p;
>   char   *pe;
>
>   /* scan from the beginning of the record to the end of block */
>   p  = (char *) record;
>   pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));


> The problem that pgindent sometimes reflow formatting of unrelated blocks
> is indeed existing. But I think it's right to manually leave pgindent-ed
> code only on what is related to the patch. The leftover is pgindent-ed in a
> scheduled manner sometimes, so don't need to bother.

Yeah, I meant that it is a bit annoying to unpginden-ting unrelated
edits:p

> I'd like to set v10 as RfC.

Thanks!  The suggested change is done in the attached v11.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 491416866920f8f9648dee9c0571022f71553879 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v11] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c |  91 +-
 src/backend/access/transam/xlogreader.c   |  78 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 110 +-
 6 files changed, 270 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dfe2a0bcce..378c13ccf7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * NULL ReadRecPtr means we could not read a record at the
+ * beginning. In that case EndRecPtr is storing the LSN of the
+ * record we tried to read.
+ */
+ErrRecPtr =
+	xlogreader->ReadRecPtr ?
+	xlogreader->ReadRecPtr : xlogreader->EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4504,13 +4517,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we get here for other than end-of-wal, emit the error
+			 * message right now. Otherwise the message if any is shown as a
+			 * part of the end-of-WAL message below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4541,11 +4553,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
-
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@ -4558,11 +4571,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			 * we'd have no idea how far we'd have to replay to reach
 			 * consistency.  So err on the safe side and give up.
 			 */
-			if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-!fetching_ckpt)
+			if (!InArchiveRecovery && ArchiveRecoveryRequested)
 			{
+/*
+ * We don't report this as LOG, since we don't stop recovery
+ * here
+ */
 ereport(DEBUG1,
-		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
+		(errmsg_internal("reached end of WAL at %X/%X on timeline %u in %s during

Re: Make mesage at end-of-recovery less scary.

2022-02-01 Thread Pavel Borisov

>
> This version 10 is changed in the following points.
>
> - Rewrited the comment in ValidXLogRecordHeader.
> - ValidXLogRecordHeader
>
Thanks!

Maybe it can be written little bit shorter:
pe = (char *) record + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
as
pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
?


The problem that pgindent sometimes reflow formatting of unrelated blocks
is indeed existing. But I think it's right to manually leave pgindent-ed
code only on what is related to the patch. The leftover is pgindent-ed in a
scheduled manner sometimes, so don't need to bother.

I'd like to set v10 as RfC.

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Make mesage at end-of-recovery less scary.

2022-01-31 Thread Kyotaro Horiguchi

Hi, Pavel.

At Mon, 31 Jan 2022 15:17:09 +0400, Pavel Borisov  
wrote in 
> I don't quite understand a meaning of a comment:
>  /* it is completely zeroed, call it a day  */

While rethinking about this comment, It came to my mind that
XLogReaderValidatePageHeader is doing whole-page check.  There is no
clear reason for not doing at least the same check here.
ValidXLogRecordHeader is changed to check all bytes in the rest of the
page, instead of just the record header.

While working on that, I noticed another end-of-WAL case, unexpected
pageaddr.  I think we can assume it safe when the pageaddr is smaller
than expected (or we have no choice than assuming
so). XLogReaderValidatePageHeader is changed that way. But I'm not
sure others regard it as a form of safe end-of-WAL.

> Please also run pgindent on your code.

Hmm. I'm not sure we need to do that at this stage. pgindent makes
changes on the whole file involving unrelated part from this patch.
Anyway I did that then removed irrelevant edits.

pgindent makes a seemingly not-great suggestion.

+   char   *pe =
+   (char *) record + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));

I'm not sure this is intended but I split the line into two lines to
define and assign.

> Otherwise the new patch seems ok.

Thanks!

This version 10 is changed in the following points.

- Rewrited the comment in ValidXLogRecordHeader.
- ValidXLogRecordHeader

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 9eacdd050a8041b358df11ca3e18c1071b693d20 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v10] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c |  91 +-
 src/backend/access/transam/xlogreader.c   |  77 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 110 +-
 6 files changed, 269 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dfe2a0bcce..378c13ccf7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * NULL ReadRecPtr means we could not read a record at the
+ * beginning. In that case EndRecPtr is storing the LSN of the
+ * record we tried to read.
+ */
+ErrRecPtr =
+	xlogreader->ReadRecPtr ?
+	xlogreader->ReadRecPtr : xlogreader->EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4504,13 +4517,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we get here for other than end-of-wal, emit the error
+			 * message right now. Otherwise the message if any is shown as a
+			 * part of the end-of-WAL message below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4541,11 +4553,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
-
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@ -4558,11 +4571,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			 * we'd have no idea how far we'd have to replay to reach
 			 * consistency.  So err on the safe side and give up.
 			 */
-			if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-

Re: Make mesage at end-of-recovery less scary.

2022-01-31 Thread Pavel Borisov

>
> > This v8 is changed in...
> >
> > - Added tests to 011_crash_recovery.pl
> >
> > - Fixed a bug that server emits "end-of-wal" messages even if it have
> >   emitted an error message for the same LSN.
> >
> > - Changed XLogReaderValidatePageHeader() so that it recognizes an
> >   empty page as end-of-WAL.
> >
> > - Made pg_waldump conscious of end-of-wal.
> >
> > While doing the last item, I noticed that pg_waldump shows the wrong
> > LSN as the error position.  Concretely it emits the LSN of the last
> > sound WAL record as the error position.  I will post a bug-fix patch
> > for the issue after confirmation.
>
> I noticed that I added a useless error message "garbage record
> header", but it is a kind of invalid record length.  So I removed the
> message. That change makes the logic for EOW in ValidXLogRecordHeader
> and XLogReaderValidatePageHeader share the same flow.
>

Hi,  Kyotaro!

I don't quite understand a meaning of a comment:
 /* it is completely zeroed, call it a day  */

Please also run pgindent on your code.

Otherwise the new patch seems ok.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Make mesage at end-of-recovery less scary.

2022-01-26 Thread Kyotaro Horiguchi

At Tue, 25 Jan 2022 17:34:56 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> This v8 is changed in...
> 
> - Added tests to 011_crash_recovery.pl
> 
> - Fixed a bug that server emits "end-of-wal" messages even if it have
>   emitted an error message for the same LSN.
> 
> - Changed XLogReaderValidatePageHeader() so that it recognizes an
>   empty page as end-of-WAL.
> 
> - Made pg_waldump conscious of end-of-wal.
> 
> While doing the last item, I noticed that pg_waldump shows the wrong
> LSN as the error position.  Concretely it emits the LSN of the last
> sound WAL record as the error position.  I will post a bug-fix patch
> for the issue after confirmation.

I noticed that I added a useless error message "garbage record
header", but it is a kind of invalid record length.  So I removed the
message. That change makes the logic for EOW in ValidXLogRecordHeader
and XLogReaderValidatePageHeader share the same flow.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 57cb251f7cacbb96066ead4543b9f12f5b3c7062 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v9] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c |  91 +-
 src/backend/access/transam/xlogreader.c   |  61 
 src/backend/replication/walreceiver.c |   3 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 110 +-
 6 files changed, 251 insertions(+), 28 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dfe2a0bcce..5727e0939f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * NULL ReadRecPtr means we could not read a record at the
+ * beginning. In that case EndRecPtr is storing the LSN of the
+ * record we tried to read.
+ */
+ErrRecPtr =
+	xlogreader->ReadRecPtr ?
+	xlogreader->ReadRecPtr : xlogreader->EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4504,13 +4517,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we get here for other than end-of-wal, emit the error message
+			 * right now. Otherwise the message if any is shown as a part of
+			 * the end-of-WAL message below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4541,11 +4553,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
-
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@ -4558,11 +4571,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			 * we'd have no idea how far we'd have to replay to reach
 			 * consistency.  So err on the safe side and give up.
 			 */
-			if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-!fetching_ckpt)
+			if (!InArchiveRecovery && ArchiveRecoveryRequested)
 			{
+/*
+ * We don't report this as LOG, since we don't stop recovery
+ * here
+ */
 ereport(DEBUG1,
-		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
+		(errmsg_internal("reached end of WAL at %X/%X on timeline %u in %s during crash recovery, entering archive recovery",
+		 LSN_FORMAT_ARGS(ErrRecPtr),
+		 replayTLI,
+		 xlogSourceNames[currentSource])));

Re: Make mesage at end-of-recovery less scary.

2022-01-25 Thread Kyotaro Horiguchi

At Mon, 24 Jan 2022 14:23:33 +0400, Pavel Borisov  
wrote in 
> >
> > d2ddfa681db27a138acb63c8defa8cc6fa588922 removed global variables
> > ReadRecPtr and EndRecPtr. This is rebased version that reads the LSNs
> > directly from xlogreader instead of the removed global variables.
> >
> 
> Hi, hackers!
> 
> I've checked the latest version of a patch. It applies cleanly, check-world
> passes and CI is also in the green state.
> Proposed messages seem good to me, but probably it would be better to have
> a test on conditions where "reached end of WAL..." emitted.
> Then, I believe it can be set as 'ready for committter'.

Thanks for checking that, and the comment!

I thought that we usually don't test log messages, but finally I found
that I needed that.  It is because I found another mode of end-of-wal
and a bug that emits a spurious message on passing...

This v8 is changed in...

- Added tests to 011_crash_recovery.pl

- Fixed a bug that server emits "end-of-wal" messages even if it have
  emitted an error message for the same LSN.

- Changed XLogReaderValidatePageHeader() so that it recognizes an
  empty page as end-of-WAL.

- Made pg_waldump conscious of end-of-wal.

While doing the last item, I noticed that pg_waldump shows the wrong
LSN as the error position.  Concretely it emits the LSN of the last
sound WAL record as the error position.  I will post a bug-fix patch
for the issue after confirmation.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 0f1024bdfba9d1926465351fa1b7125698a21e8d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v8] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c |  91 +-
 src/backend/access/transam/xlogreader.c   |  64 +
 src/backend/replication/walreceiver.c |   3 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 110 +-
 6 files changed, 254 insertions(+), 28 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 58922f7ede..c08b9554b3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4480,6 +4480,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4495,6 +4496,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * NULL ReadRecPtr means we could not read a record at the
+ * beginning. In that case EndRecPtr is storing the LSN of the
+ * record we tried to read.
+ */
+ErrRecPtr =
+	xlogreader->ReadRecPtr ?
+	xlogreader->ReadRecPtr : xlogreader->EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4504,13 +4517,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we get here for other than end-of-wal, emit the error message
+			 * right now. Otherwise the message if any is shown as a part of
+			 * the end-of-WAL message below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4541,11 +4553,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
-
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@ -4558,11 +4571,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			 * we'd have no idea how far we'd have to replay to reach
 			 * consistency.  So err on the safe side and give up.
 			 */
-			if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-

Re: Make mesage at end-of-recovery less scary.

2022-01-24 Thread Pavel Borisov

>
> d2ddfa681db27a138acb63c8defa8cc6fa588922 removed global variables
> ReadRecPtr and EndRecPtr. This is rebased version that reads the LSNs
> directly from xlogreader instead of the removed global variables.
>

Hi, hackers!

I've checked the latest version of a patch. It applies cleanly, check-world
passes and CI is also in the green state.
Proposed messages seem good to me, but probably it would be better to have
a test on conditions where "reached end of WAL..." emitted.
Then, I believe it can be set as 'ready for committter'.

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Make mesage at end-of-recovery less scary.

2021-12-07 Thread Kyotaro Horiguchi

At Tue, 09 Nov 2021 16:27:51 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> This is the updated version.
> 
> - emode_for_currupt_record() now uses currentSource instead of
>   readSource.
> 
> - If zero record length is faced, make sure the whole header is zeroed
>   before deciding it as the end-of-WAL.
> 
> - Do not overwrite existig message when missing contrecord is
>   detected.  The message added here is seen in the TAP test log
>   026_overwrite_contrecord_standby.log

d2ddfa681db27a138acb63c8defa8cc6fa588922 removed global variables
ReadRecPtr and EndRecPtr. This is rebased version that reads the LSNs
directly from xlogreader instead of the removed global variables.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From cc521692a9f98fabde07e248b63f1222f8406de1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v7] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead, raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c   | 89 +++--
 src/backend/access/transam/xlogreader.c | 42 
 src/backend/replication/walreceiver.c   |  3 +-
 src/include/access/xlogreader.h |  1 +
 4 files changed, 112 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d894af310a..fa435faec4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4469,6 +4469,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		if (record == NULL)
@@ -4484,6 +4485,18 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * NULL ReadRecPtr means we could not read a record at
+ * beginning. In that case EndRecPtr is storing the LSN of the
+ * record we tried to read.
+ */
+ErrRecPtr =
+	xlogreader->ReadRecPtr ?
+	xlogreader->ReadRecPtr : xlogreader->EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4493,12 +4506,11 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we get here for other than end-of-wal, emit the error message
+			 * right now. Otherwise the message if any is shown as a part of
+			 * the end-of-WAL message below.
 			 */
-			if (errormsg)
+			if (!xlogreader->EndOfWAL && errormsg)
 ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
@@ -4530,11 +4542,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
-
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@ -4547,11 +4560,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			 * we'd have no idea how far we'd have to replay to reach
 			 * consistency.  So err on the safe side and give up.
 			 */
-			if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-!fetching_ckpt)
+			if (!InArchiveRecovery && ArchiveRecoveryRequested)
 			{
+/*
+ * We don't report this as LOG, since we don't stop recovery
+ * here
+ */
 ereport(DEBUG1,
-		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
+		(errmsg_internal("reached end of WAL at %X/%X on timeline %u in %s during crash recovery, entering archive recovery",
+		 LSN_FORMAT_ARGS(ErrRecPtr),
+		 replayTLI,
+		 xlogSourceNames[currentSource])));
 InArchiveRecovery = true;
 if (StandbyModeRequested)
 	StandbyMode = true;
@@ -4599,12 +4618,33 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 continue;
 			}
 
-			/* In standby mode, loop back to retry. Otherwise, give up. */
-			if (StandbyMode && !CheckForStandbyTrigger())
-continue;
-			else
-return NULL;
+			/*
+			 *  If we haven't emit an error message, we have safely reached the
+			 *

Re: Make mesage at end-of-recovery less scary.

2021-11-08 Thread Kyotaro Horiguchi

Thank you for the comments!

At Tue, 9 Nov 2021 09:53:15 +0900, Michael Paquier  wrote 
in 
> On Mon, Nov 08, 2021 at 02:59:46PM +0900, Kyotaro Horiguchi wrote:
> > While checking the behavior for the case of missing-contrecord, I
> > noticed that emode_for_corrupt_record() doesn't work as expected since
> > readSource is reset to XLOG_FROM_ANY after a read failure. We could
> > remember the last failed source but pg_wal should have been visited if
> > page read error happened so I changed the function so that it treats
> > XLOG_FROM_ANY the same way with XLOG_FROM_PG_WAL.
> 
> FWIW, I am not much a fan of assuming that it is fine to use
> XLOG_FROM_ANY as a condition here.  The comments on top of
> emode_for_corrupt_record() make it rather clear what the expectations
> are, and this is the default readSource.

The readSource is expected by the function to be the failed source but
it goes back to XLOG_FROM_ANY on page read failure.  So the function
*is* standing on the wrong assumption.  I noticed that currentSource
holds the last accessed source (but forgot about that). So it is
exactly what we need here.  No longer need to introduce the unclear
assumption by using it.

> > (Otherwise we see "LOG: reached end-of-WAL at .." message after
> >  "WARNING: missing contrecord at.." message.)
> 
> +  /* broken record found */
> +  ereport(WARNING,
> +  (errmsg("redo is skipped"),
> +   errhint("This suggests WAL file corruption. You might 
> need to check the database.")));
> This looks rather scary to me, FWIW, and this could easily be reached

Yes, the message is intentionally scary, since we don't come here in
the case of clean WAL:)

> if one forgets about EndOfWAL while hacking on xlogreader.c.
> Unlikely so, still.

I don't understand.  Isn't it the case of almost every feature?

The patch compells hackers to maintain the condition for recovery
being considered cleanly ended.  If the last record doesn't meet the
condition, the WAL file should be considered having a
problem. However, I don't see the condition expanded to have another
term in future.

Even if someone including myself broke that condition, we will at
worst unwantedly see a scary message.  And I believe almost all
hackers can easily find it a bug from the DETAILED message shown along
aside.  I'm not sure such bugs could be found in development phase,
though..

> +   report_invalid_record(state,
> + "missing contrecord at %X/%X",
> + LSN_FORMAT_ARGS(RecPtr));
> Isn't there a risk here to break applications checking after error
> messages stored in the WAL reader after seeing a contrecord?

I'm not sure you are mentioning the case where no message is stored
previously, or the case where already a message is stored. The former
is fine as the record is actually broken. But I was missing the latter
case.  In this version I avoided to overwite the error message.

> +   if (record->xl_tot_len == 0)
> +   {
> +   /* This is strictly not an invalid state, so phrase it as so. */
> +   report_invalid_record(state,
> + "record length is 0 at %X/%X",
> + LSN_FORMAT_ARGS(RecPtr));
> +   state->EndOfWAL = true;
> +   return false;
> +   }
> This assumes that a value of 0 for xl_tot_len is a synonym of the end
> of WAL, but cannot we have also a corrupted record in this case in the
> shape of xl_tot_len being 0?  We validate the full record after
> reading the header, so it seems to me that we should not assume that
> things are just ending as proposed in this patch.

Yeah, it's the most serious concern to me. So I didn't hide the
detailed message in the "end-of-wal reached message".

> LOG:  reached end of WAL at 0/512F198 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  record length is 0 at 0/512F210

I believe everyone regards zero record length as fine unless something
wrong is seen afterwards.  However, we can extend the check to the
whole record header. I think it is by far nearer to the perfect for
almost all cases. The attached emits the following message for the
good (true end-of-WAL) case.

> LOG:  reached end of WAL at 0/512F4A0 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  empty record header found at 0/512F518

If garbage bytes are found in the header area, the following log will
be left. I think we can have a better message here.

> WARNING:  garbage record header at 0/2095458
> LOG:  redo done at 0/2095430 system usage: CPU: user: 0.03 s, system: 0.01 s, 
> elapsed: 0.04 s

This is the updated version.

- emode_for_currupt_record() now uses currentSource instead of
  readSource.

- If zero record length is faced, make sure the whole header is zeroed
  before deciding it as the end-of-WAL.

- Do not overwrite existig message when missing contrecord is
  detected.  The message added here is seen in the TAP test log

Re: Make mesage at end-of-recovery less scary.

2021-11-08 Thread Michael Paquier

On Mon, Nov 08, 2021 at 02:59:46PM +0900, Kyotaro Horiguchi wrote:
> While checking the behavior for the case of missing-contrecord, I
> noticed that emode_for_corrupt_record() doesn't work as expected since
> readSource is reset to XLOG_FROM_ANY after a read failure. We could
> remember the last failed source but pg_wal should have been visited if
> page read error happened so I changed the function so that it treats
> XLOG_FROM_ANY the same way with XLOG_FROM_PG_WAL.

FWIW, I am not much a fan of assuming that it is fine to use
XLOG_FROM_ANY as a condition here.  The comments on top of
emode_for_corrupt_record() make it rather clear what the expectations
are, and this is the default readSource.

> (Otherwise we see "LOG: reached end-of-WAL at .." message after
>  "WARNING: missing contrecord at.." message.)

+  /* broken record found */
+  ereport(WARNING,
+  (errmsg("redo is skipped"),
+   errhint("This suggests WAL file corruption. You might 
need to check the database.")));
This looks rather scary to me, FWIW, and this could easily be reached
if one forgets about EndOfWAL while hacking on xlogreader.c.
Unlikely so, still.

+   report_invalid_record(state,
+ "missing contrecord at %X/%X",
+ LSN_FORMAT_ARGS(RecPtr));
Isn't there a risk here to break applications checking after error
messages stored in the WAL reader after seeing a contrecord?

+   if (record->xl_tot_len == 0)
+   {
+   /* This is strictly not an invalid state, so phrase it as so. */
+   report_invalid_record(state,
+ "record length is 0 at %X/%X",
+ LSN_FORMAT_ARGS(RecPtr));
+   state->EndOfWAL = true;
+   return false;
+   }
This assumes that a value of 0 for xl_tot_len is a synonym of the end
of WAL, but cannot we have also a corrupted record in this case in the
shape of xl_tot_len being 0?  We validate the full record after
reading the header, so it seems to me that we should not assume that
things are just ending as proposed in this patch.
--
Michael


signature.asc
Description: PGP signature

Re: Make mesage at end-of-recovery less scary.

2021-11-07 Thread Kyotaro Horiguchi

At Fri, 22 Oct 2021 17:54:40 +, "Bossart, Nathan"  
wrote in 
> On 3/4/21, 10:50 PM, "Kyotaro Horiguchi"  wrote:
> > As the result, the following messages are emitted with the attached.
> 
> I'd like to voice my support for this effort, and I intend to help
> review the patch.  It looks like the latest patch no longer applies,
> so I've marked the commitfest entry [0] as waiting-on-author.
> 
> Nathan
> 
> [0] https://commitfest.postgresql.org/35/2490/

Sorry for being late to reply.  I rebased this to the current master.

- rebased

- use LSN_FORMAT_ARGS instead of bare shift and mask.

- v4 immediately exited walreceiver on disconnection. Maybe I wanted
  not to see a FATAL message on standby after primary dies. However
  that would be another issue and that change was plain wrong..  v5
  just removes the "end-of-WAL" part from the message, which duplicate
  to what startup emits.

- add a new error message "missing contrecord at %X/%X".  Maybe this
  should be regarded as a leftover of the contrecord patch. In the
  attached patch the "%X/%X" is the LSN of the current record. The log
  messages look like this (026_overwrite_contrecord).

LOG:  redo starts at 0/1486CB8
WARNING:  missing contrecord at 0/1FFC2E0
LOG:  consistent recovery state reached at 0/1FFC2E0
LOG:  started streaming WAL from primary at 0/200 on timeline 1
LOG:  successfully skipped missing contrecord at 0/1FFC2E0, overwritten at 
2021-11-08 14:50:11.969952+09
CONTEXT:  WAL redo at 0/228 for XLOG/OVERWRITE_CONTRECORD: lsn 0/1FFC2E0; 
time 2021-11-08 14:50:11.969952+09

While checking the behavior for the case of missing-contrecord, I
noticed that emode_for_corrupt_record() doesn't work as expected since
readSource is reset to XLOG_FROM_ANY after a read failure. We could
remember the last failed source but pg_wal should have been visited if
page read error happened so I changed the function so that it treats
XLOG_FROM_ANY the same way with XLOG_FROM_PG_WAL.

(Otherwise we see "LOG: reached end-of-WAL at .." message after
 "WARNING: missing contrecord at.." message.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 276f59c8b37a31cb831b7753d2b107eb1d83c1fb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v5] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is
happening. Actually if recovery meets a record with length = 0, that
usually means it finished applying all available WAL records.

Make this message less scary as "reached end of WAL". Instead raise
the error level for other kind of WAL failure to WARNING.
---
 src/backend/access/transam/xlog.c   | 94 +++--
 src/backend/access/transam/xlogreader.c | 14 
 src/backend/replication/walreceiver.c   |  3 +-
 src/include/access/xlogreader.h |  1 +
 4 files changed, 87 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5cda30836f..623fb01d0a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4477,6 +4477,7 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		ReadRecPtr = xlogreader->ReadRecPtr;
@@ -4494,6 +4495,16 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			{
 abortedRecPtr = xlogreader->abortedRecPtr;
 missingContrecPtr = xlogreader->missingContrecPtr;
+ErrRecPtr = abortedRecPtr;
+			}
+			else
+			{
+/*
+ * NULL ReadRecPtr means we could not read a record at
+ * beginning. In that case EndRecPtr is storing the LSN of the
+ * record we tried to read.
+ */
+ErrRecPtr = ReadRecPtr ? ReadRecPtr : EndRecPtr;
 			}
 
 			if (readFile >= 0)
@@ -4503,13 +4514,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we get here for other than end-of-wal, emit the error message
+			 * right now. Otherwise the message if any is shown as a part of
+			 * the end-of-WAL message below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4540,11 +4550,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from

Re: Make mesage at end-of-recovery less scary.

2021-10-22 Thread Bossart, Nathan

On 3/4/21, 10:50 PM, "Kyotaro Horiguchi"  wrote:
> As the result, the following messages are emitted with the attached.

I'd like to voice my support for this effort, and I intend to help
review the patch.  It looks like the latest patch no longer applies,
so I've marked the commitfest entry [0] as waiting-on-author.

Nathan

[0] https://commitfest.postgresql.org/35/2490/

Re: Make mesage at end-of-recovery less scary.

2021-03-03 Thread Kyotaro Horiguchi

At Wed, 3 Mar 2021 11:14:20 -0500, David Steele  wrote in 
> Hi Kyotaro,
> 
> On 3/27/20 10:25 PM, James Coleman wrote:
> > On Thu, Mar 26, 2020 at 12:41 PM Robert Haas 
> > wrote:
> >>
> >> I'm just spitballing here, but it would be really good if there's a
> >> way to know definitely whether or not you should be scared. Corrupted
> >> WAL segments are definitely a thing that happens, but retries are a
> >> lot more common.
> > First, I agree that getting enough context to say precisely is by far
> > the ideal.
> > That being said, as an end user who's found this surprising -- and
> > momentarily scary every time I initially scan it even though I *know
> > intellectually it's not* -- I would find Peter's suggestion a
> > significant improvement over what we have now. I'm fairly certainly my
> > co-workers on our database team would also. Knowing that something is
> > at least not always scary is good. Though I'll grant that this does
> > have the negative in reverse: if it actually is a scary
> > situation...this mutes your concern level. On the other hand,
> > monitoring would tell us if there's a real problem (namely replication
> > lag), so I think the trade-off is clearly worth it.
> > How about this minor tweak:
> > HINT:  This is expected if this is the end of currently available WAL.
> > Otherwise, it could indicate corruption.
> 
> Any thoughts on the suggestions for making the messaging clearer?
> 
> Also, the patch no longer applies:
> http://cfbot.cputube.org/patch_32_2490.log.

Sorry for missing the last discussions.  I agree to the point about
really-scary situation.

ValidXLogRecordHeader deliberately marks End-Of-WAL only in the case
of zero-length record so that the callers can identify that case,
instead of inferring the EOW state without it. All other invalid data
is treated as potentially danger situation.  I think this is a
reasonable classification. And the error level for the "danger" cases
is changed to WARNING (from LOG).


As the result, the following messages are emitted with the attached.

- found zero-length record during recovery (the DETAIL might not be needed.)

> LOG:  redo starts at 0/14000118
> LOG:  reached end of WAL at 0/14C5D070 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  record length is 0 at 0/14C5D070
> LOG:  redo done at 0/14C5CF48 system usage: ...

- found another kind of invalid data

> LOG:  redo starts at 0/15A0
> WARNING:  invalid record length at 0/1500CA60: wanted 24, got 54
> LOG:  redo done at 0/1500CA28 system usage: ...


On the way checking the patch, I found that it emits the following log
lines in the case the redo loop meets an invalid record at the
starting:

> LOG:  invalid record length at 0/1118: wanted 24, got 42
> LOG:  redo is not required

which doesn't look proper.  That case is identifiable using the
End-of_WAL flag this patch adds. Thus we get the following error
messages.


- found end-of-wal at the beginning of recovery

> LOG:  reached end of WAL at 0/13A0 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  record length is 0 at 0/13A0
> LOG:  redo is not required

- found invalid data

> WARNING:  invalid record length at 0/12A0: wanted 24, got 42
> WARNING:  redo is skipped
> HINT:  This suggests WAL file corruption. You might need to check the 
> database.

The logic of ErrRecPtr in ReadRecord may wrong.  I remember having
such an discussion before...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From f89b5c965d0a49de3c1297bb5edd2dc061951b71 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v4] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is happening.
Make this message less scary as "reached end of WAL".
---
 src/backend/access/transam/xlog.c   | 81 ++---
 src/backend/access/transam/xlogreader.c | 11 
 src/backend/replication/walreceiver.c   | 13 ++--
 src/include/access/xlogreader.h |  1 +
 4 files changed, 79 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 377afb8732..fbcb8d78b8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4361,12 +4361,15 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		ReadRecPtr = xlogreader->ReadRecPtr;
 		EndRecPtr = xlogreader->EndRecPtr;
 		if (record == NULL)
 		{
+			ErrRecPtr = ReadRecPtr ? ReadRecPtr : EndRecPtr;
+
 			if (readFile >= 0)
 			{
 close(readFile);
@@ -4374,13 +4377,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 *

Re: Make mesage at end-of-recovery less scary.

2021-03-03 Thread David Steele


Hi Kyotaro,

On 3/27/20 10:25 PM, James Coleman wrote:

On Thu, Mar 26, 2020 at 12:41 PM Robert Haas  wrote:


I'm just spitballing here, but it would be really good if there's a
way to know definitely whether or not you should be scared. Corrupted
WAL segments are definitely a thing that happens, but retries are a
lot more common.


First, I agree that getting enough context to say precisely is by far the ideal.

That being said, as an end user who's found this surprising -- and
momentarily scary every time I initially scan it even though I *know
intellectually it's not* -- I would find Peter's suggestion a
significant improvement over what we have now. I'm fairly certainly my
co-workers on our database team would also. Knowing that something is
at least not always scary is good. Though I'll grant that this does
have the negative in reverse: if it actually is a scary
situation...this mutes your concern level. On the other hand,
monitoring would tell us if there's a real problem (namely replication
lag), so I think the trade-off is clearly worth it.

How about this minor tweak:
HINT:  This is expected if this is the end of currently available WAL.
Otherwise, it could indicate corruption.


Any thoughts on the suggestions for making the messaging clearer?

Also, the patch no longer applies: 
http://cfbot.cputube.org/patch_32_2490.log.


Marking this Waiting on Author.

Regards,
--
-David
da...@pgmasters.net

Re: Make mesage at end-of-recovery less scary.

2020-03-27 Thread James Coleman

On Thu, Mar 26, 2020 at 12:41 PM Robert Haas  wrote:
>
> On Wed, Mar 25, 2020 at 8:53 AM Peter Eisentraut
>  wrote:
> > HINT:  This is to be expected if this is the end of the WAL.  Otherwise,
> > it could indicate corruption.
>
> First, I agree that this general issue is a problem, because it's come
> up for me in quite a number of customer situations. Either people get
> scared when they shouldn't, because the message is innocuous, or they
> don't get scared about other things that actually are scary, because
> if some scary-looking messages are actually innocuous, it can lead
> people to believe that the same is true in other cases.
>
> Second, I don't really like the particular formulation you have above,
> because the user still doesn't know whether or not to be scared. Can
> we figure that out? I think if we're in crash recovery, I think that
> we should not be scared, because we have no alternative to assuming
> that we've reached the end of WAL, so all crash recoveries will end
> like this. If we're in archive recovery, we should definitely be
> scared if we haven't yet reached the minimum recovery point, because
> more WAL than that should certainly exist. After that, it depends on
> how we got the WAL. If it's being streamed, the question is whether
> we've reached the end of what got streamed. If it's being copied from
> the archive, we ought to have the whole segment, but maybe not more.
> Can we get the right context to the point where the error is being
> reported to know whether we hit the error at the end of the WAL that
> was streamed? If not, can we somehow rejigger things so that we only
> make it sound scary if we keep getting stuck at the same point when we
> woud've expected to make progress meanwhile?
>
> I'm just spitballing here, but it would be really good if there's a
> way to know definitely whether or not you should be scared. Corrupted
> WAL segments are definitely a thing that happens, but retries are a
> lot more common.

First, I agree that getting enough context to say precisely is by far the ideal.

That being said, as an end user who's found this surprising -- and
momentarily scary every time I initially scan it even though I *know
intellectually it's not* -- I would find Peter's suggestion a
significant improvement over what we have now. I'm fairly certainly my
co-workers on our database team would also. Knowing that something is
at least not always scary is good. Though I'll grant that this does
have the negative in reverse: if it actually is a scary
situation...this mutes your concern level. On the other hand,
monitoring would tell us if there's a real problem (namely replication
lag), so I think the trade-off is clearly worth it.

How about this minor tweak:
HINT:  This is expected if this is the end of currently available WAL.
Otherwise, it could indicate corruption.

Thanks,
James

Re: Make mesage at end-of-recovery less scary.

2020-03-26 Thread Robert Haas

On Wed, Mar 25, 2020 at 8:53 AM Peter Eisentraut
 wrote:
> HINT:  This is to be expected if this is the end of the WAL.  Otherwise,
> it could indicate corruption.

First, I agree that this general issue is a problem, because it's come
up for me in quite a number of customer situations. Either people get
scared when they shouldn't, because the message is innocuous, or they
don't get scared about other things that actually are scary, because
if some scary-looking messages are actually innocuous, it can lead
people to believe that the same is true in other cases.

Second, I don't really like the particular formulation you have above,
because the user still doesn't know whether or not to be scared. Can
we figure that out? I think if we're in crash recovery, I think that
we should not be scared, because we have no alternative to assuming
that we've reached the end of WAL, so all crash recoveries will end
like this. If we're in archive recovery, we should definitely be
scared if we haven't yet reached the minimum recovery point, because
more WAL than that should certainly exist. After that, it depends on
how we got the WAL. If it's being streamed, the question is whether
we've reached the end of what got streamed. If it's being copied from
the archive, we ought to have the whole segment, but maybe not more.
Can we get the right context to the point where the error is being
reported to know whether we hit the error at the end of the WAL that
was streamed? If not, can we somehow rejigger things so that we only
make it sound scary if we keep getting stuck at the same point when we
woud've expected to make progress meanwhile?

I'm just spitballing here, but it would be really good if there's a
way to know definitely whether or not you should be scared. Corrupted
WAL segments are definitely a thing that happens, but retries are a
lot more common.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Make mesage at end-of-recovery less scary.

2020-03-25 Thread Peter Eisentraut


On 2020-03-24 02:52, Kyotaro Horiguchi wrote:

I don't find that very satisfying, but I can't come up with something
that provides the current information, while being less scary than my
suggestion?

The 0-length record is not an "invalid" state during recovery, so we
can add the message for the state as "record length is 0 at %X/%X". I
think if other states found there, it implies something wrong.

LSN is redundantly shown but I'm not sure if it is better to remove it
from either of the two lines.

| LOG:  reached end of WAL at 0/3000850 on timeline 1 in pg_wal during crash 
recovery
| DETAIL:  record length is 0 at 0/3000850


I'm not up to date on all these details, but my high-level idea would be 
some kind of hint associated with the existing error messages, like:


HINT:  This is to be expected if this is the end of the WAL.  Otherwise, 
it could indicate corruption.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Make mesage at end-of-recovery less scary.

2020-03-23 Thread Kyotaro Horiguchi

At Mon, 23 Mar 2020 12:47:36 -0700, Andres Freund  wrote in 
> Hi,
> 
> On 2020-03-23 10:37:16 +0100, Peter Eisentraut wrote:
> > On 2020-03-05 08:06, Kyotaro Horiguchi wrote:
> > > | [20866] LOG:  replication terminated by primary server
> > > | [20866] DETAIL:  End of WAL reached on timeline 1 at 0/30001C8.
> > > | [20866] FATAL:  could not send end-of-streaming message to primary: no 
> > > COPY in progress
> 
> IMO it's a bug that we see this FATAL. I seem to recall that we didn't
> use to get that?

I thought that it is a convention that A auxiliary process uses ERROR
(which is turned into FATAL in ereport) to exit, which I didn't like
so much, but it was out of scope of this patch.

As for the message bove, the FATAL is preceded by the "LOG:
replication terminated by" message, that means walreceiver tries to
send new data after disconnection just to fail, which is
unreasonable. I think we should exit immediately after detecting
disconnection. The FATAL is gone by the attached.

> > > | [24267]  LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive 
> > > during standby mode
> > > | [24267]  DETAIL:  invalid record length at 0/3000240: wanted 24, got 0
> > 
> > Is this the before and after?  That doesn't seem like a substantial
> > improvement to me.  You still get the "scary" message at the end.
> 
> It seems like a minor improvement - folding the DETAIL into the message
> makes sense to me here. But it indeed doesn't really address the main
> issue.
> 
> I think we don't want to elide the information about how the end of the
> WAL was detected - there are some issues where I found that quite
> helpful. But we could reformulate it to be clearer that it's informative
> output, not a bug.  E.g. something roughly like
> 
> LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive during standby 
> mode
> DETAIL: End detected due to invalid record length at 0/3000240: expected 24, 
> got 0
> (I first elided the position in the DETAIL, but it could differ from the
> one in LOG)
> 
> I don't find that very satisfying, but I can't come up with something
> that provides the current information, while being less scary than my
> suggestion?

The 0-length record is not an "invalid" state during recovery, so we
can add the message for the state as "record length is 0 at %X/%X". I
think if other states found there, it implies something wrong.

LSN is redundantly shown but I'm not sure if it is better to remove it
from either of the two lines.

| LOG:  reached end of WAL at 0/3000850 on timeline 1 in pg_wal during crash 
recovery
| DETAIL:  record length is 0 at 0/3000850

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 47511afed5f8acf92abaf1cd6fcfecc1faea9c87 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v3] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is happening.
Make this message less scary as "reached end of WAL".
---
 src/backend/access/transam/xlog.c   | 69 ++---
 src/backend/access/transam/xlogreader.c |  9 
 src/backend/replication/walreceiver.c   | 11 ++--
 3 files changed, 67 insertions(+), 22 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 793c076da6..6c2924dfb7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4283,12 +4283,15 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		ReadRecPtr = xlogreader->ReadRecPtr;
 		EndRecPtr = xlogreader->EndRecPtr;
 		if (record == NULL)
 		{
+			ErrRecPtr = ReadRecPtr ? ReadRecPtr : EndRecPtr;
+
 			if (readFile >= 0)
 			{
 close(readFile);
@@ -4296,13 +4299,13 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we are fetching checkpoint, we emit the error message right
+			 * now. Otherwise the error is regarded as "end of WAL" and the
+			 * message if any is shown as a part of the end-of-WAL message
+			 * below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, EndRecPtr),
+			if (fetching_ckpt && errormsg)
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4333,11 +4336,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
-		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
 
+		/* No valid record available from this source */

Re: Make mesage at end-of-recovery less scary.

2020-03-23 Thread Andres Freund

Hi,

On 2020-03-23 10:43:09 -0700, Ashwin Agrawal wrote:
> Plus, I am hoping message will improve for pg_waldump as well?
> Since it reads confusing and every-time have to explain new developer it's
> expected behavior which is annoying.
> 
> pg_waldump: fatal: error in WAL record at 0/1553F70: invalid record length
> at 0/1553FA8: wanted 24, got 0

What would you like to see here? There's inherently a lot less
information about the context in waldump. We can't know whether it's to
be expected that the WAL ends at that point, or whether there was
corruption.

Greetings,

Andres Freund

Re: Make mesage at end-of-recovery less scary.

2020-03-23 Thread Andres Freund

Hi,

On 2020-03-23 10:37:16 +0100, Peter Eisentraut wrote:
> On 2020-03-05 08:06, Kyotaro Horiguchi wrote:
> > | [20866] LOG:  replication terminated by primary server
> > | [20866] DETAIL:  End of WAL reached on timeline 1 at 0/30001C8.
> > | [20866] FATAL:  could not send end-of-streaming message to primary: no 
> > COPY in progress

IMO it's a bug that we see this FATAL. I seem to recall that we didn't
use to get that?

> > | [20851] LOG:  reached end of WAL at 0/30001C8 on timeline 1 in archive 
> > during standby mode
> > | [20851] DETAIL:  invalid record length at 0/30001C8: wanted 24, got 0
> > 
> > I changed the above to the below, which looks more adequate.
> > 
> > | [24271]  LOG:  replication terminated by primary server on timeline 1 at 
> > 0/3000240.
> > | [24271]  FATAL:  could not send end-of-streaming message to primary: no 
> > COPY in progress
> > | [24267]  LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive 
> > during standby mode
> > | [24267]  DETAIL:  invalid record length at 0/3000240: wanted 24, got 0
> 
> Is this the before and after?  That doesn't seem like a substantial
> improvement to me.  You still get the "scary" message at the end.

It seems like a minor improvement - folding the DETAIL into the message
makes sense to me here. But it indeed doesn't really address the main
issue.

I think we don't want to elide the information about how the end of the
WAL was detected - there are some issues where I found that quite
helpful. But we could reformulate it to be clearer that it's informative
output, not a bug.  E.g. something roughly like

LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive during standby 
mode
DETAIL: End detected due to invalid record length at 0/3000240: expected 24, 
got 0
(I first elided the position in the DETAIL, but it could differ from the
one in LOG)

I don't find that very satisfying, but I can't come up with something
that provides the current information, while being less scary than my
suggestion?

Greetings,

Andres Freund

Re: Make mesage at end-of-recovery less scary.

2020-03-23 Thread Ashwin Agrawal

On Mon, Mar 23, 2020 at 2:37 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2020-03-05 08:06, Kyotaro Horiguchi wrote:
> > | [20866] LOG:  replication terminated by primary server
> > | [20866] DETAIL:  End of WAL reached on timeline 1 at 0/30001C8.
> > | [20866] FATAL:  could not send end-of-streaming message to primary: no
> COPY in progress
> > | [20851] LOG:  reached end of WAL at 0/30001C8 on timeline 1 in archive
> during standby mode
> > | [20851] DETAIL:  invalid record length at 0/30001C8: wanted 24, got 0
> >
> > I changed the above to the below, which looks more adequate.
> >
> > | [24271]  LOG:  replication terminated by primary server on timeline 1
> at 0/3000240.
> > | [24271]  FATAL:  could not send end-of-streaming message to primary:
> no COPY in progress
> > | [24267]  LOG:  reached end of WAL at 0/3000240 on timeline 1 in
> archive during standby mode
> > | [24267]  DETAIL:  invalid record length at 0/3000240: wanted 24, got 0
>
> Is this the before and after?  That doesn't seem like a substantial
> improvement to me.  You still get the "scary" message at the end.
>

+1 I agree it still reads scary and doesn't seem improvement.

Plus, I am hoping message will improve for pg_waldump as well?
Since it reads confusing and every-time have to explain new developer it's
expected behavior which is annoying.

pg_waldump: fatal: error in WAL record at 0/1553F70: invalid record length
at 0/1553FA8: wanted 24, got 0

Re: Make mesage at end-of-recovery less scary.

2020-03-23 Thread Peter Eisentraut


On 2020-03-05 08:06, Kyotaro Horiguchi wrote:

| [20866] LOG:  replication terminated by primary server
| [20866] DETAIL:  End of WAL reached on timeline 1 at 0/30001C8.
| [20866] FATAL:  could not send end-of-streaming message to primary: no COPY 
in progress
| [20851] LOG:  reached end of WAL at 0/30001C8 on timeline 1 in archive during 
standby mode
| [20851] DETAIL:  invalid record length at 0/30001C8: wanted 24, got 0

I changed the above to the below, which looks more adequate.

| [24271]  LOG:  replication terminated by primary server on timeline 1 at 
0/3000240.
| [24271]  FATAL:  could not send end-of-streaming message to primary: no COPY 
in progress
| [24267]  LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive 
during standby mode
| [24267]  DETAIL:  invalid record length at 0/3000240: wanted 24, got 0


Is this the before and after?  That doesn't seem like a substantial 
improvement to me.  You still get the "scary" message at the end.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Make mesage at end-of-recovery less scary.

2020-03-04 Thread Kyotaro Horiguchi

Hello.

I changed the condition from randAccess to fetching_ckpt considering
the discussion in another thread [1]. Then I moved the block that
shows the new messages to more appropriate place.

At Fri, 28 Feb 2020 17:28:06 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> > 
> > Have you considered an error context here?  Your patch leads to a bit
> > of duplication with the message a bit down of what you are changing
> > where the end of local pg_wal is reached.
> 
> It is a DEBUG message and it is for the time moving from crash
> recovery to archive recovery. I could remove that but decided to leave
> it for tracability.

I modified the message so that it has the same look to the new
messages, but I left it being DEBUG1, since it is just a intermediate
state. We should finally see one of the new three messages.

After the messages changed, another message from wal sender came to
look redundant.

| [20866] LOG:  replication terminated by primary server
| [20866] DETAIL:  End of WAL reached on timeline 1 at 0/30001C8.
| [20866] FATAL:  could not send end-of-streaming message to primary: no COPY 
in progress
| [20851] LOG:  reached end of WAL at 0/30001C8 on timeline 1 in archive during 
standby mode
| [20851] DETAIL:  invalid record length at 0/30001C8: wanted 24, got 0

I changed the above to the below, which looks more adequate.

| [24271]  LOG:  replication terminated by primary server on timeline 1 at 
0/3000240.
| [24271]  FATAL:  could not send end-of-streaming message to primary: no COPY 
in progress
| [24267]  LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive 
during standby mode
| [24267]  DETAIL:  invalid record length at 0/3000240: wanted 24, got 0

> > > + * reached the end of WAL.  Otherwise something's really wrong and
> > > + * we report just only the errormsg if any. If we don't receive
> > 
> > This sentence sounds strange to me.  Or you meant "Something is wrong,
> > so use errormsg as report if it is set"?

The message no longer exists.

> > > + (errmsg ("rached end of WAL at %X/%X on timeline %u in %s during crash 
> > > recovery",
> > 
> > FWIW, you are introducing three times the same typo, in the same
> > word, in three different messages.
> 
> They're copy-pasto.  I refrained from constructing an error message
> from multiple nonindipendent parts.  Are you suggesting to do so?

The tree times repetition of almost same phrases is very unreadable. I
rewrote it in more simple shape.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 17ee82e5d44dd5a932ed69b8a1ea91a23d170952 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v2] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is happening.
Make this message less scary as "reached end of WAL".
---
 src/backend/access/transam/xlog.c | 72 ---
 src/backend/replication/walreceiver.c |  3 +-
 2 files changed, 55 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d19408b3be..849cf6fe6b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4282,12 +4282,15 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, );
 		ReadRecPtr = xlogreader->ReadRecPtr;
 		EndRecPtr = xlogreader->EndRecPtr;
 		if (record == NULL)
 		{
+			ErrRecPtr = ReadRecPtr ? ReadRecPtr : EndRecPtr;
+
 			if (readFile >= 0)
 			{
 close(readFile);
@@ -4295,14 +4298,16 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we are fetching checkpoint, we emit the error message right
+			 * now. Otherwise the error is regarded as "end of WAL" and the
+			 * message if any is shown as a part of the end-of-WAL message
+			 * below.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, EndRecPtr),
+			if (fetching_ckpt && errormsg)
+			{
+ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 		(errmsg_internal("%s", errormsg) /* already translated */ ));
+			}
 		}
 
 		/*
@@ -4332,11 +4337,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
-		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
 
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
+		{
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@

Re: Make mesage at end-of-recovery less scary.

2020-02-28 Thread Kyotaro Horiguchi

Thank you for the comments.

At Fri, 28 Feb 2020 16:33:18 +0900, Michael Paquier  wrote 
in 
> On Fri, Feb 28, 2020 at 04:01:00PM +0900, Kyotaro Horiguchi wrote:
> > Hello, this is a followup thread of [1].
> > 
> > # I didn't noticed that the thread didn't cover -hackers..
> > 
> > When recovery of any type ends, we see several kinds of error messages
> > that says "WAL is broken".
> 
> Have you considered an error context here?  Your patch leads to a bit
> of duplication with the message a bit down of what you are changing
> where the end of local pg_wal is reached.

It is a DEBUG message and it is for the time moving from crash
recovery to archive recovery. I could remove that but decided to leave
it for tracability.

> > +   * reached the end of WAL.  Otherwise something's really wrong and
> > +   * we report just only the errormsg if any. If we don't receive
> 
> This sentence sounds strange to me.  Or you meant "Something is wrong,
> so use errormsg as report if it is set"?

The whole comment there follows.
| recovery. If we get here during recovery, we can assume that we
| reached the end of WAL.  Otherwise something's really wrong and
| we report just only the errormsg if any. If we don't receive
| errormsg here, we already logged something.  We don't emit
| "reached end of WAL" in muted messages.

"Othhersise" means "other than the case of recovery".  "Just only the
errmsg" means "show the message not as a part the message "reached end
of WAL".

> > +* Note: errormsg is alreay translated.
> 
> Typo here.

Thanks. Will fix along with "rached".

> > +   if (StandbyMode)
> > +   ereport(actual_emode,
> > +   (errmsg ("rached end of WAL at %X/%X on timeline %u in 
> > %s during streaming replication",
> 
> StandbyMode happens also with only WAL archiving, depending on if
> primary_conninfo is set or not.

Right. I'll fix it. Maybe to "during standby mode".

> > +   (errmsg ("rached end of WAL at %X/%X on timeline %u in %s during crash 
> > recovery",
> 
> FWIW, you are introducing three times the same typo, in the same
> word, in three different messages.

They're copy-pasto.  I refrained from constructing an error message
from multiple nonindipendent parts.  Are you suggesting to do so?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Make mesage at end-of-recovery less scary.

2020-02-27 Thread Michael Paquier

On Fri, Feb 28, 2020 at 04:01:00PM +0900, Kyotaro Horiguchi wrote:
> Hello, this is a followup thread of [1].
> 
> # I didn't noticed that the thread didn't cover -hackers..
> 
> When recovery of any type ends, we see several kinds of error messages
> that says "WAL is broken".

Have you considered an error context here?  Your patch leads to a bit
of duplication with the message a bit down of what you are changing
where the end of local pg_wal is reached.

> + * reached the end of WAL.  Otherwise something's really wrong and
> + * we report just only the errormsg if any. If we don't receive

This sentence sounds strange to me.  Or you meant "Something is wrong,
so use errormsg as report if it is set"?

> +  * Note: errormsg is alreay translated.

Typo here.

> + if (StandbyMode)
> + ereport(actual_emode,
> + (errmsg ("rached end of WAL at %X/%X on timeline %u in 
> %s during streaming replication",

StandbyMode happens also with only WAL archiving, depending on if
primary_conninfo is set or not.

> + (errmsg ("rached end of WAL at %X/%X on timeline %u in %s during crash 
> recovery",

FWIW, you are introducing three times the same typo, in the same
word, in three different messages.
--
Michael

signature.asc
Description: PGP signature

Make mesage at end-of-recovery less scary.

2020-02-27 Thread Kyotaro Horiguchi

Hello, this is a followup thread of [1].

# I didn't noticed that the thread didn't cover -hackers..

When recovery of any type ends, we see several kinds of error messages
that says "WAL is broken".

> LOG:  invalid record length at 0/7CB6BC8: wanted 24, got 0
> LOG:  redo is not required
> LOG:  database system is ready to accept connections

This patch reduces the scariness of such messages as the follows.

> LOG:  rached end of WAL at 0/1551048 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  invalid record length at 0/1551048: wanted 24, got 0
> LOG:  redo is not required
> LOG:  database system is ready to accept connections

[1] 
https://www.postgresql.org/message-id/20200117.172655.1384889922565817808.horikyota.ntt%40gmail.com

I'll register this to the coming CF.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From f3692cb484b7f1ebc351ba8a522039c0b91bcfdb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is happening.
Make this message less scary as "reached end of WAL".
---
 src/backend/access/transam/xlog.c | 45 ++-
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d19408b3be..452c376f62 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4288,6 +4288,10 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 		EndRecPtr = xlogreader->EndRecPtr;
 		if (record == NULL)
 		{
+			int actual_emode =
+emode_for_corrupt_record(emode,
+		 ReadRecPtr ? ReadRecPtr : EndRecPtr);
+
 			if (readFile >= 0)
 			{
 close(readFile);
@@ -4295,14 +4299,41 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * randAccess here means we are reading successive records during
+			 * recovery. If we get here during recovery, we can assume that we
+			 * reached the end of WAL.  Otherwise something's really wrong and
+			 * we report just only the errormsg if any. If we don't receive
+			 * errormsg here, we already logged something.  We don't emit
+			 * "reached end of WAL" in muted messages.
+			 *
+			 * Note: errormsg is alreay translated.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode, EndRecPtr),
-		(errmsg_internal("%s", errormsg) /* already translated */ ));
+			if (!private->randAccess && actual_emode == emode)
+			{
+if (StandbyMode)
+	ereport(actual_emode,
+			(errmsg ("rached end of WAL at %X/%X on timeline %u in %s during streaming replication",
+	 (uint32) (EndRecPtr >> 32), (uint32) EndRecPtr,
+	 ThisTimeLineID,
+	 xlogSourceNames[currentSource]),
+			 (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+else if (InArchiveRecovery)
+	ereport(actual_emode,
+			(errmsg ("rached end of WAL at %X/%X on timeline %u in %s during archive recovery",
+	 (uint32) (EndRecPtr >> 32), (uint32) EndRecPtr,
+	 ThisTimeLineID,
+	 xlogSourceNames[currentSource]),
+			 (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+else
+	ereport(actual_emode,
+			(errmsg ("rached end of WAL at %X/%X on timeline %u in %s during crash recovery",
+	 (uint32) (EndRecPtr >> 32), (uint32) EndRecPtr,
+	 ThisTimeLineID,
+	 xlogSourceNames[currentSource]),
+			 (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+			}
+			else if (errormsg)
+ereport(actual_emode, (errmsg_internal("%s", errormsg)));
 		}
 
 		/*
-- 
2.18.2

69 matches

Mail list logo