Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-29 Thread Michael Paquier
On Wed, Oct 28, 2020 at 04:43:44PM +0900, Michael Paquier wrote:
> On Wed, Oct 28, 2020 at 04:11:56PM +0900, Michael Paquier wrote:
>> Thanks.  The patch for v13 cannot use a macro, but one of the versions
>> posted upthread would do just fine.

For the record, I have posted a set of patches addressing all the
issues on the original thread where the problem applies:
https://www.postgresql.org/message-id/20201030023028.GC1693%40paquier.xyz
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-28 Thread Michael Paquier
On Wed, Oct 28, 2020 at 04:11:56PM +0900, Michael Paquier wrote:
> Thanks.  The patch for v13 cannot use a macro, but one of the versions
> posted upthread would do just fine.  I have been wondering about using
> the new CheckBuffer() for the purpose of the retry to make it
> concurrent-safe, but looking at the code I think that we would run
> into problems when trying to open through smgr.c any relation file in
> global/, as these require an invalid backend ID, and a WAL sender does
> not satisfy that (see the assertion in GetRelationPath()).

Actually, scratch that.  It looks like I was wrong here.  By using
smgropen() with InvalidBackendId and a RelFileNode built from the
path of the file being sent, similarly to what pg_rewind does in
isRelDataFile(), we should have everything that's needed.  This is
too complicated for a backpatch, though, and we should have some
consolidation with pg_rewind, so using the sleep/retry approach for
v13 sounds like a safer path to take in the stable branch.
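A rough sketch of what that idea would look like (hypothetical code,
not what got committed; relnode_from_path stands for the OID parsed
out of the file path, the way isRelDataFile() does it):

	RelFileNode rnode;
	SMgrRelation smgr;

	/* a file under global/ belongs to the shared tablespace */
	rnode.spcNode = GLOBALTABLESPACE_OID;
	rnode.dbNode = InvalidOid;
	rnode.relNode = relnode_from_path;

	/* InvalidBackendId satisfies the assertion in GetRelationPath() */
	smgr = smgropen(rnode, InvalidBackendId);
	/* CheckBuffer(smgr, forknum, blkno) could then verify the block */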
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-28 Thread Michael Paquier
On Tue, Oct 27, 2020 at 10:56:23PM +0300, Anastasia Lubennikova wrote:
> In case you need a second opinion on the remaining patch, it still looks
> good to me.

Thanks.  The patch for v13 cannot use a macro, but one of the versions
posted upthread would do just fine.  I have been wondering about using
the new CheckBuffer() for the purpose of the retry to make it
concurrent-safe, but looking at the code I think that we would run
into problems when trying to open through smgr.c any relation file in
global/, as these require an invalid backend ID, and a WAL sender does
not satisfy that (see the assertion in GetRelationPath()).  I have
been hesitating about increasing the number of retries, though, to
reduce the chance of false positives.  20 perhaps?  With a sleep of
100ms, that would give a disk 2s to finish flushing a page that was
caught in the middle of a check, which sounds like plenty.
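In code terms that is just a bump of the constant used by the patch
(hypothetical value shown):

	/* maximum number of retries done for a single page */
	#define PAGE_RETRY_THRESHOLD		20	/* 20 * 100ms sleep = 2s per page */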
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-27 Thread Anastasia Lubennikova

On 26.10.2020 04:13, Michael Paquier wrote:

On Fri, Oct 23, 2020 at 08:00:08AM +0900, Michael Paquier wrote:

Yeah, we could try to make the logic a bit more complicated like
that.  However, for any code path relying on a page read without any
locking guarantee, we cannot really put a lot of trust in any of the
fields assigned to the page, as these could just be random corruption
garbage.  The only thing I am ready to trust here is a checksum
mismatch check, because the checksum is the only field linked to the
full contents of the 8k page.  This also keeps the code simpler.

A small update here.  I have extracted the refactored part for
PageIsVerified() and committed it as that's independently useful.
This makes the patch proposed here simpler on HEAD, leading to the
attached.
--
Michael


Thank you for committing the first part.

In case you need a second opinion on the remaining patch, it still looks 
good to me.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-25 Thread Michael Paquier
On Fri, Oct 23, 2020 at 08:00:08AM +0900, Michael Paquier wrote:
> Yeah, we could try to make the logic a bit more complicated like
> that.  However, for any code path relying on a page read without any
> locking guarantee, we cannot really put a lot of trust in any of the
> fields assigned to the page, as these could just be random corruption
> garbage.  The only thing I am ready to trust here is a checksum
> mismatch check, because the checksum is the only field linked to the
> full contents of the 8k page.  This also keeps the code simpler.

A small update here.  I have extracted the refactored part for
PageIsVerified() and committed it as that's independently useful.
This makes the patch proposed here simpler on HEAD, leading to the
attached.
--
Michael
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..37b60437a6 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1569,21 +1569,24 @@ sendFile(const char *readfilename, const char *tarfilename,
 {
 	int			fd;
 	BlockNumber blkno = 0;
-	bool		block_retry = false;
+	int			block_attempts = 0;
 	char		buf[TAR_SEND_SIZE];
-	uint16		checksum;
 	int			checksum_failures = 0;
 	off_t		cnt;
 	int			i;
 	pgoff_t		len = 0;
 	char	   *page;
 	size_t		pad;
-	PageHeader	phdr;
 	int			segmentno = 0;
 	char	   *segmentpath;
 	bool		verify_checksum = false;
 	pg_checksum_context checksum_ctx;
 
+	/* Maximum number of checksum failures reported for this file */
+#define CHECKSUM_REPORT_THRESHOLD		5
+	/* maximum number of retries done for a single page */
+#define PAGE_RETRY_THRESHOLD			5
+
 	pg_checksum_init(&checksum_ctx, manifest->checksum_type);
 
 	fd = OpenTransientFile(readfilename, O_RDONLY | PG_BINARY);
@@ -1661,9 +1664,7 @@ sendFile(const char *readfilename, const char *tarfilename,
 		if (verify_checksum && (cnt % BLCKSZ != 0))
 		{
 			ereport(WARNING,
-	(errmsg("could not verify checksum in file \"%s\", block "
-			"%d: read buffer size %d and page size %d "
-			"differ",
+	(errmsg("could not verify checksum in file \"%s\", block %u: read buffer size %d and page size %d differ",
 			readfilename, blkno, (int) cnt, BLCKSZ)));
 			verify_checksum = false;
 		}
@@ -1675,78 +1676,75 @@ sendFile(const char *readfilename, const char *tarfilename,
 page = buf + BLCKSZ * i;
 
 /*
- * Only check pages which have not been modified since the
- * start of the base backup. Otherwise, they might have been
- * written only halfway and the checksum would not be valid.
- * However, replaying WAL would reinstate the correct page in
- * this case. We also skip completely new pages, since they
- * don't have a checksum yet.
+ * Check on-disk pages with the same set of verification
+ * conditions used before loading pages into shared buffers.
+ * Note that PageIsVerifiedExtended() considers pages filled
+ * only with zeros as valid, since they don't have a checksum
+ * yet.  Failures are not reported to pgstat yet, as these are
+ * reported in a single batch once a file is completed.  It
+ * is possible that a page gets written halfway through while
+ * doing this check, causing a false positive.  If that
+ * happens, the page is retried multiple times, with an error
+ * reported if the last attempt also fails.
  */
-if (!PageIsNew(page) && PageGetLSN(page) < startptr)
+if (PageIsVerifiedExtended(page, blkno, 0))
 {
-	checksum = pg_checksum_page((char *) page, blkno + segmentno * RELSEG_SIZE);
-	phdr = (PageHeader) page;
-	if (phdr->pd_checksum != checksum)
+	/* success, move on to the next block */
+	blkno++;
+	block_attempts = 0;
+	continue;
+}
+
+/* The verification of the page has failed, so retry it */
+if (block_attempts < PAGE_RETRY_THRESHOLD)
+{
+	int			reread_cnt;
+
+	/* Reread the failed block */
+	reread_cnt =
+		basebackup_read_file(fd, buf + BLCKSZ * i,
+			 BLCKSZ, len + BLCKSZ * i,
+			 readfilename,
+			 false);
+	if (reread_cnt == 0)
 	{
 		/*
-		 * Retry the block on the first failure.  It's
-		 * possible that we read the first 4K page of the
-		 * block just before postgres updated the entire block
-		 * so it ends up looking torn to us.  We only need to
-		 * retry once because the LSN should be updated to
-		 * something we can ignore on the next pass.  If the
-		 * error happens again then it is a true validation
-		 * failure.
+		 * If we hit end-of-file, a concurrent truncation must
+		 * have occurred, so break out of this loop just as if
+		 * the initial fread() returned 0. We'll drop through
+		 * to the same code that handles that case. (We must
+		 * fix up cnt first, though.)
 		 */
-		if (block_retry == false)
-		{
-			int			reread_cnt;
-
-			/* Reread the failed block */
-			

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-22 Thread Michael Paquier
On Thu, Oct 22, 2020 at 03:11:45PM +0300, Anastasia Lubennikova wrote:
> Most such pages are valid and already in memory, because they were
> changed just recently, so there is no need for pg_prewarm here. If such
> an LSN appeared because of data corruption, page verification inside
> ReadBuffer() will report an error first. In the proposed function, we
> can handle this error in any fashion we want. Something like:
> 
> if (PageGetLSN(page) > startptr)
> {
>     if (!read_page_via_buffercache())
>     {
>         /* throw a warning about the corrupted page */
>         /* handle the checksum error as needed */
>     }
>     else
>     {
>         /* page is valid, no worries */
>     }
> }

Yeah, we could try to make the logic a bit more complicated like
that.  However, for any code path relying on a page read without any
locking guarantee, we cannot really put a lot of trust in any of the
fields assigned to the page, as these could just be random corruption
garbage.  The only thing I am ready to trust here is a checksum
mismatch check, because the checksum is the only field linked to the
full contents of the 8k page.  This also keeps the code simpler.
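For reference, the mismatch check in sendFile() boils down to the
pre-patch logic quoted downthread:

	/*
	 * The checksum covers the full 8k block, so a mismatch is
	 * meaningful even if the rest of the header is random garbage.
	 */
	checksum = pg_checksum_page((char *) page, blkno + segmentno * RELSEG_SIZE);
	phdr = (PageHeader) page;
	if (phdr->pd_checksum != checksum)
		checksum_failures++;	/* corruption, or possibly a torn read */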
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-22 Thread Anastasia Lubennikova

On 22.10.2020 04:25, Michael Paquier wrote:

On Thu, Oct 22, 2020 at 12:47:03AM +0300, Anastasia Lubennikova wrote:

We can also read such pages via shared buffers to be 100% sure.

Yeah, but this has its limits as well.  One can use
ignore_checksum_failure, but this can actually be very dangerous, as
you can end up loading into shared buffers a page whose header looks
sane but with a large portion of its contents randomly corrupted,
spreading corruption around and leading to more fancy logic failures
in Postgres, with more panic from customers.  Not using
ignore_checksum_failure is safer, but it makes an analysis of the
damage in a given file harder, as a seqscan would stop at the first
failure in the file.  pg_prewarm can help here, but that's not the
goal of that tool either.

I was thinking about applying this only to pages with LSN > startLSN.

Most such pages are valid and already in memory, because they were
changed just recently, so there is no need for pg_prewarm here. If such
an LSN appeared because of data corruption, page verification inside
ReadBuffer() will report an error first. In the proposed function, we
can handle this error in any fashion we want. Something like:


if (PageGetLSN(page) > startptr)
{
    if (!read_page_via_buffercache())
    {
        /* throw a warning about the corrupted page */
        /* handle the checksum error as needed */
    }
    else
    {
        /* page is valid, no worries */
    }
}
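A minimal sketch of what read_page_via_buffercache() could look like,
assuming a valid Relation is at hand (which, as noted later in the
thread, a WAL sender does not have; the function name and signature
here are only illustrative):

	static bool
	read_page_via_buffercache(Relation rel, ForkNumber forknum,
							  BlockNumber blkno)
	{
		bool		ok = true;

		/*
		 * ReadBufferExtended() verifies the page when reading it in and
		 * raises an error on a genuine checksum failure, so trap the
		 * error instead of letting it propagate.  (Simplified; real
		 * code would need more careful error-state cleanup.)
		 */
		PG_TRY();
		{
			Buffer		buf = ReadBufferExtended(rel, forknum, blkno,
												 RBM_NORMAL, NULL);

			ReleaseBuffer(buf);
		}
		PG_CATCH();
		{
			FlushErrorState();
			ok = false;
		}
		PG_END_TRY();

		return ok;
	}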

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-22 Thread Michael Paquier
On Thu, Oct 22, 2020 at 02:27:34PM +0800, Julien Rouhaud wrote:
> I'm a bit worried about this approach as, if I understand correctly,
> it can lead to false positive reports.  I've certainly seen systems
> with IO stalled for more than 500ms, so while this is not frequent,
> it could still happen.

The possibility of false positives is not a new thing with this
feature as currently shaped.  On HEAD, this code actually just does
one retry, without waiting at all for the operation potentially
happening in parallel to finish, so that's actually worse.  And that's
assuming that the pd_lsn of the page did not get touched by
corruption, as we would otherwise simply miss a broken page.  So, with
a non-locking approach, we limit ourselves to tweaking the number of
retries and some sleeps :(

I am not sure that increasing the 100ms sleep is a good thing on
not-so-slow disks, but we could increase the number of retries.  The
patch makes that easier to change, at least.  FWIW, I don't like that
this code, with a real risk of false positives, got merged to begin
with, and I think that other people share the same opinion, but it is
not like we can just remove it on an already-released branch either,
and I am not sure that we have done such things in the past for stable
branches.  If we were to do that, we could just make the operation a
no-op and keep some traces of the grammar for compatibility.

> About the patch:
> 
> + * doing this check, causing a false positive.  If that
> + * happens, a page is retried once, with an error reported if
> + * the second attempt also fails.
> 
> [...]
> 
> + /* The verification of a page has failed, retry once */
> + if (block_attempts < PAGE_RETRY_THRESHOLD)
> + {

Oops.  Thanks, I got that changed on my branch.
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-22 Thread Julien Rouhaud
On Thu, Oct 22, 2020 at 9:25 AM Michael Paquier  wrote:
>
> On Thu, Oct 22, 2020 at 12:47:03AM +0300, Anastasia Lubennikova wrote:
> Thank you. I always forget about this. Do we have any checklist for such
> changes that patch authors and reviewers can use?
>
> Not really.  That's more a habit than anything else: any non-static
> routine that we publish could be used by some out-of-core code, so
> maintaining pure API compatibility on stable branches is essential.
>
> > We can also read such pages via shared buffers to be 100% sure.
>
> Yeah, but this has its limits as well.  One can use
> ignore_checksum_failure, but this can actually be very dangerous, as
> you can end up loading into shared buffers a page whose header looks
> sane but with a large portion of its contents randomly corrupted,
> spreading corruption around and leading to more fancy logic failures
> in Postgres, with more panic from customers.  Not using
> ignore_checksum_failure is safer, but it makes an analysis of the
> damage in a given file harder, as a seqscan would stop at the first
> failure in the file.  pg_prewarm can help here, but that's not the
> goal of that tool either.
>
> We definitely need a different approach that guarantees that a page is
> correctly locked, with no concurrent I/O, while it is checked on retry,
> and I am looking at that for the SQL-level check.  That's not something
> I would do in v13, though, but we can make the existing logic much more
> reliable with a set of fixed retries and some sleeps in between.  A
> maximum of 5 retries with a 100ms sleep seems like a good balance seen
> from here, but that's not a perfect science of course, depending on the
> hardware latency.
>
> This has been a sensitive topic during the stability period of v13
> with many committers commenting on the issue, so I'd rather be very
> careful that there are no objections to what's published here, and in
> consequence I am going to ping the other thread on the matter.  For
> now I have the attached to address the zero-case and the random LSN
> case.

I'm a bit worried about this approach as, if I understand correctly,
it can lead to false positive reports.  I've certainly seen systems
with IO stalled for more than 500ms, so while this is not frequent,
it could still happen.

About the patch:

+ * doing this check, causing a false positive.  If that
+ * happens, a page is retried once, with an error reported if
+ * the second attempt also fails.

[...]

+ /* The verification of a page has failed, retry once */
+ if (block_attempts < PAGE_RETRY_THRESHOLD)
+ {

Both those comments seem to refer to the previous "retry only once"
approach.  Apart from that and my concerns about the new heuristics,
the patch looks good.  +1 for PageIsVerifiedExtended()!




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-21 Thread Michael Paquier
On Thu, Oct 22, 2020 at 12:47:03AM +0300, Anastasia Lubennikova wrote:
> Thank you. I always forget about this. Do we have any checklist for such
> changes that patch authors and reviewers can use?

Not really.  That's more a habit than anything else: any non-static
routine that we publish could be used by some out-of-core code, so
maintaining pure API compatibility on stable branches is essential.

> We can also read such pages via shared buffers to be 100% sure.

Yeah, but this has its limits as well.  One can use
ignore_checksum_failure, but this can actually be very dangerous, as
you can end up loading into shared buffers a page whose header looks
sane but with a large portion of its contents randomly corrupted,
spreading corruption around and leading to more fancy logic failures
in Postgres, with more panic from customers.  Not using
ignore_checksum_failure is safer, but it makes an analysis of the
damage in a given file harder, as a seqscan would stop at the first
failure in the file.  pg_prewarm can help here, but that's not the
goal of that tool either.

We definitely need a different approach that guarantees that a page is
correctly locked, with no concurrent I/O, while it is checked on retry,
and I am looking at that for the SQL-level check.  That's not something
I would do in v13, though, but we can make the existing logic much more
reliable with a set of fixed retries and some sleeps in between.  A
maximum of 5 retries with a 100ms sleep seems like a good balance seen
from here, but that's not a perfect science of course, depending on the
hardware latency.
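A minimal sketch of the retry shape this would give in sendFile(),
with the constant and counter names used by a later revision of the
patch:

	/* maximum number of retries done for a single page */
	#define PAGE_RETRY_THRESHOLD		5

	if (!PageIsVerifiedExtended(page, blkno, 0))
	{
		if (block_attempts < PAGE_RETRY_THRESHOLD)
		{
			/* sleep 100ms, then reread and validate the block again */
			pg_usleep(100000L);
			block_attempts++;
		}
		else
			checksum_failures++;	/* all attempts failed, report it */
	}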

This has been a sensitive topic during the stability period of v13
with many committers commenting on the issue, so I'd rather be very
careful that there are no objections to what's published here, and in
consequence I am going to ping the other thread on the matter.  For
now I have the attached to address the zero-case and the random LSN
case.
--
Michael
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 51b8f994ac..1a28ee1130 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -404,26 +404,33 @@ do { \
  *		extern declarations
  * ----------------------------------------------------------------
  */
+
+/* flags for PageAddItemExtended() */
 #define PAI_OVERWRITE			(1 << 0)
 #define PAI_IS_HEAP				(1 << 1)
 
+/* flags for PageIsVerifiedExtended() */
+#define PIV_LOG_WARNING			(1 << 0)
+#define PIV_REPORT_STAT			(1 << 1)
+
 #define PageAddItem(page, item, size, offsetNumber, overwrite, is_heap) \
 	PageAddItemExtended(page, item, size, offsetNumber, \
 		((overwrite) ? PAI_OVERWRITE : 0) | \
 		((is_heap) ? PAI_IS_HEAP : 0))
 
 /*
- * Check that BLCKSZ is a multiple of sizeof(size_t).  In PageIsVerified(),
- * it is much faster to check if a page is full of zeroes using the native
- * word size.  Note that this assertion is kept within a header to make
- * sure that StaticAssertDecl() works across various combinations of
- * platforms and compilers.
+ * Check that BLCKSZ is a multiple of sizeof(size_t).  In
+ * PageIsVerifiedExtended(), it is much faster to check if a page is
+ * full of zeroes using the native word size.  Note that this assertion
+ * is kept within a header to make sure that StaticAssertDecl() works
+ * across various combinations of platforms and compilers.
  */
 StaticAssertDecl(BLCKSZ == ((BLCKSZ / sizeof(size_t)) * sizeof(size_t)),
  "BLCKSZ has to be a multiple of sizeof(size_t)");
 
 extern void PageInit(Page page, Size pageSize, Size specialSize);
 extern bool PageIsVerified(Page page, BlockNumber blkno);
+extern bool PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags);
 extern OffsetNumber PageAddItemExtended(Page page, Item item, Size size,
 		OffsetNumber offsetNumber, int flags);
 extern Page PageGetTempPage(Page page);
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index dbbd3aa31f..d538f25726 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -443,7 +443,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
 
 		smgrread(src, forkNum, blkno, buf.data);
 
-		if (!PageIsVerified(page, blkno))
+		if (!PageIsVerifiedExtended(page, blkno,
+	PIV_LOG_WARNING | PIV_REPORT_STAT))
 			ereport(ERROR,
 	(errcode(ERRCODE_DATA_CORRUPTED),
 	 errmsg("invalid page in block %u of relation %s",
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..6f717c8956 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1569,21 +1569,24 @@ sendFile(const char *readfilename, const char *tarfilename,
 {
 	int			fd;
 	BlockNumber blkno = 0;
-	bool		block_retry = false;
+	int			block_attempts = 0;
 	char		buf[TAR_SEND_SIZE];
-	uint16		checksum;
 	int			checksum_failures = 0;
 	off_t		cnt;
 	int			i;
 	pgoff_t		len = 0;
 	char	   *page;
 	size_t		pad;
-	

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-21 Thread Anastasia Lubennikova

On 20.10.2020 09:24, Michael Paquier wrote:

On Fri, Oct 16, 2020 at 06:02:52PM +0300, Anastasia Lubennikova wrote:

In the current patch, PageIsVerified called from pg_basebackup simply
doesn't report failures to pgstat.  Should we change this too?  I don't
see any difference.

I considered that, but that does not seem worth bothering with here.


Done.

Thanks for the updated patch.  I have played with dd and some fake
files to check the validity of the zero-case (as long as pd_lsn does
not go wild).  Here are some comments, with an updated patch attached:
- Added a PageIsVerifiedExtended to make this patch back-patchable,
with the original routine keeping its original shape.
Thank you. I always forget about this. Do we have any checklist for such 
changes, that patch authors and reviewers can use?

- I did not like much to show the WARNING from PageIsVerified() for
the report, and we'd lose some context related to the base backups.
The important part is to know which blocks from which files are found
as problematic.
- Switched the checks to have PageIsVerified() placed first in the
hierarchy, so that we do all the basic validity checks before looking
at the LSN.  This does not really change things logically.
- The meaning of block_retry is also the opposite of what it should be
in the original code.  IMO, the flag should be set to true if we are
still fine to retry a block, and set to false once the one-time retry
has failed.


Agree.


- The error strings are not really consistent with the project style
in this area.  These are usually not spread across multiple lines, to
ease searches with grep and such.
Same question as above. I don't see this formatting rule in the error
message style guide in the documentation. Is it mentioned somewhere else?

Anastasia, Michael B, does that look OK to you?

The final patch looks good to me.

NB: This patch fixes only one problem, the zero-page case, as it was
the intention of Michael B to split this part into its own thread.  We
still have, of course, a second problem when it comes to a random LSN
put into the page header, which could mask an actual checksum failure,
so this only takes care of half the issues.  Having a correct LSN
approximation is a tricky problem IMO, but we could improve things by
allowing some delta of an extra WAL page on top of GetInsertRecPtr().
And that function cannot be used for base backups taken from
standbys.


We can also read such pages via shared buffers to be 100% sure.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-20 Thread Michael Paquier
On Fri, Oct 16, 2020 at 06:02:52PM +0300, Anastasia Lubennikova wrote:
> In the current patch, PageIsVerified called from pg_basebackup simply
> doesn't report failures to pgstat.  Should we change this too?  I don't
> see any difference.

I considered that, but that does not seem worth bothering with here.

> Done.

Thanks for the updated patch.  I have played with dd and some fake
files to check the validity of the zero-case (as long as pd_lsn does
not go wild).  Here are some comments, with an updated patch attached:
- Added a PageIsVerifiedExtended to make this patch back-patchable,
with the original routine keeping its original shape.
- I did not like much to show the WARNING from PageIsVerified() for
the report, and we'd lose some context related to the base backups.
The important part is to know which blocks from which files are found
as problematic.
- Switched the checks to have PageIsVerified() placed first in the
hierarchy, so that we do all the basic validity checks before looking
at the LSN.  This does not really change things logically.
- The meaning of block_retry is also the opposite of what it should be
in the original code.  IMO, the flag should be set to true if we are
still fine to retry a block, and set to false once the one-time retry
has failed.
- Moved the hardcoded failure report threshold of 5 into its own
variable.
- The error strings are not really consistent with the project style
in this area.  These are usually not spread across multiple lines, to
ease searches with grep and such.

Anastasia, Michael B, does that look OK to you?

NB: This patch fixes only one problem, the zero-page case, as it was
the intention of Michael B to split this part into its own thread.  We
still have, of course, a second problem when it comes to a random LSN
put into the page header, which could mask an actual checksum failure,
so this only takes care of half the issues.  Having a correct LSN
approximation is a tricky problem IMO, but we could improve things by
allowing some delta of an extra WAL page on top of GetInsertRecPtr().
And that function cannot be used for base backups taken from
standbys.
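A rough sketch of that delta idea (hypothetical, and as said above not
usable for base backups taken from standbys):

	/*
	 * Treat pd_lsn as garbage only when it is past the current WAL
	 * insert position plus one spare WAL page of slack.
	 */
	XLogRecPtr	limit = GetInsertRecPtr() + XLOG_BLCKSZ;

	if (PageGetLSN(page) > limit)
		checksum_failures++;	/* random LSN likely masking corruption */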
--
Michael
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 51b8f994ac..1a28ee1130 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -404,26 +404,33 @@ do { \
  *		extern declarations
  * ----------------------------------------------------------------
  */
+
+/* flags for PageAddItemExtended() */
 #define PAI_OVERWRITE			(1 << 0)
 #define PAI_IS_HEAP				(1 << 1)
 
+/* flags for PageIsVerifiedExtended() */
+#define PIV_LOG_WARNING			(1 << 0)
+#define PIV_REPORT_STAT			(1 << 1)
+
 #define PageAddItem(page, item, size, offsetNumber, overwrite, is_heap) \
 	PageAddItemExtended(page, item, size, offsetNumber, \
 		((overwrite) ? PAI_OVERWRITE : 0) | \
 		((is_heap) ? PAI_IS_HEAP : 0))
 
 /*
- * Check that BLCKSZ is a multiple of sizeof(size_t).  In PageIsVerified(),
- * it is much faster to check if a page is full of zeroes using the native
- * word size.  Note that this assertion is kept within a header to make
- * sure that StaticAssertDecl() works across various combinations of
- * platforms and compilers.
+ * Check that BLCKSZ is a multiple of sizeof(size_t).  In
+ * PageIsVerifiedExtended(), it is much faster to check if a page is
+ * full of zeroes using the native word size.  Note that this assertion
+ * is kept within a header to make sure that StaticAssertDecl() works
+ * across various combinations of platforms and compilers.
  */
 StaticAssertDecl(BLCKSZ == ((BLCKSZ / sizeof(size_t)) * sizeof(size_t)),
  "BLCKSZ has to be a multiple of sizeof(size_t)");
 
 extern void PageInit(Page page, Size pageSize, Size specialSize);
 extern bool PageIsVerified(Page page, BlockNumber blkno);
+extern bool PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags);
 extern OffsetNumber PageAddItemExtended(Page page, Item item, Size size,
 		OffsetNumber offsetNumber, int flags);
 extern Page PageGetTempPage(Page page);
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index dbbd3aa31f..d538f25726 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -443,7 +443,8 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
 
 		smgrread(src, forkNum, blkno, buf.data);
 
-		if (!PageIsVerified(page, blkno))
+		if (!PageIsVerifiedExtended(page, blkno,
+	PIV_LOG_WARNING | PIV_REPORT_STAT))
 			ereport(ERROR,
 	(errcode(ERRCODE_DATA_CORRUPTED),
 	 errmsg("invalid page in block %u of relation %s",
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..d1eb350b60 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1569,21 +1569,22 @@ sendFile(const char *readfilename, const char *tarfilename,
 {
 	int			fd;
 	BlockNumber blkno = 0;
-	bool		block_retry = false;
+	bool		block_retry = true;
 	char		

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-16 Thread Anastasia Lubennikova

On 07.10.2020 11:18, Michael Paquier wrote:

On Fri, Sep 25, 2020 at 08:53:27AM +0200, Michael Banck wrote:

Oh right, I've fixed up the commit message in the attached V4.

Not much of a fan of what's proposed here, for a couple of reasons:
- If the page is not new, we should check whether the header is sane
or not.
- It may be better to actually count checksum failures if these are
repeatable in a given block, and report them.
- The code would be simpler with a "continue" added for a block
detected as new or with an LSN newer than the backup start.
- The new error messages are confusing, and I think that these are
hard to translate in a meaningful way.

I am interested in moving this patch forward, so here is the updated v5.
This version uses PageIsVerified. All error messages are left unchanged.

So I think that we should try to use PageIsVerified() directly in the
code path of basebackup.c, and this requires a couple of changes to
make the routine more extensible:
- Addition of a dboid argument for pgstat_report_checksum_failure(),
whose call needs to be changed to use the *_in_db() flavor.


In the current patch, PageIsVerified called from pg_basebackup simply
doesn't report failures to pgstat, because in basebackup.c we already
report checksum failures separately, once per file:

        pgstat_report_checksum_failures_in_db(dboid, checksum_failures);

Should we change this too? I don't see any difference.


- Addition of an option to decide if a log should be generated or
not.
- Addition of an option to control if a checksum failure should be
reported to pgstat or not.  Even if this is actually linked to the
previous point, I have seen cases where being able to control both
separately would be helpful, particularly the log part.

For the last two, I would use an extensible argument based on
bits32, with a set of flags that the caller can set at will.

Done.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 553d62dbcdec1a00139f3c5ab6a325ed857b6c9d
Author: anastasia 
Date:   Fri Oct 16 17:36:02 2020 +0300

Fix checksum verification in base backups for zero page headers

We currently do not checksum a page if it is considered new by PageIsNew().
However, this means that if the whole page header is zero, PageIsNew() will
consider that page new due to the pd_upper field being zero and no subsequent
checksum verification is done.

To fix, use PageIsVerified() in pg_basebackup code.

Reported-By: Andres Freund
Author: Michael Banck
Reviewed-By: Asif Rehman, Anastasia Lubennikova
Discussion: https://postgr.es/m/20190319200050.ncuxejradurja...@alap3.anarazel.de

diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index dbbd3aa31f..b4e050c9b8 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -443,7 +443,7 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
 
 		smgrread(src, forkNum, blkno, buf.data);
 
-		if (!PageIsVerified(page, blkno))
+		if (!PageIsVerified(page, blkno, PIV_LOG_WARNING | PIV_REPORT_STAT))
 			ereport(ERROR,
 	(errcode(ERRCODE_DATA_CORRUPTED),
 	 errmsg("invalid page in block %u of relation %s",
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..aa8d52c1aa 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1682,11 +1682,14 @@ sendFile(const char *readfilename, const char *tarfilename,
  * this case. We also skip completely new pages, since they
  * don't have a checksum yet.
  */
-if (!PageIsNew(page) && PageGetLSN(page) < startptr)
+if (PageGetLSN(page) < startptr)
 {
-	checksum = pg_checksum_page((char *) page, blkno + segmentno * RELSEG_SIZE);
-	phdr = (PageHeader) page;
-	if (phdr->pd_checksum != checksum)
+	int piv_flags = (checksum_failures < 5) ? (PIV_LOG_WARNING) : 0;
+	/*
+	 * Do not report checksum failures to pgstat from
+	 * PageIsVerified(), since we will do it later.
+	 */
+	if (!PageIsVerified(page, blkno, piv_flags))
 	{
 		/*
 		 * Retry the block on the first failure.  It's
@@ -1732,13 +1735,6 @@ sendFile(const char *readfilename, const char *tarfilename,
 
 		checksum_failures++;
 
-		if (checksum_failures <= 5)
-			ereport(WARNING,
-	(errmsg("checksum verification failed in "
-			"file \"%s\", block %d: calculated "
-			"%X but expected %X",
-			readfilename, blkno, checksum,
-			phdr->pd_checksum)));
 		if (checksum_failures == 5)
 			ereport(WARNING,
 	(errmsg("further checksum verification "
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e549fa1d30..971c96c7de 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -917,7 +917,7 @@ ReadBuffer_common(SMgrRelation 

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-10-07 Thread Michael Paquier
On Fri, Sep 25, 2020 at 08:53:27AM +0200, Michael Banck wrote:
> Oh right, I've fixed up the commit message in the attached V4.

Not much of a fan of what's proposed here, for a couple of reasons:
- If the page is not new, we should check whether the header is sane
or not.
- It may be better to actually count checksum failures if these are
repeatable in a given block, and report them.
- The code would be simpler with a "continue" added for a block
detected as new or with an LSN newer than the backup start.
- The new error messages are confusing, and I think that these are
hard to translate in a meaningful way.

So I think that we should try to use PageIsVerified() directly in the
code path of basebackup.c, and this requires a couple of changes to
make the routine more extensible:
- Addition of a dboid argument for pgstat_report_checksum_failure(),
whose call needs to be changed to use the *_in_db() flavor.
- Addition of an option to decide if a log should be generated or
not.
- Addition of an option to control if a checksum failure should be
reported to pgstat or not.  Even if this is actually linked to the
previous point, I have seen cases where being able to control both
separately would be helpful, particularly the log part.

For the last two, I would use an extensible argument based on
bits32, with a set of flags that the caller can set at will.  Something
else I would recommend here is to use an error context for the case
where we want to log a retry number in the base backup loop that
accepts false positives, with the blkno and the physical file name
when we find failures after retries.
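A minimal sketch of such an error context (all the names here are
hypothetical):

	typedef struct
	{
		const char *filename;	/* physical file name being sent */
		BlockNumber blkno;		/* block being verified */
		int			attempt;	/* current retry number */
	} checksum_retry_cxt;

	static void
	checksum_retry_callback(void *arg)
	{
		checksum_retry_cxt *cxt = (checksum_retry_cxt *) arg;

		errcontext("block %u of file \"%s\", read attempt %d",
				   cxt->blkno, cxt->filename, cxt->attempt);
	}

	/* in sendFile(), push the callback onto the error context stack */
	ErrorContextCallback errcallback;

	errcallback.callback = checksum_retry_callback;
	errcallback.arg = (void *) &retry_cxt;
	errcallback.previous = error_context_stack;
	error_context_stack = &errcallback;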
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-25 Thread Michael Banck
Hi,

On Thursday, 2020-09-24 at 17:24 +0300, Anastasia Lubennikova wrote:
> So I mark this one as ReadyForCommitter.

Thanks!

> The only minor problem is a typo (?) in the proposed commit message.
> "If a page is all zero, consider that a checksum failure." It should be 
> "If a page is NOT all zero...".

Oh right, I've fixed up the commit message in the attached V4.


Michael

From af83f4a42403e8a994e101086affafa86d67b52a Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Tue, 22 Sep 2020 16:09:36 +0200
Subject: [PATCH] Fix checksum verification in base backups for zero page
 headers

We currently do not checksum a page if it is considered new by PageIsNew().
However, this means that if the whole page header is zero, PageIsNew() will
consider that page new due to the pd_upper field being zero and no subsequent
checksum verification is done.

To fix, factor out the all-zeroes check from PageIsVerified() into a new method
PageIsZero() and call that for pages considered new by PageIsNew(). If a page
is not totally empty, consider that a checksum failure.

Add one test to the pg_basebackup TAP tests to check this error.

Reported-By: Andres Freund
Reviewed-By: Asif Rehman, Anastasia Lubennikova
Discussion: https://postgr.es/m/20190319200050.ncuxejradurja...@alap3.anarazel.de
---
 src/backend/replication/basebackup.c | 30 +
 src/backend/storage/page/bufpage.c   | 35 +++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 18 +-
 src/include/storage/bufpage.h| 11 +++---
 4 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..c82765a429 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1746,6 +1746,36 @@ sendFile(const char *readfilename, const char *tarfilename,
 			"be reported", readfilename)));
 	}
 }
+else
+{
+	/*
+	 * We skip completely new pages after checking they are
+	 * all-zero, since they don't have a checksum yet.
+	 */
+	if (PageIsNew(page) && !PageIsZero(page))
+	{
+		/*
+		 * pd_upper is zero, but the page is not all zero.  We
+		 * cannot run pg_checksum_page() on the page as it
+		 * would throw an assertion failure.  Consider this a
+		 * checksum failure.
+		 */
+		checksum_failures++;
+
+		if (checksum_failures <= 5)
+			ereport(WARNING,
+	(errmsg("checksum verification failed in "
+			"file \"%s\", block %d: pd_upper "
+			"is zero but page is not all-zero",
+			readfilename, blkno)));
+		if (checksum_failures == 5)
+			ereport(WARNING,
+	(errmsg("further checksum verification "
+			"failures in file \"%s\" will not "
+			"be reported", readfilename)));
+	}
+}
+
 block_retry = false;
 blkno++;
 			}
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 4bc2bf955d..8be3cd6190 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -82,11 +82,8 @@ bool
 PageIsVerified(Page page, BlockNumber blkno)
 {
 	PageHeader	p = (PageHeader) page;
-	size_t	   *pagebytes;
-	int			i;
 	bool		checksum_failure = false;
 	bool		header_sane = false;
-	bool		all_zeroes = false;
 	uint16		checksum = 0;
 
 	/*
@@ -120,18 +117,7 @@ PageIsVerified(Page page, BlockNumber blkno)
 	}
 
 	/* Check all-zeroes case */
-	all_zeroes = true;
-	pagebytes = (size_t *) page;
-	for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++)
-	{
-		if (pagebytes[i] != 0)
-		{
-			all_zeroes = false;
-			break;
-		}
-	}
-
-	if (all_zeroes)
+	if (PageIsZero(page))
 		return true;
 
 	/*
@@ -154,6 +140,25 @@ PageIsVerified(Page page, BlockNumber blkno)
 	return false;
 }
 
+/*
+ * PageIsZero
+ *		Check that the page consists only of zero bytes.
+ *
+ */
+bool
+PageIsZero(Page page)
+{
+	int			i;
+	size_t	   *pagebytes = (size_t *) page;
+
+	for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++)
+	{
+		if (pagebytes[i] != 0)
+			return false;
+	}
+
+	return true;
+}
 
 /*
  *	PageAddItemExtended
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index f674a7c94e..f5735569c5 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -6,7 +6,7 @@ us

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-24 Thread Anastasia Lubennikova



On 22.09.2020 17:30, Michael Banck wrote:

Hi,

On Tuesday, 2020-09-22 at 16:26 +0200, Michael Banck wrote:

On Wednesday, 2020-09-02 at 16:50 +0300, Anastasia Lubennikova wrote:

I've looked through the previous discussion. As far as I understand it,
most of the controversy was about online checksum improvements.

The warning about pd_upper inconsistency that you've added is a good
addition. The patch is a bit messy, though, because a huge code block
was shifted.

Will it be different if you just leave
  "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
block as it was, and add
  "else if (PageIsNew(page) && !PageIsZero(page))" ?

Thanks, that does indeed look better as a patch and I think it's fine
as-is for the code as well; I've attached a v2.

Sorry, I forgot to add you as a reviewer in the proposed commit message;
I've fixed that up now in V3.


Michael


Great. This version looks good to me.
Thank you for answering my questions. I agree that we can work on them
in separate threads.


So I mark this one as ReadyForCommitter.

The only minor problem is a typo (?) in the proposed commit message.
"If a page is all zero, consider that a checksum failure." It should be 
"If a page is NOT all zero...".


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-22 Thread Michael Banck
Hi,

On Tuesday, 2020-09-22 at 16:26 +0200, Michael Banck wrote:
> On Wednesday, 2020-09-02 at 16:50 +0300, Anastasia Lubennikova wrote:
> > I've looked through the previous discussion. As far as I understand it,
> > most of the controversy was about online checksum improvements.
> > 
> > The warning about pd_upper inconsistency that you've added is a good 
> > addition. The patch is a bit messy, though, because a huge code block 
> > was shifted.
> > 
> > Will it be different if you just leave
> >  "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
> > block as it was, and add
> >  "else if (PageIsNew(page) && !PageIsZero(page))" ?
> 
> Thanks, that does indeed look better as a patch and I think it's fine
> as-is for the code as well; I've attached a v2.

Sorry, I forgot to add you as a reviewer in the proposed commit message;
I've fixed that up now in V3.


Michael

From a6706ec63709137881d415a8acf98c390a39ee56 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Tue, 22 Sep 2020 16:09:36 +0200
Subject: [PATCH] Fix checksum verification in base backups for zero page
 headers

We currently do not checksum a page if it is considered new by PageIsNew().
However, this means that if the whole page header is zero, PageIsNew() will
consider that page new due to the pd_upper field being zero and no subsequent
checksum verification is done.

To fix, factor out the all-zeroes check from PageIsVerified() into a new method
PageIsZero() and call that for pages considered new by PageIsNew(). If a page
is all zero, consider that a checksum failure.

Add one more test to the pg_basebackup TAP tests to check this error.

Reported-By: Andres Freund
Reviewed-By: Asif Rehman, Anastasia Lubennikova
Discussion: https://postgr.es/m/20190319200050.ncuxejradurja...@alap3.anarazel.de
---
 src/backend/replication/basebackup.c | 30 +
 src/backend/storage/page/bufpage.c   | 35 +++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 18 +-
 src/include/storage/bufpage.h| 11 +++---
 4 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..c82765a429 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1746,6 +1746,36 @@ sendFile(const char *readfilename, const char *tarfilename,
 			"be reported", readfilename)));
 	}
 }
+else
+{
+	/*
+	 * We skip completely new pages after checking they are
+	 * all-zero, since they don't have a checksum yet.
+	 */
+	if (PageIsNew(page) && !PageIsZero(page))
+	{
+		/*
+		 * pd_upper is zero, but the page is not all zero.  We
+		 * cannot run pg_checksum_page() on the page as it
+		 * would throw an assertion failure.  Consider this a
+		 * checksum failure.
+		 */
+		checksum_failures++;
+
+		if (checksum_failures <= 5)
+			ereport(WARNING,
+	(errmsg("checksum verification failed in "
+			"file \"%s\", block %d: pd_upper "
+			"is zero but page is not all-zero",
+			readfilename, blkno)));
+		if (checksum_failures == 5)
+			ereport(WARNING,
+	(errmsg("further checksum verification "
+			"failures in file \"%s\" will not "
+			"be reported", readfilename)));
+	}
+}
+
 block_retry = false;
 blkno++;
 			}
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 4bc2bf955d..8be3cd6190 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -82,11 +82,8 @@ bool
 PageIsVerified(Page page, BlockNumber blkno)
 {
 	PageHeader	p = (PageHeader) page;
-	size_t	   *pagebytes;
-	int			i;
 	bool		checksum_failure = false;
 	bool		header_sane = false;
-	bool		all_zeroes = false;
 	uint16		checksum = 0;
 
 	/*
@@ -120,18 +117,7 @@ PageIsVerified(Page page, BlockNumber blkno)
 	}
 
 	/* Check all-zeroes case */
-	all_zeroes = true;
-	pagebytes = (size_t *) page;
-	for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++)
-	{
-		if (pagebytes[i] != 0)
-		{
-			all_zeroes = false;
-			break;
-		}
-	}
-
-	if (all_zeroes)
+	if (PageIsZero(page))
 		return true;
 
 	/*
@@ -154,6 +140,25 @@ PageIsVerified(Page page, BlockNumber blkno)
 	return false;

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-22 Thread Michael Banck
Hi,

On Wednesday, 2020-09-02 at 16:50 +0300, Anastasia Lubennikova wrote:
> On 01.09.2020 13:22, Michael Banck wrote:
> > as a continuation of [1], I've split off the zero page header case from
> > the last patch, as this one seems less contentious.
> > [1] https://commitfest.postgresql.org/28/2308/
> 
> I've looked through the previous discussion. As far as I understand it,
> most of the controversy was about online checksum improvements.
> 
> The warning about pd_upper inconsistency that you've added is a good 
> addition. The patch is a bit messy, though, because a huge code block 
> was shifted.
> 
> Will it be different if you just leave
>  "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
> block as it was, and add
>  "else if (PageIsNew(page) && !PageIsZero(page))" ?

Thanks, that does indeed look better as a patch and I think it's fine
as-is for the code as well; I've attached a v2.

> While on it, I also have a few questions about the surrounding code:

All fair points, but I think those should be handled in another patch,
if any.

> 1) Maybe we can use other header sanity checks from PageIsVerified() as 
> well? Or even the function itself.

Yeah, I'll keep that in mind.

> 2) > /* Reset loop to validate the block again */
> How many times do we try to reread the block? Is one reread enough?

Not sure whether more rereads would help, but I think the general
direction was to replace this with something better anyway (see
below).

> Speaking of which, the 'reread_cnt' name looks confusing to me. I would
> expect that this variable contains the number of attempts, not the
> number of bytes read.

Well, there are other "cnt" variables that keep the number of bytes read
in that file, so it does not seem out of place to me. A comment might
not hurt, though.

On second glance, it should maybe also be of size_t and not int type, as
is the other cnt variable?

> If everyone agrees that for basebackup purposes it's enough to rely on
> a single reread, I'm OK with it.
> Another approach is to read the page directly from shared buffers to
> ensure that the page is fine. This way is more complicated, but it
> should be almost bullet-proof. Using it, we can also check pages with
> lsn >= startptr.

Right, I think that's what Andres also advocated for initially, and what
should be done going forward.

> 3) Judging by the warning messages, we count checksum failures per file,
> not per page, and stop reporting after the fifth failure. Why so?  Is it
> a common case that so many pages of one file are corrupted?

I think this was a request during the original review, on the basis that
if we have more than a few checksum errors, it's likely there is
something fundamentally broken and it does not make sense to spam the
log with potentially millions of broken checksums.


Cheers,

Michael

From 317df17289d4fd057ffdb4410f9effa8c7a2b975 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Tue, 22 Sep 2020 16:09:36 +0200
Subject: [PATCH] Fix checksum verification in base backups for zero page
 headers

We currently do not checksum a page if it is considered new by PageIsNew().
However, this means that if the whole page header is zero, PageIsNew() will
consider that page new due to the pd_upper field being zero and no subsequent
checksum verification is done.

To fix, factor out the all-zeroes check from PageIsVerified() into a new method
PageIsZero() and call that for pages considered new by PageIsNew(). If a page
is all zero, consider that a checksum failure.

Add one more test to the pg_basebackup TAP tests to check this error.

Reported-By: Andres Freund
Reviewed-By: Asif Rehman
Discussion: https://postgr.es/m/20190319200050.ncuxejradurja...@alap3.anarazel.de
---
 src/backend/replication/basebackup.c | 30 +
 src/backend/storage/page/bufpage.c   | 35 +++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 18 +-
 src/include/storage/bufpage.h| 11 +++---
 4 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b89df01fa7..c82765a429 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1746,6 +1746,36 @@ sendFile(const char *readfilename, const char *tarfilename,
 			"be reported", readfilename)));
 	}
 }
+e

Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-16 Thread Michael Paquier
On Wed, Sep 02, 2020 at 04:50:14PM +0300, Anastasia Lubennikova wrote:
> I've looked through the previous discussion. As far as I got it, most of the
> controversy was about online checksums improvements.

This patch has been waiting on author for two weeks now.  Michael, could
you reply to the points raised by Anastasia?
--
Michael




Re: [patch] Fix checksum verification in base backups for zero page headers

2020-09-02 Thread Anastasia Lubennikova

On 01.09.2020 13:22, Michael Banck wrote:

Hi,

as a continuation of [1], I've split off the zero page header case from
the last patch, as this one seems less contentious.


Michael

[1] https://commitfest.postgresql.org/28/2308/

I've looked through the previous discussion. As far as I understand it,
most of the controversy was about online checksum improvements.


The warning about pd_upper inconsistency that you've added is a good 
addition. The patch is a bit messy, though, because a huge code block 
was shifted.


Will it be different if you just leave
    "if (!PageIsNew(page) && PageGetLSN(page) < startptr)"
block as it was, and add
    "else if (PageIsNew(page) && !PageIsZero(page))" ?


While on it, I also have a few questions about the surrounding code:

1) Maybe we can use other header sanity checks from PageIsVerified() as 
well? Or even the function itself.


2) > /* Reset loop to validate the block again */
How many times do we try to reread the block? Is one reread enough?
Speaking of which, the 'reread_cnt' name looks confusing to me. I would
expect that this variable contains the number of attempts, not the
number of bytes read.


If everyone agrees that for basebackup purposes it's enough to rely on
a single reread, I'm OK with it.
Another approach is to read the page directly from shared buffers to
ensure that the page is fine. This way is more complicated, but it
should be almost bullet-proof. Using it, we can also check pages with
lsn >= startptr.


3) Judging by the warning messages, we count checksum failures per file,
not per page, and stop reporting after the fifth failure. Why so?  Is it
a common case that so many pages of one file are corrupted?


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[patch] Fix checksum verification in base backups for zero page headers

2020-09-01 Thread Michael Banck
Hi,

as a continuation of [1], I've split off the zero page header case from
the last patch, as this one seems less contentious.


Michael

[1] https://commitfest.postgresql.org/28/2308/

From 8784d27ff258a6ff1d80f18b22e2b275a3944991 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Tue, 1 Sep 2020 12:14:16 +0200
Subject: [PATCH] Fix checksum verification in base backups for zero page
 headers

We currently do not checksum a page if it is considered new by PageIsNew().
However, this means that if the whole page header is zero, PageIsNew() will
consider that page new due to the pd_upper field being zero and no subsequent
checksum verification is done.

To fix, factor out the all-zeroes check from PageIsVerified() into a new method
PageIsZero() and call that for pages considered new by PageIsNew(). If a page
is all zero, consider that a checksum failure.

Add one more test to the pg_basebackup TAP tests to check this error.

Reported-By: Andres Freund
Reviewed-By: Asif Rehman
Discussion: https://postgr.es/m/20190319200050.ncuxejradurja...@alap3.anarazel.de
---
 src/backend/replication/basebackup.c | 138 ---
 src/backend/storage/page/bufpage.c   |  35 +++--
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  18 ++-
 src/include/storage/bufpage.h|  11 +-
 4 files changed, 128 insertions(+), 74 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 6064384e32..30cd5905ed 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1672,70 +1672,28 @@ sendFile(const char *readfilename, const char *tarfilename,
 page = buf + BLCKSZ * i;
 
 /*
- * Only check pages which have not been modified since the
- * start of the base backup. Otherwise, they might have been
- * written only halfway and the checksum would not be valid.
- * However, replaying WAL would reinstate the correct page in
- * this case. We also skip completely new pages, since they
- * don't have a checksum yet.
+ * We skip completely new pages after checking they are
+ * all-zero, since they don't have a checksum yet.
  */
-if (!PageIsNew(page) && PageGetLSN(page) < startptr)
+if (PageIsNew(page))
 {
-	checksum = pg_checksum_page((char *) page, blkno + segmentno * RELSEG_SIZE);
-	phdr = (PageHeader) page;
-	if (phdr->pd_checksum != checksum)
+	if (!PageIsZero(page))
 	{
 		/*
-		 * Retry the block on the first failure.  It's
-		 * possible that we read the first 4K page of the
-		 * block just before postgres updated the entire block
-		 * so it ends up looking torn to us.  We only need to
-		 * retry once because the LSN should be updated to
-		 * something we can ignore on the next pass.  If the
-		 * error happens again then it is a true validation
-		 * failure.
+		 * pd_upper is zero, but the page is not all zero.  We
+		 * cannot run pg_checksum_page() on the page as it
+		 * would throw an assertion failure.  Consider this a
+		 * checksum failure.
 		 */
-		if (block_retry == false)
-		{
-			int			reread_cnt;
-
-			/* Reread the failed block */
-			reread_cnt =
-basebackup_read_file(fd, buf + BLCKSZ * i,
-	 BLCKSZ, len + BLCKSZ * i,
-	 readfilename,
-	 false);
-			if (reread_cnt == 0)
-			{
-/*
- * If we hit end-of-file, a concurrent
- * truncation must have occurred, so break out
- * of this loop just as if the initial fread()
- * returned 0. We'll drop through to the same
- * code that handles that case. (We must fix
- * up cnt first, though.)
- */
-cnt = BLCKSZ * i;
-break;
-			}
-
-			/* Set flag so we know a retry was attempted */
-			block_retry = true;
-
-			/* Reset loop to validate the block again */
-			i--;
-			continue;
-		}
 
 		checksum_failures++;
 
 		if (checksum_failures <= 5)
 			ereport(WARNING,
 	(errmsg("checksum verification failed in "
-			"file \"%s\", block %d: calculated "
-			"%X but expected %X",
-			readfilename, blkno, checksum,
-			phdr->pd_checksum)));
+			"file \"%s\", block %d: pd_upper "
+			"is zero but page is not all-zero",
+			readfilename, blkno)));
 		if (checksum_failures == 5)