If pg_basebackup cannot read a full BLCKSZ of content from a file, it
currently only emits the warning "could not verify checksum in file "____"
block X: read buffer size X and page size 8192 differ" and never fails
with "checksum error occurred". It errors out at the end only if it can
read a full 8192 bytes and the checksum then mismatches.

Repro is pretty simple:
/usr/local/pgsql/bin/initdb -k /tmp/postgres
/usr/local/pgsql/bin/pg_ctl -D /tmp/postgres -l /tmp/logfile start
# create a file whose size is not a multiple of 8192
echo "corruption" > /tmp/postgres/base/12696/44444

Without the fix:
$ /usr/local/pgsql/bin/pg_basebackup -D /tmp/dummy
WARNING:  could not verify checksum in file "./base/12696/44444", block 0: read buffer size 11 and page size 8192 differ
$ echo $?
0

With the fix:
$ /usr/local/pgsql/bin/pg_basebackup -D /tmp/dummy
WARNING:  could not verify checksum in file "./base/12696/44444", block 0: read buffer size 11 and page size 8192 differ
pg_basebackup: error: checksum error occurred
$ echo $?
1


I think this is an important case to handle, and it should not be silently
skipped, unless I am missing something. This one-liner should fix it:

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index fbdc28ec39..68febbedf0 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1641,6 +1641,7 @@ sendFile(const char *readfilename, const char *tarfilename,
                                                        "differ",
                                                        readfilename, blkno, (int) cnt, BLCKSZ)));
                        verify_checksum = false;
+                      checksum_failures++;
                }

                if (verify_checksum)
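
To make the intended behaviour concrete, here is a minimal standalone sketch
of the accounting, not the actual basebackup.c code. ScanResult,
account_for_read() and backup_exit_status() are hypothetical names I made up
for illustration, and the short-read condition is my assumption about how the
existing check is written. Only verify_checksum, checksum_failures, cnt and
BLCKSZ come from the patch context.

```c
#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192             /* default PostgreSQL page size */

/* Hypothetical per-backup state; not a PostgreSQL structure. */
typedef struct ScanResult
{
    int  checksum_failures;     /* problems counted so far */
    bool verify_checksum;       /* still verifying this file? */
} ScanResult;

/*
 * Hypothetical stand-in for the short-read branch in sendFile().
 * cnt is the number of bytes returned by one read of the file.
 */
void
account_for_read(ScanResult *res, size_t cnt)
{
    if (res->verify_checksum && (cnt % BLCKSZ != 0))
    {
        /* the existing WARNING would be emitted here */
        res->verify_checksum = false;
        res->checksum_failures++;   /* the line the patch adds */
    }
}

/* Assumed mapping: any counted failure makes the backup exit non-zero. */
int
backup_exit_status(const ScanResult *res)
{
    return (res->checksum_failures > 0) ? 1 : 0;
}
```

With the 11-byte file from the repro, account_for_read() bumps the counter
once, so the sketch reports a non-zero status, while a read of exactly 8192
bytes leaves the counter untouched.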
