Hi hackers,

While developing my extension, I ran into problems using pg_basebackup/pg_verifybackup on a cluster that contains additional custom files. The extension saves files in its own format; they vary in size and usually compress poorly.
The same situation can be reproduced using randomly filled files. Let’s imagine the extension saves its own statistics in pg_stat:

    dd if=/dev/urandom of=$PGDATA/pg_stat/junk bs=1M count=16

Now we’ll run the server and make an lz4-compressed backup:

    $PGINSTALL/bin/pg_basebackup --no-sync -cfast --target server:/tmp/server-backup -Xfetch --compress server-lz4

Now we’ll check the backup:

    $PGINSTALL/bin/pg_verifybackup -n /tmp/server-backup

The verification will fail. The exact error may vary, but the most common version looks like this:

    pg_verifybackup: error: checksum mismatch for file "pg_stat/junk" in archive "base.tar.lz4"
    pg_verifybackup: error: "base/4/3431" is present in the manifest but not on disk
    pg_verifybackup: error: "base/5/2620" is present in the manifest but not on disk
    pg_verifybackup: error: "base/1/2617" is present in the manifest but not on disk
    pg_verifybackup: error: "base/4/3576" is present in the manifest but not on disk
    ......

Using gzip or zstd compression in the same setup doesn’t lead to any problems. Moreover, the lz4 archive itself is fine: it can be unpacked successfully with the lz4 command-line tool. So the problem is on the reading side, and some investigation by me and my colleagues led us to astreamer_lz4.c.

It looks like astreamer_lz4_decompressor_content() initializes its output parameters incorrectly:

    next_out = (uint8 *) mystreamer->base.bbs_buffer.data;
    avail_out = mystreamer->base.bbs_buffer.maxlen;

while it should be:

    next_out = (uint8 *) mystreamer->base.bbs_buffer.data + mystreamer->bytes_written;
    avail_out = mystreamer->base.bbs_buffer.maxlen - mystreamer->bytes_written;

This would also be consistent with what the gzip version of the function, astreamer_gzip_decompressor_content(), does. In the vast majority of cases mystreamer->bytes_written is zero at the moment next_out/avail_out are initialized, so the problem does not appear.
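To make the effect easier to see, here is a minimal standalone sketch of the mistake. It is not the actual astreamer code: it assumes only that output left in the buffer at the end of one content call is kept there and forwarded later, once the buffer fills up. All names (ToyStreamer, toy_content, toy_flush, toy_run, BUF_LEN) are hypothetical, and a plain memcpy() stands in for the real lz4 decompression call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 8               /* toy stand-in for bbs_buffer.maxlen */

typedef struct ToyStreamer
{
    char   buf[BUF_LEN];        /* output buffer kept across calls */
    size_t bytes_written;       /* bytes in buf not yet forwarded */
    char   sink[64];            /* stands in for the next streamer */
    size_t sink_len;
} ToyStreamer;

/* Forward the first n bytes of the buffer downstream and reset it. */
static void
toy_flush(ToyStreamer *s, size_t n)
{
    assert(n <= BUF_LEN && s->sink_len + n <= sizeof(s->sink));
    memcpy(s->sink + s->sink_len, s->buf, n);
    s->sink_len += n;
    s->bytes_written = 0;
}

/* One "content" call; apply_fix selects the proposed initialization. */
static void
toy_content(ToyStreamer *s, const char *data, size_t len, int apply_fix)
{
    char   *next_out = s->buf;
    size_t  avail_out = BUF_LEN;

    if (apply_fix)
    {
        /* Resume after the bytes a previous call left in the buffer. */
        next_out += s->bytes_written;
        avail_out -= s->bytes_written;
    }

    while (len > 0)
    {
        size_t n = (len < avail_out) ? len : avail_out;

        memcpy(next_out, data, n);      /* pretend this is decompression */
        data += n;
        len -= n;
        s->bytes_written += n;

        if (s->bytes_written >= BUF_LEN)
        {
            toy_flush(s, BUF_LEN);
            next_out = s->buf;
            avail_out = BUF_LEN;
        }
        else
        {
            next_out += n;
            avail_out -= n;
        }
    }
}

/* Feed two chunks, finalize, and copy what reached the sink into out. */
static size_t
toy_run(int apply_fix, char *out, size_t out_size)
{
    ToyStreamer s;
    size_t      total;

    memset(&s, 0, sizeof(s));
    toy_content(&s, "ABCDE", 5, apply_fix);   /* leaves 5 bytes buffered */
    toy_content(&s, "FGHIJK", 6, apply_fix);  /* starts with bytes_written = 5 */
    toy_flush(&s, s.bytes_written);           /* finalize: forward the tail */

    total = s.sink_len;
    assert(total < out_size);
    memcpy(out, s.sink, total);
    out[total] = '\0';
    return total;
}
```

Calling toy_run(1, ...) forwards all 11 input bytes in order. Calling toy_run(0, ...) forwards only 8 bytes: the second call overwrites the 5 bytes still waiting in the buffer, so the data from the first call is lost. In this model, as in the description above, the bug can only show up when a content call begins with a non-zero bytes_written.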
In fact, we were totally unable to reproduce a non-zero initial value of mystreamer->bytes_written without poorly compressible custom-format files. I’m not deep enough into the compression algorithms to say exactly why that is, and I realize that very few users will actually hit this, but the problem is definitely there.

Attached is a simple patch implementing the above fix. The issue disappears with it.

What do you think?

--
best regards,
Mikhail A. Gribkov
v1-Fix-lz4-decompressor.patch
Description: Binary data