On Sat, Mar 21, 2026 at 9:19 AM Tom Lane <[email protected]> wrote:
>
> Andrew Dunstan <[email protected]> writes:
> > Thanks, committed with very minor tweaks.
>
> Buildfarm members batta and hachi don't like this very much.
> They fail the pg_verifybackup tests like so:
>
> # Running: pg_verifybackup --exit-on-error /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup
> pg_waldump: error: could not find WAL in archive "base.tar.zst"
> pg_verifybackup: error: WAL parsing failed for timeline 1
>
> Only the zstd-compression case fails.  I've spent several hours trying
> to reproduce this, without any luck, although I can get a similar
> failure in only the gzip case if I build with --with-wal-blocksize=64.
> I do not have an explanation for the seeming cross-platform
> difference.  However after adding a lot of debug tracing, I believe
> I see the bug, or at least a related bug.  This bit in
> archive_waldump.c's init_archive_reader is where the error comes from:
>
>     /*
>      * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
>      * the first WAL segment in the archive so we can extract the WAL segment
>      * size from the long page header.
>      */
>     while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
>     {
>         if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
>             pg_fatal("could not find WAL in archive \"%s\"",
>                      privateInfo->archive_name);
>
>         entry = privateInfo->cur_file;
>     }
>
> That looks plausible but is in fact utterly broken when there's not a
> lot of WAL data in the archive, as there is not in this test case.
> There are at least two problems:
>
Thanks for the detailed debugging.  I noticed the failure this morning
and had started investigating, but in the meantime your helpful reply
arrived, which saved me a good deal of time and effort.

> 1. read_archive_file reads some data from the source WAL archive and
> shoves it into the astreamer decompression pipeline.  However, once it
> runs out of source data, it just returns zero and we fail immediately.
> This does not account for the possibility --- nay, certainty --- that
> there is data queued inside the decompression pipeline.  So this
> doesn't work if the data we need has been compressed into less than
> XLOG_BLCKSZ worth of compressed data.  (I suppose that the seeming
> cross-platform differences have to do with the effectiveness of the
> compression algorithm, but I don't really understand why it'd not be
> the same everywhere.)  We need to do astreamer_finalize once we run
> out of source data.  I think the cleanest place to handle that would
> be inside read_archive_file, but its return convention will need some
> rework if we want to put it there (because rc == 0 shouldn't cause an
> immediate failure if we were able to finalize some more data).  As an
> ugly experiment I put an astreamer_finalize call into the rc == 0 path
> of the above loop, but it still didn't work, because:
>
> 2. If the decompression pipeline reaches the end of the WAL file that
> we want, the ASTREAMER_MEMBER_TRAILER case in
> astreamer_waldump_content instantly resets privateInfo->cur_file to
> NULL.  Then the loop in init_archive_reader cannot exit successfully,
> and it will just read till the end of the archive and fail.
>
> I see that of the three callers of read_archive_file, only
> get_archive_wal_entry is aware of this possibility; but
> init_archive_reader certainly needs to deal with it and I bet
> read_archive_wal_page does too.
> Moreover, get_archive_wal_entry's
> solution looks to me like a fragile kluge that probably doesn't work
> reliably either, the reason being that privateInfo->cur_file can
> change multiple times during a single call to read_archive_file,
> if the WAL data has been compressed sufficiently.  That whole API
> seems to need some rethinking, not to mention better documentation
> than the zero it has now.

I agree that init_archive_reader needs that handling, but I don't think
read_archive_wal_page needs any fix: it only deals with the current
entry and already holds a reference to it, so it never needs to fetch
the entry from the hash table again.  init_archive_reader, on the other
hand, has to scan the hash table, because unlike get_archive_wal_entry
it does not yet know the specific WAL file name it is looking for.
Please have a look at the attached patch, which tries to fix that.

> While I'm bitching: this error message "could not find WAL in archive
> \"%s\"" seems to me to be completely misleading and off-point.

I tried to improve that in the attached version.

regards,
Amul
From bde3fb4e3125eed740b5d949a990b4e06d01499a Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 11:22:50 +0530
Subject: [PATCH] pg_waldump: Handle archive exhaustion in init_archive_reader().

When read_archive_file() returns 0, the archive may have already
buffered a complete WAL file into the hash table before exhausting the
input.  Instead of immediately reporting an error, search the hash
table for an entry containing at least SizeOfXLogLongPHD bytes.  Report
a specific error if a WAL entry exists but is too short
(truncated/corrupt), or a generic error if no WAL was found at all.

Also tighten the loop condition to check for SizeOfXLogLongPHD rather
than XLOG_BLCKSZ, since only the long page header is needed at this
stage.  (Note that sizeof(XLogLongPageHeader) would be wrong here:
XLogLongPageHeader is a pointer typedef.)
---
 src/bin/pg_waldump/archive_waldump.c | 57 ++++++++++++++++++++++++----
 1 file changed, 52 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..5bd1faf3d95 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -176,13 +176,60 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
 	 * the first WAL segment in the archive so we can extract the WAL segment
 	 * size from the long page header.
 	 */
-	while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+	while (entry == NULL || entry->read_len < SizeOfXLogLongPHD)
 	{
 		if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
-			pg_fatal("could not find WAL in archive \"%s\"",
-					 privateInfo->archive_name);
-
-		entry = privateInfo->cur_file;
+		{
+			ArchivedWAL_iterator iter;
+			ArchivedWALFile *e;
+			ArchivedWALFile *short_entry = NULL;
+
+			entry = NULL;
+
+			/*
+			 * read_archive_file() returned 0, meaning the archive is
+			 * exhausted.  However, a sufficiently compressed archive may have
+			 * already read a complete WAL file and inserted it into the hash
+			 * table before returning.  Search the hash table for any entry
+			 * that already has enough buffered data to contain the long page
+			 * header; if none is found, the archive contains no usable WAL.
+			 */
+			ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+			while ((e = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+											&iter)) != NULL)
+			{
+				if (e->read_len >= SizeOfXLogLongPHD)
+				{
+					entry = e;
+					break;
+				}
+
+				/* remember a too-short entry for the error report below */
+				short_entry = e;
+			}
+
+			if (entry == NULL)
+			{
+				/*
+				 * A WAL file was found in the hash table but it does not
+				 * contain enough data to read the long page header,
+				 * indicating a truncated or corrupt WAL segment.
+				 */
+				if (short_entry != NULL)
+					pg_fatal("could not read file \"%s\" from archive \"%s\": read %d of %d",
+							 short_entry->fname, privateInfo->archive_name,
+							 short_entry->read_len, (int) SizeOfXLogLongPHD);
+
+				/*
+				 * The hash table contains no WAL entries at all, meaning the
+				 * archive holds no WAL data.
+				 */
+				pg_fatal("could not find WAL in archive \"%s\"",
+						 privateInfo->archive_name);
+			}
+		}
+		else
+			entry = privateInfo->cur_file;
 	}
 
 	/* Extract the WAL segment size from the long page header */
-- 
2.47.1
