Hello, we happened to see a server crash during archive recovery
under a certain condition.

After the TLI has been incremented, it can happen that the WAL
file for the older timeline has been archived, but the file with
the same segment id for the newer timeline has not. In that case
archive recovery fails with a PANIC error like the following:

| PANIC: record with zero length at 0/1820D40
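
To check whether a given installation is in this state, something
like the following could be used. This is only a rough sketch, not
part of the attached files: the function name newer_only_in_pg_xlog
is made up for illustration, and it assumes the usual 24-hex-digit
segment file names (8 digits of TLI followed by 16 digits of segment
id) and the two directories passed as arguments.

```shell
#!/bin/bash
# Hypothetical helper, not part of PostgreSQL or of the attached script.
# Usage: newer_only_in_pg_xlog ARCHDIR XLOGDIR
# Reports segments for which pg_xlog holds a file of a newer timeline
# while the archive only holds the same segment id on an older timeline.
newer_only_in_pg_xlog() {
    local archdir=$1 xlogdir=$2
    local f name tli seg a aname atli
    for f in "$xlogdir"/*; do
        name=${f##*/}
        # Skip history files, archive_status, and anything else that
        # is not a plain 24-hex-digit segment name.
        [[ $name =~ ^[0-9A-F]{24}$ ]] || continue
        tli=${name:0:8}
        seg=${name:8}
        # Already archived on this timeline: nothing to report.
        [ -f "$archdir/$name" ] && continue
        for a in "$archdir"/*"$seg"; do
            [ -f "$a" ] || continue
            aname=${a##*/}
            [[ $aname =~ ^[0-9A-F]{24}$ ]] || continue
            atli=${aname:0:8}
            # Compare the timeline IDs as hexadecimal numbers.
            if (( 16#$atli < 16#$tli )); then
                echo "$seg: archive has TLI $atli, pg_xlog has TLI $tli"
            fi
        done
    done
    return 0
}
```

With the directories of the replay script it would be called as
newer_only_in_pg_xlog "$ARCHDIR" "$PGDATA/pg_xlog" after stage 2.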

A replay script is attached. This issue occurred on 9.4dev and
9.3.2, but not on 9.2.6 or 9.1.11. The latter two search pg_xlog
for the current TLI before trying the archive for older TLIs.

This occurs while fetching the checkpoint redo record during
archive recovery.

> if (checkPoint.redo < RecPtr)
> {
>       /* back up to find the record */
>       record = ReadRecord(xlogreader, checkPoint.redo, PANIC, false);

The cause is that the segment file for the older timeline in the
archive directory is preferred over the one for the newer timeline
in pg_xlog.
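
The difference in lookup order can be illustrated with a toy model
in shell. This is a simplified sketch and not PostgreSQL code: the
function name find_segment and the mode names "archive_only" and
"any" are made up here, and the loop is only my reading of how the
timelines are walked (newest first).

```shell
#!/bin/bash
# Toy model of the segment lookup order; NOT the server's actual code.
# Usage: find_segment MODE ARCHDIR XLOGDIR SEGSUFFIX TLI...
#   MODE "archive_only": look only in the archive for every timeline.
#   MODE "any":          for each timeline, try the archive, then pg_xlog.
# Timelines must be given newest first.
find_segment() {
    local mode=$1 archdir=$2 xlogdir=$3 seg=$4
    shift 4
    local tli name
    for tli in "$@"; do
        name=$(printf '%08X%s' "$tli" "$seg")
        if [ -f "$archdir/$name" ]; then
            echo "archive:$name"
            return 0
        fi
        if [ "$mode" = any ] && [ -f "$xlogdir/$name" ]; then
            echo "pg_xlog:$name"
            return 0
        fi
    done
    return 1
}

# Demo: TLI 1 segment is archived, TLI 2 segment exists only in pg_xlog.
demo=$(mktemp -d)
mkdir "$demo/arc" "$demo/pg_xlog"
touch "$demo/arc/000000010000000000000001"
touch "$demo/pg_xlog/000000020000000000000001"

find_segment archive_only "$demo/arc" "$demo/pg_xlog" 0000000000000001 2 1
# prints archive:000000010000000000000001  (older timeline wins)
find_segment any "$demo/arc" "$demo/pg_xlog" 0000000000000001 2 1
# prints pg_xlog:000000020000000000000001  (newer timeline is found first)
rm -rf "$demo"
```

In the first call the file of the older timeline is picked up from
the archive although a newer one exists in pg_xlog, which corresponds
to the failing case described above.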

Looking into pg_xlog before trying the older TLIs in the archive,
as 9.2 and earlier do, fixes this issue. The attached patch is one
possible solution for 9.4dev.

The attached files are:

 - recvtest.sh: Replay script. Stages 1 and 2 set up the condition
   and stage 3 triggers the issue.

 - archrecvfix_20131212.patch: The patch that fixes the issue. With
   it, archive recovery reads pg_xlog before trying older TLIs in
   the archive, similarly to 9.2 and earlier.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
#!/bin/bash

ROOT=`pwd`
PGDATA=$ROOT/test/data
ARCHDIR=$ROOT/test/arc

if [ ! -d $ARCHDIR -o ! -d $PGDATA ]; then
    echo "$PGDATA and/or $ARCHDIR not found"
    exit 1
fi

echo "### EMPTY ARCHIVE DIRECTORY ###"
if [ -d $ARCHDIR ]; then rm -f $ARCHDIR/*; fi

echo "### EMPTY PGDATA DIRECTORY ###"
if [ -d $PGDATA ]; then rm -rf $PGDATA/*; fi

echo "### DO INITDB ###"
initdb -D $PGDATA > /dev/null

echo "### set up postgresql.conf ###"
cat >> $PGDATA/postgresql.conf <<EOF
wal_level = archive
archive_mode = on
archive_command = '/bin/cp %p $ARCHDIR/%f'
# log_min_messages = debug5
EOF

echo "### STAGE 1/3 -- PUT XLOG ..001...001 AND ..002.HISTORY INTO ARCHIVE ###"
echo "### STAGE 1/3: 1/2 START SERVER ###"
pg_ctl start -D $PGDATA -w

echo "### STAGE 1/3: 2/2 STOP SERVER ###"
pg_ctl stop -D $PGDATA

echo "### STAGE 2/3 -- PUT XLOG ..002...001 INTO ONLY pg_xlog ###"
echo "### STAGE 2/3: 1/3 PREPARE recovery.conf ###"
cat > $PGDATA/recovery.conf <<EOF
restore_command = '/bin/cp $ARCHDIR/%f %p'
EOF

echo "### STAGE 2/3: 2/3 START SERVER IN ARCHIVE RECOVERY MODE ###"
pg_ctl start -D $PGDATA -w

echo "### STAGE 2/3: 3/3 STOP SERVER IMMEDIATELY ###"
pg_ctl stop -m i -D $PGDATA

echo "### ls $ARCHDIR"
ls $ARCHDIR

echo "### ls $PGDATA/pg_xlog"
ls $PGDATA/pg_xlog


echo "### STAGE 3/3 - START SERVER IN ARCHIVE RECOVERY MODE AGAIN ###"
echo "### STAGE 3/3: 1/2 RESTORE recovery.conf ###"
mv $PGDATA/recovery.done $PGDATA/recovery.conf

echo "### STAGE 3/3: 2/2 START SERVER IN ARCHIVE RECOVERY MODE 2ND RUN ###"
pg_ctl start -D $PGDATA -w -t 2
if [ $? -ne 0 ]; then
  echo  "### SERVER CRASHED ###"
  exit 1
fi
echo "### SERVER SEEMS SUCCESSFULLY UP. STOP IT. ###"
pg_ctl stop -D $PGDATA -w
echo  "### SERVER DID NOT CRASH ###"
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6fa5479..75be478 100755
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10935,10 +10935,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					curFileTLI = 0;
 
 				/*
-				 * Try to restore the file from archive, or read an existing
-				 * file from pg_xlog.
+				 * When XLOG_FROM_ARCHIVE, read xlog file with largest TLI
+				 * preferring archive to pg_xlog. Or when XLOG_FROM_PG_XLOG,
+				 * search only pg_xlog.
 				 */
-				readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, currentSource);
+				readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2,
+									currentSource == XLOG_FROM_ARCHIVE ?
+									XLOG_FROM_ANY : currentSource);
 				if (readFile >= 0)
 					return true;	/* success! */
 