This bug seems to have snuck in with the introduction of walmethods. AFAICT we are testing the result of sync() backwards, so whenever pg_receivewal finds an existing partial segment, opening it fails. It then unlinks the file, so the retry 5 seconds later works.
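To illustrate the inverted test, here is a minimal standalone sketch. It assumes, as the patch implies, that sync() returns 0 on success and nonzero on failure. fake_sync(), open_ok_before() and open_ok_after() are hypothetical stand-ins for illustration only, not functions from receivelog.c.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for stream->walmethod->sync().
 * Assumption: returns 0 on success, nonzero on failure.
 */
static int
fake_sync(bool fail)
{
	return fail ? -1 : 0;
}

/* The check as written before the patch: 0 (success) is treated as an error. */
static bool
open_ok_before(bool sync_fails)
{
	if (!fake_sync(sync_fails))
		return false;		/* taken when the sync actually succeeded */
	return true;
}

/* The check as patched: only a nonzero result is treated as an error. */
static bool
open_ok_after(bool sync_fails)
{
	if (fake_sync(sync_fails) != 0)
	{
		fprintf(stderr, "could not sync existing file\n");
		return false;
	}
	return true;
}
```

With the old test, a successful sync takes the error path and a failed sync is silently accepted; the patched test does the opposite.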
The existing code also doesn't log the failure. Oops.

Attached patch reverses the check, and adds a failure message. I'd appreciate a quick review in case I have the logic backwards in my head...

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index f415135..8511e57 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -132,8 +132,11 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
 			}
 
 			/* fsync file in case of a previous crash */
-			if (!stream->walmethod->sync(f))
+			if (stream->walmethod->sync(f) != 0)
 			{
+				fprintf(stderr,
+						_("%s: could not sync existing transaction log file \"%s\": %s\n"),
+						progname, fn, stream->walmethod->getlasterror());
 				stream->walmethod->close(f, CLOSE_UNLINK);
 				return false;
 			}
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers