On Tue, Feb 27, 2018 at 05:15:29AM +0000, Tsunakawa, Takayuki wrote:
> That was my first thought, and I gave it up.  As you say,
> XLogReadRecord() could allocate up to 1 GB of memory for a garbage.
> That allocation can fail due to memory shortage, which prevents the
> recovery from proceeding.

Even with that, the resulting patch is sort of ugly...  So it seems to
me that the conclusion to this thread is that there is no clear winner,
and that the problem is so unlikely to happen that it is not worth the
performance impact to zero all the WAL pages fetched.  Still, the
attached has the advantage to not cause the startup process to fail
suddendly because of the allocation failure but to fail afterwards when
validating the record's contents, which has the disadvantage to allocate
temporarily up to 1GB of memory for some of the garbage, but that
allocation is short-lived.  So that would switch the failure from a
FATAL allocation failure taking down Postgres to something which will
fetching of new WAL records to be tried again.

All in all, that would be a win for the case reported on this thread.

Thoughts from anybody?
--
Michael
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 3a86f3497e..5bf50a2656 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -25,6 +25,10 @@
 #include "common/pg_lzcompress.h"
 #include "replication/origin.h"
 
+#ifndef FRONTEND
+#include "utils/memutils.h"
+#endif
+
 static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
 
 static bool ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr,
@@ -309,7 +313,14 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 	 * rest of the header after reading it from the next page.  The xl_tot_len
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
-	 * ValidXLogRecordHeader at all.
+	 * ValidXLogRecordHeader at all.  Note that in much unlucky circumstances,
+	 * the random data read from a recycled segment could allow the set of
+	 * sanity checks to pass.  If the garbage data read in xl_tot_len is
+	 * higher than MaxAllocSize on the backend, then the startup process
+	 * would retry to fetch fresh WAL data.  If this is lower than
+	 * MaxAllocSize, then a more-than-necessary memory allocation will
+	 * happen for a short amount of time, until the WAL reader fails at
+	 * validating the next record's data.
 	 */
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
@@ -321,7 +332,11 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 	else
 	{
 		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
+		if (total_len < SizeOfXLogRecord
+#ifndef FRONTEND
+			|| !AllocSizeIsValid(total_len)
+#endif
+		)
 		{
 			report_invalid_record(state,
 								  "invalid record length at %X/%X: wanted %u, got %u",

Attachment: signature.asc
Description: PGP signature

Reply via email to