On Wed, Feb 21, 2018 at 10:55 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> Sure, but it looks like it has the exact same underlying cause to the
> LogicalTapeFreeze() issue. It shouldn't be very hard to write an
> equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
> this could happen there instead before the first patch went in, in
> fact. My mistake was to imagine that that could never happen during
> the regression tests.

Attached patch does this. I cannot recreate this issue locally, but
this should still fix it on skink.

> Should we even be doing a parallel external sort here? It's hardly
> part of an intentional effort to test the code. I had to push to get
> us to give external sorts test coverage at one point about 18 months
> ago, because of concerns about the overhead/duration of external
> sorts. Not that I feel strongly about this myself.

I suppose that we should commit this patch, even if we subsequently
suppress parallel CREATE INDEX in the multiple-row-versions isolation
test.

-- 
Peter Geoghegan
From 9e3ead04bf86a3e0fd9e16a7c823e2f7335337ff Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Wed, 21 Feb 2018 11:57:56 -0800
Subject: [PATCH] Avoid another Valgrind complaint about a logtape.c write().

LogicalTapeRewindForRead() may write out its first block when it is
dirty but not full, just like LogicalTapeFreeze().  To avoid Valgrind
complaints, tell it to treat the tail of logtape.c's buffer as defined
within LogicalTapeRewindForRead().  This is a mechanical duplication of
the code from commit 9fafa413ac602624e10f61ef44a20c17029d43d8.

Valgrind complained inconsistently when isolation tests are run because
the precise scheduling of worker processes affects whether or not
uninitialized/poisoned bytes can be flushed in the
LogicalTapeRewindForRead() path.  There has to be enough tuples for a
worker to need to spill and merge multiple runs, and yet not so many
that even a BLCKSZ buffer is filled.  This is just barely possible.

Per buildfarm member skink.  It's not completely clear that this will
fix the issue, since it's hard to reproduce, but let's see what the
buildfarm thinks of it.

Peter Geoghegan
---
 src/backend/utils/sort/logtape.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index d6794bf..05dde63 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -739,6 +739,18 @@ LogicalTapeRewindForRead(LogicalTapeSet *lts, int tapenum, size_t buffer_size)
 		 */
 		if (lt->dirty)
 		{
+			/*
+			 * As long as we've filled the buffer at least once, its contents
+			 * are entirely defined from valgrind's point of view, even though
+			 * contents beyond the current end point may be stale.  But it's
+			 * possible - at least in the case of a parallel sort - to sort
+			 * such small amount of data that we do not fill the buffer even
+			 * once.  Tell valgrind that its contents are defined, so it
+			 * doesn't bleat.
+			 */
+			VALGRIND_MAKE_MEM_DEFINED(lt->buffer + lt->nbytes,
+									  lt->buffer_size - lt->nbytes);
+
 			TapeBlockSetNBytes(lt->buffer, lt->nbytes);
 			ltsWriteBlock(lts, lt->curBlockNumber, (void *) lt->buffer);
 		}
-- 
2.7.4

Reply via email to