Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

Aleksander Alekseev Fri, 19 Aug 2016 02:53:05 -0700

Heikki, Peter, thanks a lot for code review!

> What's going on here? Surely pg_atomic_init_u64() should initialize
> the value?


It's because of how pg_atomic_exchange_u64_impl is implemented:

```
while (true)
{   
    old = ptr->value; /* <-- reading of uninitialized value! */
    if (pg_atomic_compare_exchange_u64_impl(ptr, &old, xchg_))
        break;
}
```

Currently pg_atomic_init_u64 works like this:

pg_atomic_init_u64
`- pg_atomic_init_u64_impl
   `- pg_atomic_write_u64_impl
      `- pg_atomic_exchange_u64_impl

I suspect there is actually no need to make an atomic exchange during
initialization of an atomic variable. Regular `mov` should be enough
(IIRC there is no need to do `lock mov` since `mov` is already atomic).
Anyway I don't feel brave enough right now to mess with atomic
operations since it involves all sort of portability issues. So I
removed this change for now.

> Why does MemorySanitizer complain about that? Calling stat(2) is 
> supposed to fill in all the fields we look at, right?

> In addition to what Heikki wrote, I think the above is not necessary.

It's been my observation that MemorySanitizer often complains on passing
structures with uninitialized padding bytes to system calls. I'm not
sure whether libc really reads/copies structures in these cases or
MemorySanitizer doesn't like the idea in general.

Although it's true that maybe MemorySanitizer it too pedantic in these
cases, in some respect it's right. Since operating system is a black box
from our perspective (not mentioning that there are _many_ OS'es that
change all the time) in general case passing a structure with
uninitialized padding bytes to a system call can in theory cause a
non-repeatable behavior. For instance if there are caching and hash
calculation involved.

Also, as Chapman noted previously [1], according to PostgreSQL
documentation using structures with uninitialized padding bytes should
be avoided anyway.

I strongly believe that benefits of passing all MemorySanitizer checks
(possibility of discovering many complicated bugs automatically in this
case) many times outweighs drawbacks of tiny memset's overhead in these
concrete cases.

> I think this goes too far. You're zeroing all palloc'd memory, even
> if  it's going to be passed to palloc0(), and zeroed there. It might
> even  silence legitimate warnings, if there's code somewhere that
> does  palloc(), and accesses some of it before initializing. Plus
> it's a performance hit.

Completely agree, my bad. I removed these changes.

> Right after this, we have:
>     VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));  
> Do we need both?

Apparently we don't. I removed VALGRIND_MAKE_MEM_DEFINED here since now
there are no uninitialized padding bytes in &msg.


Corrected patch is attached. If you have any other ideas how it could
be improved please let me know!

[1]
https://www.postgresql.org/message-id/56EFF347.20500%40anastigmatix.net

-- 
Best regards,
Aleksander Alekseev

diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index 01c7ef7..fb5c026 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -337,6 +337,7 @@ gistXLogSplit(bool page_is_leaf,
 	for (ptr = dist; ptr; ptr = ptr->next)
 		npage++;
 
+	memset(&xlrec, 0, sizeof(xlrec));
 	xlrec.origrlink = origrlink;
 	xlrec.orignsn = orignsn;
 	xlrec.origleaf = page_is_leaf;
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index f090ca5..1275b6c 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -204,6 +204,7 @@ addLeafTuple(Relation index, SpGistState *state, SpGistLeafTuple leafTuple,
 {
 	spgxlogAddLeaf xlrec;
 
+	memset(&xlrec, 0, sizeof(xlrec));
 	xlrec.newPage = isNew;
 	xlrec.storesNulls = isNulls;
 
@@ -448,6 +449,8 @@ moveLeafs(Relation index, SpGistState *state,
 		i = it->nextOffset;
 	}
 
+	memset(&xlrec, 0, sizeof(xlrec));
+
 	/* Find a leaf page that will hold them */
 	nbuf = SpGistGetBuffer(index, GBUF_LEAF | (isNulls ? GBUF_NULLS : 0),
 						   size, &xlrec.newPage);
@@ -723,6 +726,7 @@ doPickSplit(Relation index, SpGistState *state,
 	newLeafs = (SpGistLeafTuple *) palloc(sizeof(SpGistLeafTuple) * n);
 	leafPageSelect = (uint8 *) palloc(sizeof(uint8) * n);
 
+	memset(&xlrec, 0, sizeof(xlrec));
 	STORE_STATE(state, xlrec.stateSrc);
 
 	/*
@@ -1501,6 +1505,7 @@ spgAddNodeAction(Relation index, SpGistState *state,
 	newInnerTuple = addNode(state, innerTuple, nodeLabel, nodeN);
 
 	/* Prepare WAL record */
+	memset(&xlrec, 0, sizeof(xlrec));
 	STORE_STATE(state, xlrec.stateSrc);
 	xlrec.offnum = current->offnum;
 
@@ -1741,6 +1746,7 @@ spgSplitNodeAction(Relation index, SpGistState *state,
 	postfixTuple->allTheSame = innerTuple->allTheSame;
 
 	/* prep data for WAL record */
+	memset(&xlrec, 0, sizeof(xlrec));
 	xlrec.newPage = false;
 
 	/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f13f9c1..360e8d2 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4799,15 +4799,14 @@ BootStrapXLOG(void)
 	 * segment with logid=0 logseg=1. The very first WAL segment, 0/0, is not
 	 * used, so that we can use 0/0 to mean "before any valid WAL segment".
 	 */
+	memset(&checkPoint, 0, sizeof(checkPoint));
 	checkPoint.redo = XLogSegSize + SizeOfXLogLongPHD;
 	checkPoint.ThisTimeLineID = ThisTimeLineID;
 	checkPoint.PrevTimeLineID = ThisTimeLineID;
 	checkPoint.fullPageWrites = fullPageWrites;
-	checkPoint.nextXidEpoch = 0;
 	checkPoint.nextXid = FirstNormalTransactionId;
 	checkPoint.nextOid = FirstBootstrapObjectId;
 	checkPoint.nextMulti = FirstMultiXactId;
-	checkPoint.nextMultiOffset = 0;
 	checkPoint.oldestXid = FirstNormalTransactionId;
 	checkPoint.oldestXidDB = TemplateDbOid;
 	checkPoint.oldestMulti = FirstMultiXactId;
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 7902d43..9aeb3f0 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -669,6 +669,11 @@ destroy_tablespace_directories(Oid tablespaceoid, bool redo)
 	char	   *subfile;
 	struct stat st;
 
+	/*
+	 * Prevents MemorySanitizer's "use-of-uninitialized-value" warning.
+	 */
+	memset(&st, 0, sizeof(st));
+
 	linkloc_with_version_dir = psprintf("pg_tblspc/%u/%s", tablespaceoid,
 										TABLESPACE_VERSION_DIRECTORY);
 
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 1b4bbce..02121ff 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -1008,14 +1008,9 @@ parse_hba_line(List *line, int line_num, char *raw_line)
 				*cidr_slash = '\0';
 
 			/* Get the IP address either way */
+			memset(&hints, 0, sizeof(hints));
 			hints.ai_flags = AI_NUMERICHOST;
 			hints.ai_family = AF_UNSPEC;
-			hints.ai_socktype = 0;
-			hints.ai_protocol = 0;
-			hints.ai_addrlen = 0;
-			hints.ai_canonname = NULL;
-			hints.ai_addr = NULL;
-			hints.ai_next = NULL;
 
 			ret = pg_getaddrinfo_all(str, NULL, &hints, &gai_result);
 			if (ret == 0 && gai_result)
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8fa9edb..b82ab92 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -337,14 +337,10 @@ pgstat_init(void)
 	/*
 	 * Create the UDP socket for sending and receiving statistic messages
 	 */
+	memset(&hints, 0, sizeof(hints));
 	hints.ai_flags = AI_PASSIVE;
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;
-	hints.ai_protocol = 0;
-	hints.ai_addrlen = 0;
-	hints.ai_addr = NULL;
-	hints.ai_canonname = NULL;
-	hints.ai_next = NULL;
 	ret = pg_getaddrinfo_all("localhost", NULL, &hints, &addrs);
 	if (ret || !addrs)
 	{
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 5803518..a06e5b7 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -330,21 +330,11 @@ AddCatcacheInvalidationMessage(InvalidationListHeader *hdr,
 	SharedInvalidationMessage msg;
 
 	Assert(id < CHAR_MAX);
+	memset(&msg, 0, sizeof(msg));
 	msg.cc.id = (int8) id;
 	msg.cc.dbId = dbId;
 	msg.cc.hashValue = hashValue;
 
-	/*
-	 * Define padding bytes in SharedInvalidationMessage structs to be
-	 * defined. Otherwise the sinvaladt.c ringbuffer, which is accessed by
-	 * multiple processes, will cause spurious valgrind warnings about
-	 * undefined memory being used. That's because valgrind remembers the
-	 * undefined bytes from the last local process's store, not realizing that
-	 * another process has written since, filling the previously uninitialized
-	 * bytes
-	 */
-	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
-
 	AddInvalidationMessage(&hdr->cclist, &msg);
 }
 
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index a978bbc..8890926 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1363,14 +1363,9 @@ setup_config(void)
 #endif
 
 		/* for best results, this code should match parse_hba() */
+		memset(&hints, 0, sizeof(hints));
 		hints.ai_flags = AI_NUMERICHOST;
 		hints.ai_family = AF_UNSPEC;
-		hints.ai_socktype = 0;
-		hints.ai_protocol = 0;
-		hints.ai_addrlen = 0;
-		hints.ai_canonname = NULL;
-		hints.ai_addr = NULL;
-		hints.ai_next = NULL;
 
 		if (err != 0 ||
 			getaddrinfo("::1", NULL, &hints, &gai_result) != 0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

Reply via email to