Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-08 Thread Andres Freund
On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
  After far, far too much confused head scratching, code reading, random
  elog()s et al I found out that this is just because of a deficiency in
  valgrind's undefinedness tracking. [...]
  Unfortunately this cannot precisely be caught by valgrind's
  suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg,
  0) in AddCatcacheInvalidationMessage() et al. to suppress these
  warnings. Imo we can just add them unconditionally, but if somebody else
  prefers we can add #ifdef USE_VALGRIND around them.
 
 I'd be okay with USE_VALGRIND.  I'm not particularly hot on adding a
 memset for everybody just to make valgrind less confused.  Especially
 since that's really going to hide any problems, not fix them.

The attached patch does VALGRIND_MAKE_MEM_DEFINED() on the relevant
structs. That's already defined to empty if USE_VALGRIND isn't defined.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 4da7b0caa2058c8b07ee4a7ab529b9ca0c292bcf Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 7 May 2014 22:30:05 +0200
Subject: [PATCH] Silence a spurious valgrind warning in inval.c.

For valgrind's sake explicitly define all bytes of
SharedInvalidationMessage as defined before sending them. Otherwise
valgrind sometimes treats bytes as undefined that aren't because
valgrind doesn't understand that the ringbuffer in sinvaladt.c is
accesses by multiple processes. Valgrind remembers that it stored a
undefined byte into parts of a slot in the ringbuffer and warns when
it uses that byte again - not realizing it has been stored by another
process.
---
 src/backend/utils/cache/inval.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 5971469..dd46e18 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -103,6 +103,7 @@
 #include storage/smgr.h
 #include utils/catcache.h
 #include utils/inval.h
+#include utils/memdebug.h
 #include utils/memutils.h
 #include utils/rel.h
 #include utils/relmapper.h
@@ -332,6 +333,14 @@ AddCatcacheInvalidationMessage(InvalidationListHeader *hdr,
 	msg.cc.id = (int8) id;
 	msg.cc.dbId = dbId;
 	msg.cc.hashValue = hashValue;
+	/*
+	 * Define padding bytes to be defined, otherwise the sinvaladt.c
+	 * ringbuffer, which is accessed by multiple processes, will cause false
+	 * positives because valgrind remembers the undefined bytes from this
+	 * processes store, not realizing that another process has written since.
+	 */
+	VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg));
+
 	AddInvalidationMessage(hdr-cclist, msg);
 }
 
@@ -347,6 +356,9 @@ AddCatalogInvalidationMessage(InvalidationListHeader *hdr,
 	msg.cat.id = SHAREDINVALCATALOG_ID;
 	msg.cat.dbId = dbId;
 	msg.cat.catId = catId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg));
+
 	AddInvalidationMessage(hdr-cclist, msg);
 }
 
@@ -370,6 +382,9 @@ AddRelcacheInvalidationMessage(InvalidationListHeader *hdr,
 	msg.rc.id = SHAREDINVALRELCACHE_ID;
 	msg.rc.dbId = dbId;
 	msg.rc.relId = relId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg));
+
 	AddInvalidationMessage(hdr-rclist, msg);
 }
 
@@ -393,6 +408,9 @@ AddSnapshotInvalidationMessage(InvalidationListHeader *hdr,
 	msg.sn.id = SHAREDINVALSNAPSHOT_ID;
 	msg.sn.dbId = dbId;
 	msg.sn.relId = relId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg));
+
 	AddInvalidationMessage(hdr-rclist, msg);
 }
 
@@ -1242,6 +1260,9 @@ CacheInvalidateSmgr(RelFileNodeBackend rnode)
 	msg.sm.backend_hi = rnode.backend  16;
 	msg.sm.backend_lo = rnode.backend  0x;
 	msg.sm.rnode = rnode.node;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg));
+
 	SendSharedInvalidMessages(msg, 1);
 }
 
@@ -1267,6 +1288,9 @@ CacheInvalidateRelmap(Oid databaseId)
 
 	msg.rm.id = SHAREDINVALRELMAP_ID;
 	msg.rm.dbId = databaseId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg));
+
 	SendSharedInvalidMessages(msg, 1);
 }
 
-- 
1.8.5.rc2.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-08 Thread Andres Freund
On 2014-05-05 18:50:59 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
  Looks all right to me.  Yeah, the right shift might have undefined
  high-order bits, but we don't care because we're storing the result
  into an int16.
 
  Doesn't at the very least
  rnode.backend = (msg-sm.backend_hi  16) | (int) 
  msg-sm.backend_lo;
  have to be
  rnode.backend = ((int) msg-sm.backend_hi  16) | (int) 
  msg-sm.backend_lo;
 
 A promotion to int (or wider) is implicit in any arithmetic operation,
 last I checked the C standard.  The (int) on the other side isn't
 necessary either.

Done in the attached patch.

 We might actually be better off casting both sides to unsigned int,
 just to enforce that the left shifting is done with unsigned semantics.
 
  c) The ProcessMessageList() calls access msg-rc.id to test for the type
  of the existing message. That's not nice.
 
  Huh?
 
  The checks should access msg-id, not msg-rc.id...
 
 Meh.  If they're not the same thing, we've got big trouble anyway.
 But if you want to change it as a cosmetic thing, no objection.

Changed as well.

On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  a) SICleanupQueue() sometimes releases and reacquires the write lock
 held on the outside. That's pretty damn fragile, not to mention
 ugly. Even slight reformulations of the code in SIInsertDataEntries()
 can break this... Can we please extend the comment in the latter that
 to mention the lock dropping explicitly?
 
 Want to write a better comment?

Edited the comment and, in a perhaps debatable fashion, moved some
variable declarations + volatile into the retry loop for some added
cozyness. If the compiler inlines the cleanup function it could very
well decide to fetch some of the variables only once.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From a24973508710c965035fca45e933823ce49a5a4f Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 8 May 2014 16:13:44 +0200
Subject: [PATCH] Minor cleanups and comment improvements in the cache
 invalidation code.

---
 src/backend/storage/ipc/sinvaladt.c | 11 ---
 src/backend/utils/cache/inval.c |  9 +
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 0328660..3966ccc 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -453,7 +453,6 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
 	while (n  0)
 	{
 		int			nthistime = Min(n, WRITE_QUANTUM);
-		int			numMsgs;
 		int			max;
 		int			i;
 
@@ -466,11 +465,17 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
 		 * queue and reset anyone who is preventing space from being freed.
 		 * Otherwise, clean the queue only when it's exceeded the next
 		 * fullness threshold.  We have to loop and recheck the buffer state
-		 * after any call of SICleanupQueue.
+		 * after any call of SICleanupQueue because the cleanup sometimes
+		 * require releasing the write lock acquired above.
 		 */
 		for (;;)
 		{
-			numMsgs = segP-maxMsgNum - segP-minMsgNum;
+			int numMsgs;
+			/* use volatile pointer to prevent code rearrangement */
+			volatile SISeg *vsegP = segP;
+
+			numMsgs = vsegP-maxMsgNum - vsegP-minMsgNum;
+
 			if (numMsgs + nthistime  MAXNUMMESSAGES ||
 numMsgs = segP-nextThreshold)
 SICleanupQueue(true, nthistime);
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index dd46e18..4c3197b 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -374,7 +374,7 @@ AddRelcacheInvalidationMessage(InvalidationListHeader *hdr,
 	/* Don't add a duplicate item */
 	/* We assume dbId need not be checked because it will never change */
 	ProcessMessageList(hdr-rclist,
-	   if (msg-rc.id == SHAREDINVALRELCACHE_ID 
+	   if (msg-id == SHAREDINVALRELCACHE_ID 
 		   msg-rc.relId == relId)
 	   return);
 
@@ -400,7 +400,7 @@ AddSnapshotInvalidationMessage(InvalidationListHeader *hdr,
 	/* Don't add a duplicate item */
 	/* We assume dbId need not be checked because it will never change */
 	ProcessMessageList(hdr-rclist,
-	   if (msg-sn.id == SHAREDINVALSNAPSHOT_ID 
+	   if (msg-id == SHAREDINVALSNAPSHOT_ID 
 		   msg-sn.relId == relId)
 	   return);
 
@@ -580,7 +580,8 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
 		RelFileNodeBackend rnode;
 
 		rnode.node = msg-sm.rnode;
-		rnode.backend = (msg-sm.backend_hi  16) | (int) msg-sm.backend_lo;
+		rnode.backend = (BackendId) (((uint32) msg-sm.backend_hi  16)
+	 |(uint32) msg-sm.backend_lo);
 		smgrclosenode(rnode);
 	}
 	else if (msg-id == SHAREDINVALRELMAP_ID)
@@ -1257,7 

[HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-05 Thread Andres Freund
Hi,

While investigating an issue pointed out by valgrind around undefined
bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a
bug in ReceiveSharedInvalidMessages(). It tries to be safe against
recursion but it's not:
When it recurses into ReceiveSharedInvalidMessages() from it's main loop
from inside the inval callback while nextmsg = nummsgs it'll overwrite
the 'messages' array with new contents. But at this point the old
content of one entry in the messages array is still passed to
the LocalExecuteInvalidationMessage() that caused the recursion.

It looks to me like SHAREDINVALRELCACHE_ID messages are the only ones
actually affected by this in reality because all the other ones either
operate on stack memory or don't recurse. What could happen there is
that a different relcache entry is invalidated than the relcache
invalidation callback is called for.
This actually could explain the relfilenodemap error we've seen a couple
weeks back.

It looks to me like this is broken since at least fad153ec. I think the
fix is just to make the current 'SharedInvalidationMessage *msg' not be
pointers but a local copiy of the to-be-processed entry.

Comments?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-05 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 While investigating an issue pointed out by valgrind around undefined
 bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a
 bug in ReceiveSharedInvalidMessages(). It tries to be safe against
 recursion but it's not:
 When it recurses into ReceiveSharedInvalidMessages() from it's main loop
 from inside the inval callback while nextmsg = nummsgs it'll overwrite
 the 'messages' array with new contents. But at this point the old
 content of one entry in the messages array is still passed to
 the LocalExecuteInvalidationMessage() that caused the recursion.

Hm, yeah, so if the called inval function continues to use the message
contents after doing something that could result in a recursive call,
it might be looking at trashed data.

 It looks to me like this is broken since at least fad153ec. I think the
 fix is just to make the current 'SharedInvalidationMessage *msg' not be
 pointers but a local copiy of the to-be-processed entry.

Yeah, that should do it.  I think I'd been trying to avoid copying
messages more times than necessary, but evidently I optimized away
one copy step too many :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-05 Thread Andres Freund
On 2014-05-05 14:15:58 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  While investigating an issue pointed out by valgrind around undefined
  bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a
  bug in ReceiveSharedInvalidMessages(). It tries to be safe against
  recursion but it's not:
  When it recurses into ReceiveSharedInvalidMessages() from it's main loop
  from inside the inval callback while nextmsg = nummsgs it'll overwrite
  the 'messages' array with new contents. But at this point the old
  content of one entry in the messages array is still passed to
  the LocalExecuteInvalidationMessage() that caused the recursion.
 
 Hm, yeah, so if the called inval function continues to use the message
 contents after doing something that could result in a recursive call,
 it might be looking at trashed data.

FWIW, with a bit of changes around it to make it more visible
(malloc/freeing the array) this sometimes triggers during make check. So
it's quite plausible that this is the caused the relfilenodemap
regression error we were a bit confused about.

I started to look at this code because valgrind was bleating at me
inside inval.c and for the heck of I couldn't understand wh. Since then
this lead me into quite a wild goose chase. Leading me to discover a
couple of things I am not perfectly happy about:

a) SICleanupQueue() sometimes releases and reacquires the write lock
   held on the outside. That's pretty damn fragile, not to mention
   ugly. Even slight reformulations of the code in SIInsertDataEntries()
   can break this... Can we please extend the comment in the latter that
   to mention the lock dropping explicitly?
b) we right/left shift -1 in a signed int by 16 in
   CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's
   implementation defined behaviour.
c) The ProcessMessageList() calls access msg-rc.id to test for the type
   of the existing message. That's not nice. The compiler is entirely
   free to e.g. copy the entire struct to local memory causing
   unitialized memory to be accessed. Entirely cosmetic, but ...

d)
The valgrind splats I was very occasionally getting were:
==21013== Conditional jump or move depends on uninitialised value(s)
==21013==at 0x8DD755: hash_search_with_hash_value (dynahash.c:885)
==21013==by 0x8DD60F: hash_search (dynahash.c:811)
==21013==by 0x799A58: smgrclosenode (smgr.c:357)
==21013==by 0x8B6ABF: LocalExecuteInvalidationMessage (inval.c:566)
==21013==by 0x77BBCF: ReceiveSharedInvalidMessages (sinval.c:127)
==21013==by 0x8B6C3F: AcceptInvalidationMessages (inval.c:640)
==21013==by 0x77FDCC: LockRelationOid (lmgr.c:107)
==21013==by 0x4A6F33: relation_open (heapam.c:1029)
==21013==by 0x4A71EB: heap_open (heapam.c:1195)
==21013==by 0x8B4228: SearchCatCache (catcache.c:1223)
==21013==by 0x8C61B1: SearchSysCache (syscache.c:909)
==21013==by 0x8C62CD: GetSysCacheOid (syscache.c:987)
==21013==  Uninitialised value was created by a stack allocation
==21013==at 0x8B64C3: AddCatcacheInvalidationMessage (inval.c:328)

After far, far too much confused head scratching, code reading, random
elog()s et al I found out that this is just because of a deficiency in
valgrind's undefinedness tracking. The problem is that it doesn't really
understand shared memory sufficiently. When a backend stored a message
in the SI ringbuffer it sets a bitmask about initialized memory for a
specific position in the ringubffer. If the *same* backend reads from
the same position in the ringbuffer (4096 messages later) into which
another backend has stored a *different* type of message the bitmap of
initialized bytes will possibly be wrong if the padding is distributed
differently.

Unfortunately this cannot precisely be caught by valgrind's
suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg,
0) in AddCatcacheInvalidationMessage() et al. to suppress these
warnings. Imo we can just add them unconditionally, but if somebody else
prefers we can add #ifdef USE_VALGRIND around them.

I can do a), c), d), if others agree but I am not sure about b)?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-05 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 a) SICleanupQueue() sometimes releases and reacquires the write lock
held on the outside. That's pretty damn fragile, not to mention
ugly. Even slight reformulations of the code in SIInsertDataEntries()
can break this... Can we please extend the comment in the latter that
to mention the lock dropping explicitly?

Want to write a better comment?

 b) we right/left shift -1 in a signed int by 16 in
CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's
implementation defined behaviour.

Looks all right to me.  Yeah, the right shift might have undefined
high-order bits, but we don't care because we're storing the result
into an int16.

 c) The ProcessMessageList() calls access msg-rc.id to test for the type
of the existing message. That's not nice.

Huh?

 After far, far too much confused head scratching, code reading, random
 elog()s et al I found out that this is just because of a deficiency in
 valgrind's undefinedness tracking. [...]
 Unfortunately this cannot precisely be caught by valgrind's
 suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg,
 0) in AddCatcacheInvalidationMessage() et al. to suppress these
 warnings. Imo we can just add them unconditionally, but if somebody else
 prefers we can add #ifdef USE_VALGRIND around them.

I'd be okay with USE_VALGRIND.  I'm not particularly hot on adding a
memset for everybody just to make valgrind less confused.  Especially
since that's really going to hide any problems, not fix them.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-05 Thread Andres Freund
On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  a) SICleanupQueue() sometimes releases and reacquires the write lock
 held on the outside. That's pretty damn fragile, not to mention
 ugly. Even slight reformulations of the code in SIInsertDataEntries()
 can break this... Can we please extend the comment in the latter that
 to mention the lock dropping explicitly?
 
 Want to write a better comment?

Will do.

  b) we right/left shift -1 in a signed int by 16 in
 CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's
 implementation defined behaviour.
 
 Looks all right to me.  Yeah, the right shift might have undefined
 high-order bits, but we don't care because we're storing the result
 into an int16.

Doesn't at the very least
rnode.backend = (msg-sm.backend_hi  16) | (int) 
msg-sm.backend_lo;
have to be
rnode.backend = ((int) msg-sm.backend_hi  16) | (int) 
msg-sm.backend_lo;

  c) The ProcessMessageList() calls access msg-rc.id to test for the type
 of the existing message. That's not nice.
 
 Huh?

The checks should access msg-id, not msg-rc.id...

  After far, far too much confused head scratching, code reading, random
  elog()s et al I found out that this is just because of a deficiency in
  valgrind's undefinedness tracking. [...]
  Unfortunately this cannot precisely be caught by valgrind's
  suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg,
  0) in AddCatcacheInvalidationMessage() et al. to suppress these
  warnings. Imo we can just add them unconditionally, but if somebody else
  prefers we can add #ifdef USE_VALGRIND around them.
 
 I'd be okay with USE_VALGRIND.  I'm not particularly hot on adding a
 memset for everybody just to make valgrind less confused.  Especially
 since that's really going to hide any problems, not fix them.

I can't see it having any measureable overhead. Even old compilers will
optimize the memset() away and just initialize the padding... But
anyway, USE_VALGRIND is fine with me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-05 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
 Looks all right to me.  Yeah, the right shift might have undefined
 high-order bits, but we don't care because we're storing the result
 into an int16.

 Doesn't at the very least
   rnode.backend = (msg-sm.backend_hi  16) | (int) 
 msg-sm.backend_lo;
 have to be
   rnode.backend = ((int) msg-sm.backend_hi  16) | (int) 
 msg-sm.backend_lo;

A promotion to int (or wider) is implicit in any arithmetic operation,
last I checked the C standard.  The (int) on the other side isn't
necessary either.

We might actually be better off casting both sides to unsigned int,
just to enforce that the left shifting is done with unsigned semantics.

 c) The ProcessMessageList() calls access msg-rc.id to test for the type
 of the existing message. That's not nice.

 Huh?

 The checks should access msg-id, not msg-rc.id...

Meh.  If they're not the same thing, we've got big trouble anyway.
But if you want to change it as a cosmetic thing, no objection.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers