Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
On 2014-05-05 15:41:22 -0400, Tom Lane wrote: After far, far too much confused head scratching, code reading, random elog()s et al I found out that this is just because of a deficiency in valgrind's undefinedness tracking. [...] Unfortunately this cannot precisely be caught by valgrind's suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg, 0) in AddCatcacheInvalidationMessage() et al. to suppress these warnings. Imo we can just add them unconditionally, but if somebody else prefers we can add #ifdef USE_VALGRIND around them. I'd be okay with USE_VALGRIND. I'm not particularly hot on adding a memset for everybody just to make valgrind less confused. Especially since that's really going to hide any problems, not fix them. The attached patch does VALGRIND_MAKE_MEM_DEFINED() on the relevant structs. That's already defined to empty if USE_VALGRIND isn't defined. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From 4da7b0caa2058c8b07ee4a7ab529b9ca0c292bcf Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Wed, 7 May 2014 22:30:05 +0200 Subject: [PATCH] Silence a spurious valgrind warning in inval.c. For valgrind's sake explicitly define all bytes of SharedInvalidationMessage as defined before sending them. Otherwise valgrind sometimes treats bytes as undefined that aren't because valgrind doesn't understand that the ringbuffer in sinvaladt.c is accesses by multiple processes. Valgrind remembers that it stored a undefined byte into parts of a slot in the ringbuffer and warns when it uses that byte again - not realizing it has been stored by another process. --- src/backend/utils/cache/inval.c | 24 1 file changed, 24 insertions(+) diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c index 5971469..dd46e18 100644 --- a/src/backend/utils/cache/inval.c +++ b/src/backend/utils/cache/inval.c @@ -103,6 +103,7 @@ #include storage/smgr.h #include utils/catcache.h #include utils/inval.h +#include utils/memdebug.h #include utils/memutils.h #include utils/rel.h #include utils/relmapper.h @@ -332,6 +333,14 @@ AddCatcacheInvalidationMessage(InvalidationListHeader *hdr, msg.cc.id = (int8) id; msg.cc.dbId = dbId; msg.cc.hashValue = hashValue; + /* + * Define padding bytes to be defined, otherwise the sinvaladt.c + * ringbuffer, which is accessed by multiple processes, will cause false + * positives because valgrind remembers the undefined bytes from this + * processes store, not realizing that another process has written since. + */ + VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg)); + AddInvalidationMessage(hdr-cclist, msg); } @@ -347,6 +356,9 @@ AddCatalogInvalidationMessage(InvalidationListHeader *hdr, msg.cat.id = SHAREDINVALCATALOG_ID; msg.cat.dbId = dbId; msg.cat.catId = catId; + /* check AddCatcacheInvalidationMessage() for an explanation */ + VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg)); + AddInvalidationMessage(hdr-cclist, msg); } @@ -370,6 +382,9 @@ AddRelcacheInvalidationMessage(InvalidationListHeader *hdr, msg.rc.id = SHAREDINVALRELCACHE_ID; msg.rc.dbId = dbId; msg.rc.relId = relId; + /* check AddCatcacheInvalidationMessage() for an explanation */ + VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg)); + AddInvalidationMessage(hdr-rclist, msg); } @@ -393,6 +408,9 @@ AddSnapshotInvalidationMessage(InvalidationListHeader *hdr, msg.sn.id = SHAREDINVALSNAPSHOT_ID; msg.sn.dbId = dbId; msg.sn.relId = relId; + /* check AddCatcacheInvalidationMessage() for an explanation */ + VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg)); + AddInvalidationMessage(hdr-rclist, msg); } @@ -1242,6 +1260,9 @@ CacheInvalidateSmgr(RelFileNodeBackend rnode) msg.sm.backend_hi = rnode.backend 16; msg.sm.backend_lo = rnode.backend 0x; msg.sm.rnode = rnode.node; + /* check AddCatcacheInvalidationMessage() for an explanation */ + VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg)); + SendSharedInvalidMessages(msg, 1); } @@ -1267,6 +1288,9 @@ CacheInvalidateRelmap(Oid databaseId) msg.rm.id = SHAREDINVALRELMAP_ID; msg.rm.dbId = databaseId; + /* check AddCatcacheInvalidationMessage() for an explanation */ + VALGRIND_MAKE_MEM_DEFINED(msg, sizeof(msg)); + SendSharedInvalidMessages(msg, 1); } -- 1.8.5.rc2.dirty -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
On 2014-05-05 18:50:59 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-05-05 15:41:22 -0400, Tom Lane wrote: Looks all right to me. Yeah, the right shift might have undefined high-order bits, but we don't care because we're storing the result into an int16. Doesn't at the very least rnode.backend = (msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; have to be rnode.backend = ((int) msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; A promotion to int (or wider) is implicit in any arithmetic operation, last I checked the C standard. The (int) on the other side isn't necessary either. Done in the attached patch. We might actually be better off casting both sides to unsigned int, just to enforce that the left shifting is done with unsigned semantics. c) The ProcessMessageList() calls access msg-rc.id to test for the type of the existing message. That's not nice. Huh? The checks should access msg-id, not msg-rc.id... Meh. If they're not the same thing, we've got big trouble anyway. But if you want to change it as a cosmetic thing, no objection. Changed as well. On 2014-05-05 15:41:22 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: a) SICleanupQueue() sometimes releases and reacquires the write lock held on the outside. That's pretty damn fragile, not to mention ugly. Even slight reformulations of the code in SIInsertDataEntries() can break this... Can we please extend the comment in the latter that to mention the lock dropping explicitly? Want to write a better comment? Edited the comment and, in a perhaps debatable fashion, moved some variable declarations + volatile into the retry loop for some added cozyness. If the compiler inlines the cleanup function it could very well decide to fetch some of the variables only once. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From a24973508710c965035fca45e933823ce49a5a4f Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Thu, 8 May 2014 16:13:44 +0200 Subject: [PATCH] Minor cleanups and comment improvements in the cache invalidation code. --- src/backend/storage/ipc/sinvaladt.c | 11 --- src/backend/utils/cache/inval.c | 9 + 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c index 0328660..3966ccc 100644 --- a/src/backend/storage/ipc/sinvaladt.c +++ b/src/backend/storage/ipc/sinvaladt.c @@ -453,7 +453,6 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n) while (n 0) { int nthistime = Min(n, WRITE_QUANTUM); - int numMsgs; int max; int i; @@ -466,11 +465,17 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n) * queue and reset anyone who is preventing space from being freed. * Otherwise, clean the queue only when it's exceeded the next * fullness threshold. We have to loop and recheck the buffer state - * after any call of SICleanupQueue. + * after any call of SICleanupQueue because the cleanup sometimes + * require releasing the write lock acquired above. */ for (;;) { - numMsgs = segP-maxMsgNum - segP-minMsgNum; + int numMsgs; + /* use volatile pointer to prevent code rearrangement */ + volatile SISeg *vsegP = segP; + + numMsgs = vsegP-maxMsgNum - vsegP-minMsgNum; + if (numMsgs + nthistime MAXNUMMESSAGES || numMsgs = segP-nextThreshold) SICleanupQueue(true, nthistime); diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c index dd46e18..4c3197b 100644 --- a/src/backend/utils/cache/inval.c +++ b/src/backend/utils/cache/inval.c @@ -374,7 +374,7 @@ AddRelcacheInvalidationMessage(InvalidationListHeader *hdr, /* Don't add a duplicate item */ /* We assume dbId need not be checked because it will never change */ ProcessMessageList(hdr-rclist, - if (msg-rc.id == SHAREDINVALRELCACHE_ID + if (msg-id == SHAREDINVALRELCACHE_ID msg-rc.relId == relId) return); @@ -400,7 +400,7 @@ AddSnapshotInvalidationMessage(InvalidationListHeader *hdr, /* Don't add a duplicate item */ /* We assume dbId need not be checked because it will never change */ ProcessMessageList(hdr-rclist, - if (msg-sn.id == SHAREDINVALSNAPSHOT_ID + if (msg-id == SHAREDINVALSNAPSHOT_ID msg-sn.relId == relId) return); @@ -580,7 +580,8 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg) RelFileNodeBackend rnode; rnode.node = msg-sm.rnode; - rnode.backend = (msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; + rnode.backend = (BackendId) (((uint32) msg-sm.backend_hi 16) + |(uint32) msg-sm.backend_lo); smgrclosenode(rnode); } else if (msg-id == SHAREDINVALRELMAP_ID) @@ -1257,7
[HACKERS] Recursive ReceiveSharedInvalidMessages not safe
Hi, While investigating an issue pointed out by valgrind around undefined bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a bug in ReceiveSharedInvalidMessages(). It tries to be safe against recursion but it's not: When it recurses into ReceiveSharedInvalidMessages() from it's main loop from inside the inval callback while nextmsg = nummsgs it'll overwrite the 'messages' array with new contents. But at this point the old content of one entry in the messages array is still passed to the LocalExecuteInvalidationMessage() that caused the recursion. It looks to me like SHAREDINVALRELCACHE_ID messages are the only ones actually affected by this in reality because all the other ones either operate on stack memory or don't recurse. What could happen there is that a different relcache entry is invalidated than the relcache invalidation callback is called for. This actually could explain the relfilenodemap error we've seen a couple weeks back. It looks to me like this is broken since at least fad153ec. I think the fix is just to make the current 'SharedInvalidationMessage *msg' not be pointers but a local copiy of the to-be-processed entry. Comments? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
Andres Freund and...@2ndquadrant.com writes: While investigating an issue pointed out by valgrind around undefined bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a bug in ReceiveSharedInvalidMessages(). It tries to be safe against recursion but it's not: When it recurses into ReceiveSharedInvalidMessages() from it's main loop from inside the inval callback while nextmsg = nummsgs it'll overwrite the 'messages' array with new contents. But at this point the old content of one entry in the messages array is still passed to the LocalExecuteInvalidationMessage() that caused the recursion. Hm, yeah, so if the called inval function continues to use the message contents after doing something that could result in a recursive call, it might be looking at trashed data. It looks to me like this is broken since at least fad153ec. I think the fix is just to make the current 'SharedInvalidationMessage *msg' not be pointers but a local copiy of the to-be-processed entry. Yeah, that should do it. I think I'd been trying to avoid copying messages more times than necessary, but evidently I optimized away one copy step too many :-( regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
On 2014-05-05 14:15:58 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: While investigating an issue pointed out by valgrind around undefined bytes in inval.c SHAREDINVALSMGR_ID processing I noticed that there's a bug in ReceiveSharedInvalidMessages(). It tries to be safe against recursion but it's not: When it recurses into ReceiveSharedInvalidMessages() from it's main loop from inside the inval callback while nextmsg = nummsgs it'll overwrite the 'messages' array with new contents. But at this point the old content of one entry in the messages array is still passed to the LocalExecuteInvalidationMessage() that caused the recursion. Hm, yeah, so if the called inval function continues to use the message contents after doing something that could result in a recursive call, it might be looking at trashed data. FWIW, with a bit of changes around it to make it more visible (malloc/freeing the array) this sometimes triggers during make check. So it's quite plausible that this is the caused the relfilenodemap regression error we were a bit confused about. I started to look at this code because valgrind was bleating at me inside inval.c and for the heck of I couldn't understand wh. Since then this lead me into quite a wild goose chase. Leading me to discover a couple of things I am not perfectly happy about: a) SICleanupQueue() sometimes releases and reacquires the write lock held on the outside. That's pretty damn fragile, not to mention ugly. Even slight reformulations of the code in SIInsertDataEntries() can break this... Can we please extend the comment in the latter that to mention the lock dropping explicitly? b) we right/left shift -1 in a signed int by 16 in CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's implementation defined behaviour. c) The ProcessMessageList() calls access msg-rc.id to test for the type of the existing message. That's not nice. The compiler is entirely free to e.g. copy the entire struct to local memory causing unitialized memory to be accessed. Entirely cosmetic, but ... d) The valgrind splats I was very occasionally getting were: ==21013== Conditional jump or move depends on uninitialised value(s) ==21013==at 0x8DD755: hash_search_with_hash_value (dynahash.c:885) ==21013==by 0x8DD60F: hash_search (dynahash.c:811) ==21013==by 0x799A58: smgrclosenode (smgr.c:357) ==21013==by 0x8B6ABF: LocalExecuteInvalidationMessage (inval.c:566) ==21013==by 0x77BBCF: ReceiveSharedInvalidMessages (sinval.c:127) ==21013==by 0x8B6C3F: AcceptInvalidationMessages (inval.c:640) ==21013==by 0x77FDCC: LockRelationOid (lmgr.c:107) ==21013==by 0x4A6F33: relation_open (heapam.c:1029) ==21013==by 0x4A71EB: heap_open (heapam.c:1195) ==21013==by 0x8B4228: SearchCatCache (catcache.c:1223) ==21013==by 0x8C61B1: SearchSysCache (syscache.c:909) ==21013==by 0x8C62CD: GetSysCacheOid (syscache.c:987) ==21013== Uninitialised value was created by a stack allocation ==21013==at 0x8B64C3: AddCatcacheInvalidationMessage (inval.c:328) After far, far too much confused head scratching, code reading, random elog()s et al I found out that this is just because of a deficiency in valgrind's undefinedness tracking. The problem is that it doesn't really understand shared memory sufficiently. When a backend stored a message in the SI ringbuffer it sets a bitmask about initialized memory for a specific position in the ringubffer. If the *same* backend reads from the same position in the ringbuffer (4096 messages later) into which another backend has stored a *different* type of message the bitmap of initialized bytes will possibly be wrong if the padding is distributed differently. Unfortunately this cannot precisely be caught by valgrind's suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg, 0) in AddCatcacheInvalidationMessage() et al. to suppress these warnings. Imo we can just add them unconditionally, but if somebody else prefers we can add #ifdef USE_VALGRIND around them. I can do a), c), d), if others agree but I am not sure about b)? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
Andres Freund and...@2ndquadrant.com writes: a) SICleanupQueue() sometimes releases and reacquires the write lock held on the outside. That's pretty damn fragile, not to mention ugly. Even slight reformulations of the code in SIInsertDataEntries() can break this... Can we please extend the comment in the latter that to mention the lock dropping explicitly? Want to write a better comment? b) we right/left shift -1 in a signed int by 16 in CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's implementation defined behaviour. Looks all right to me. Yeah, the right shift might have undefined high-order bits, but we don't care because we're storing the result into an int16. c) The ProcessMessageList() calls access msg-rc.id to test for the type of the existing message. That's not nice. Huh? After far, far too much confused head scratching, code reading, random elog()s et al I found out that this is just because of a deficiency in valgrind's undefinedness tracking. [...] Unfortunately this cannot precisely be caught by valgrind's suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg, 0) in AddCatcacheInvalidationMessage() et al. to suppress these warnings. Imo we can just add them unconditionally, but if somebody else prefers we can add #ifdef USE_VALGRIND around them. I'd be okay with USE_VALGRIND. I'm not particularly hot on adding a memset for everybody just to make valgrind less confused. Especially since that's really going to hide any problems, not fix them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
On 2014-05-05 15:41:22 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: a) SICleanupQueue() sometimes releases and reacquires the write lock held on the outside. That's pretty damn fragile, not to mention ugly. Even slight reformulations of the code in SIInsertDataEntries() can break this... Can we please extend the comment in the latter that to mention the lock dropping explicitly? Want to write a better comment? Will do. b) we right/left shift -1 in a signed int by 16 in CacheInvalidateSmgr/LocalExecuteInvalidationMessage(). IIRC that's implementation defined behaviour. Looks all right to me. Yeah, the right shift might have undefined high-order bits, but we don't care because we're storing the result into an int16. Doesn't at the very least rnode.backend = (msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; have to be rnode.backend = ((int) msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; c) The ProcessMessageList() calls access msg-rc.id to test for the type of the existing message. That's not nice. Huh? The checks should access msg-id, not msg-rc.id... After far, far too much confused head scratching, code reading, random elog()s et al I found out that this is just because of a deficiency in valgrind's undefinedness tracking. [...] Unfortunately this cannot precisely be caught by valgrind's suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg, 0) in AddCatcacheInvalidationMessage() et al. to suppress these warnings. Imo we can just add them unconditionally, but if somebody else prefers we can add #ifdef USE_VALGRIND around them. I'd be okay with USE_VALGRIND. I'm not particularly hot on adding a memset for everybody just to make valgrind less confused. Especially since that's really going to hide any problems, not fix them. I can't see it having any measureable overhead. Even old compilers will optimize the memset() away and just initialize the padding... But anyway, USE_VALGRIND is fine with me. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe
Andres Freund and...@2ndquadrant.com writes: On 2014-05-05 15:41:22 -0400, Tom Lane wrote: Looks all right to me. Yeah, the right shift might have undefined high-order bits, but we don't care because we're storing the result into an int16. Doesn't at the very least rnode.backend = (msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; have to be rnode.backend = ((int) msg-sm.backend_hi 16) | (int) msg-sm.backend_lo; A promotion to int (or wider) is implicit in any arithmetic operation, last I checked the C standard. The (int) on the other side isn't necessary either. We might actually be better off casting both sides to unsigned int, just to enforce that the left shifting is done with unsigned semantics. c) The ProcessMessageList() calls access msg-rc.id to test for the type of the existing message. That's not nice. Huh? The checks should access msg-id, not msg-rc.id... Meh. If they're not the same thing, we've got big trouble anyway. But if you want to change it as a cosmetic thing, no objection. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers