Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Alvaro Herrera alvhe...@2ndquadrant.com writes:
> Marko Tiikkaja wrote:
>> Any chance to get this fixed in time for 9.1.16?
>
> I hope you had pinged some days earlier. Here's a patch, but I will wait until this week's releases have been tagged before pushing.

BTW, I meant to update this thread but forgot until now: these changes did wind up included in the final tarballs for 9.2 and before, on account of the re-wrap the next day. In the rush to re-do the wrap, I forgot that I should've added entries to the release notes for these commits :-( So the documentation doesn't mention the fix, but it's there.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Alvaro Herrera alvhe...@2ndquadrant.com writes:
> Marko Tiikkaja wrote:
>> Any chance to get this fixed in time for 9.1.16?
>
> I hope you had pinged some days earlier. Here's a patch, but I will wait until this week's releases have been tagged before pushing.

Is this a recent regression, or has it been busted all along in those branches? If the former, maybe we should take the risk of fixing it today (the patch certainly looks safe enough). But if it's been this way a long time and nobody noticed till now, I'd agree with waiting.

regards, tom lane
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
On 2015-05-18 14:13:51 -0300, Alvaro Herrera wrote:
> Hmm, AFAICS the problematic check was introduced by this commit:
>
> commit 9f1e051adefb2f29e757cf426b03db20d3f8a26d
> Author: Alvaro Herrera alvhe...@alvh.no-ip.org
> Date:   Fri Nov 29 11:26:41 2013 -0300
>
> so it isn't hot off the oven, but it is a regression.

Hasn't that just changed the symptoms? I don't recall exactly, but my recollection is that the multixact code isn't ready at that point and hasn't initialized a bunch of important variables yet. Leading to errors in the SLRU etc.

Greetings,

Andres Freund
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
On 18 May 2015 at 12:59, Tom Lane t...@sss.pgh.pa.us wrote:
> Alvaro Herrera alvhe...@2ndquadrant.com writes:
>> Marko Tiikkaja wrote:
>>> Any chance to get this fixed in time for 9.1.16?
>>
>> I hope you had pinged some days earlier. Here's a patch, but I will wait until this week's releases have been tagged before pushing.
>
> Is this a recent regression, or has it been busted all along in those branches? If the former, maybe we should take the risk of fixing it today (the patch certainly looks safe enough). But if it's been this way a long time and nobody noticed till now, I'd agree with waiting.

That's a very low risk fix. It's more like a should-have-been-a-basic-check.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Andres Freund wrote:
> On 2015-05-18 14:13:51 -0300, Alvaro Herrera wrote:
>> Hmm, AFAICS the problematic check was introduced by this commit:
>>
>> commit 9f1e051adefb2f29e757cf426b03db20d3f8a26d
>> Author: Alvaro Herrera alvhe...@alvh.no-ip.org
>> Date:   Fri Nov 29 11:26:41 2013 -0300
>>
>> so it isn't hot off the oven, but it is a regression.
>
> Hasn't that just changed the symptoms? I don't recall exactly, but my recollection is that the multixact code isn't ready at that point and hasn't initialized a bunch of important variables yet. Leading to errors in the SLRU etc.

Not sure about that. The page limits etc aren't set yet so you can't create new multis, nor truncate appropriately, but just reading one should have worked.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Marko Tiikkaja wrote:
> Hi hackers,
>
> Any chance to get this fixed in time for 9.1.16?

I hope you had pinged some days earlier.

Here's a patch, but I will wait until this week's releases have been tagged before pushing.

I checked 9.2, and it doesn't look like it's subject to the same problem: instead of HeapTupleSatisfiesVacuum, it uses HeapTupleIsSurelyDead in the equivalent place. Still, I think it's saner to apply the same fix there because, as Andres notes, the problem might still be present in pgrowlocks and who knows what else.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 476c53d..b90c110 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -383,6 +383,21 @@ MultiXactIdIsRunning(MultiXactId multi)
 
 	debug_elog3(DEBUG2, "IsRunning %u?", multi);
 
+	/*
+	 * During recovery, all multixacts can be considered not running: in
+	 * effect, tuple locks are not held in standby servers, which is fine
+	 * because the standby cannot acquire further tuple locks nor update/delete
+	 * tuples.
+	 *
+	 * We need to do this first, because GetMultiXactIdMembers complains if
+	 * called on recovery.
+	 */
+	if (RecoveryInProgress())
+	{
+		debug_elog2(DEBUG2, "IsRunning: in recovery");
+		return false;
+	}
+
 	nmembers = GetMultiXactIdMembers(multi, &members);
 
 	if (nmembers < 0)
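For readers who want to see the shape of the guard outside the PostgreSQL tree, here is a minimal, self-contained sketch. It is not the committed code: `sketch_in_recovery`, `sketch_get_members` and `sketch_multixact_is_running` are hypothetical stand-ins, the `assert` merely models the error in the subject line, and "has members" is used as a crude substitute for the real per-member running check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
typedef unsigned int SketchMultiXactId;

static bool sketch_in_recovery = false;  /* models RecoveryInProgress() */
static int  sketch_member_lookups = 0;   /* counts member lookups performed */

/*
 * Models the member lookup. The assert stands in for the
 * "cannot GetMultiXactIdMembers() during recovery" error.
 */
static int
sketch_get_members(SketchMultiXactId multi)
{
    assert(!sketch_in_recovery);
    sketch_member_lookups++;
    /* Arbitrary rule for the sketch: even ids have two members. */
    return (multi % 2 == 0) ? 2 : -1;
}

/*
 * The guard runs before the lookup, so the lookup is never reached
 * while in recovery.
 */
static bool
sketch_multixact_is_running(SketchMultiXactId multi)
{
    if (sketch_in_recovery)
        return false;   /* no tuple locks are held on a standby */

    /* Simplification: treat "has members" as "running". */
    return sketch_get_members(multi) >= 0;
}
```

With `sketch_in_recovery` set to true, `sketch_multixact_is_running()` returns false without touching the lookup counter; with it set to false, the call falls through to the normal path.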
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
On 2015-05-18 12:59:47 -0400, Tom Lane wrote:
> If the former, maybe we should take the risk of fixing it today (the patch certainly looks safe enough). But if it's been this way a long time and nobody noticed till now, I'd agree with waiting.

It's an old regression, and nobody noticed it until Marko a couple months back.

Greetings,

Andres Freund
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Tom Lane wrote:
> Alvaro Herrera alvhe...@2ndquadrant.com writes:
>> Marko Tiikkaja wrote:
>>> Any chance to get this fixed in time for 9.1.16?
>>
>> I hope you had pinged some days earlier. Here's a patch, but I will wait until this week's releases have been tagged before pushing.
>
> Is this a recent regression, or has it been busted all along in those branches? If the former, maybe we should take the risk of fixing it today (the patch certainly looks safe enough). But if it's been this way a long time and nobody noticed till now, I'd agree with waiting.

Hmm, AFAICS the problematic check was introduced by this commit:

commit 9f1e051adefb2f29e757cf426b03db20d3f8a26d
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date:   Fri Nov 29 11:26:41 2013 -0300

so it isn't hot off the oven, but it is a regression.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Simon Riggs wrote:
> On 15 May 2015 at 19:03, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Andres Freund wrote:
>>> Alternatively we could make MultiXactIdIsRunning() return false < 9.3 when in recovery. I think that'd end up fixing things, but it seems awfully fragile to me.
>>
>> Hm, why fragile? It seems a pretty decent answer -- pre-9.3, it's not possible for a tuple to be locked in recovery, is it? I mean, in the standby you can't lock it nor update it; the only thing you can do is read (select), and that is not affected by whether there is a multixact in it.
>
> It can't return true and won't ever change for < 9.3 so I don't see what the objection is.

Pushed.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Hi hackers,

Any chance to get this fixed in time for 9.1.16?

.m
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
On 15 May 2015 at 19:03, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Andres Freund wrote:
>> Alternatively we could make MultiXactIdIsRunning() return false < 9.3 when in recovery. I think that'd end up fixing things, but it seems awfully fragile to me.
>
> Hm, why fragile? It seems a pretty decent answer -- pre-9.3, it's not possible for a tuple to be locked in recovery, is it? I mean, in the standby you can't lock it nor update it; the only thing you can do is read (select), and that is not affected by whether there is a multixact in it.

It can't return true and won't ever change for < 9.3 so I don't see what the objection is.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Andres Freund wrote:
> Alternatively we could make MultiXactIdIsRunning() return false < 9.3 when in recovery. I think that'd end up fixing things, but it seems awfully fragile to me.

Hm, why fragile? It seems a pretty decent answer -- pre-9.3, it's not possible for a tuple to be locked in recovery, is it? I mean, in the standby you can't lock it nor update it; the only thing you can do is read (select), and that is not affected by whether there is a multixact in it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Hi,

Andres asked me on IRC to report this here.

Since we upgraded our standby servers to 9.1.15 (though the master is still running 9.1.14), we've seen the error in $SUBJECT a number of times. I managed to reproduce it today by running the same query over and over again, and attached is the back trace.

Let me know if you need any additional information.

.m

#0  GetMultiXactIdMembers (multi=56513428, xids=0x7fff9e3691e8) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/access/transam/multixact.c:923
#1  0x7f029a917dd4 in MultiXactIdIsRunning (multi=<optimized out>) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/access/transam/multixact.c:386
#2  0x7f029abad79d in HeapTupleSatisfiesVacuum (tuple=0x7f0145f2c060, OldestXmin=2418699920, buffer=111945) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/utils/time/tqual.c:1184
#3  0x7f029a8f08f5 in index_getnext (scan=scan@entry=0x7f029b66ac78, direction=direction@entry=ForwardScanDirection) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/access/index/indexam.c:644
#4  0x7f029a9fe646 in IndexNext (node=node@entry=0x7f029b669550) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeIndexscan.c:78
#5  0x7f029a9f41dc in ExecScanFetch (recheckMtd=0x7f029a9fe5c0 <IndexRecheck>, accessMtd=0x7f029a9fe600 <IndexNext>, node=0x7f029b669550) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execScan.c:82
#6  ExecScan (node=node@entry=0x7f029b669550, accessMtd=accessMtd@entry=0x7f029a9fe600 <IndexNext>, recheckMtd=recheckMtd@entry=0x7f029a9fe5c0 <IndexRecheck>) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execScan.c:167
#7  0x7f029a9fe72b in ExecIndexScan (node=node@entry=0x7f029b669550) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeIndexscan.c:146
#8  0x7f029a9ec968 in ExecProcNode (node=node@entry=0x7f029b669550) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:398
#9  0x7f029a9fd3e1 in MultiExecHash (node=node@entry=0x7f029b6690b0) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeHash.c:103
#10 0x7f029a9eca74 in MultiExecProcNode (node=node@entry=0x7f029b6690b0) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:536
#11 0x7f029a9fdd94 in ExecHashJoin (node=node@entry=0x7f029b667aa0) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeHashjoin.c:177
#12 0x7f029a9ec8b8 in ExecProcNode (node=node@entry=0x7f029b667aa0) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:447
#13 0x7f029a9fe319 in ExecHashJoinOuterGetTuple (hashvalue=0x7fff9e369554, hjstate=0x7f029b666c80, outerNode=0x7f029b667aa0) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeHashjoin.c:656
#14 ExecHashJoin (node=node@entry=0x7f029b666c80) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeHashjoin.c:209
#15 0x7f029a9ec8b8 in ExecProcNode (node=node@entry=0x7f029b666c80) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:447
#16 0x7f029aa05a59 in ExecSort (node=node@entry=0x7f029b666a10) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeSort.c:103
#17 0x7f029a9ec898 in ExecProcNode (node=node@entry=0x7f029b666a10) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:458
#18 0x7f029aa096e0 in begin_partition (winstate=winstate@entry=0x7f029b665530) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeWindowAgg.c:683
#19 0x7f029aa0b49b in ExecWindowAgg (winstate=winstate@entry=0x7f029b665530) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeWindowAgg.c:1287
#20 0x7f029a9ec868 in ExecProcNode (node=0x7f029b665530) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:470
#21 0x7f029a9f41dc in ExecScanFetch (recheckMtd=0x7f029aa084f0 <SubqueryRecheck>, accessMtd=0x7f029aa08500 <SubqueryNext>, node=0x7f029b664e60) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execScan.c:82
#22 ExecScan (node=node@entry=0x7f029b664e60, accessMtd=accessMtd@entry=0x7f029aa08500 <SubqueryNext>, recheckMtd=recheckMtd@entry=0x7f029aa084f0 <SubqueryRecheck>) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execScan.c:167
#23 0x7f029aa08538 in ExecSubqueryScan (node=node@entry=0x7f029b664e60) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeSubqueryscan.c:85
#24 0x7f029a9ec938 in ExecProcNode (node=node@entry=0x7f029b664e60) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/execProcnode.c:412
#25 0x7f029aa05a59 in ExecSort (node=node@entry=0x7f029b664bf0) at /tmp/buildd/postgresql-9.1-9.1.15/build/../src/backend/executor/nodeSort.c:103
#26 0x7f029a9ec898 in ExecProcNode
Re: [HACKERS] ERROR: cannot GetMultiXactIdMembers() during recovery
Hi,

On 2015-02-23 15:00:35 +0100, Marko Tiikkaja wrote:
> Andres asked me on IRC to report this here. Since we upgraded our standby servers to 9.1.15 (though the master is still running 9.1.14), we've seen the error in $SUBJECT a number of times.

FWIW, I think this is just as borked in 9.1.14 and will likely affect all of 9.0 - 9.2. The problem is that in those releases multixacts aren't maintained on the standby in a way that allows access.

index_getnext() itself is actually pretty easy to fix: it already checks whether the scan started while in recovery when using the result of the error-triggering HeapTupleSatisfiesVacuum(), just too late. I don't remember other HTSV callers that can run in recovery, given that DDL is obviously impossible and we don't support serializable while in recovery.

Alternatively we could make MultiXactIdIsRunning() return false < 9.3 when in recovery. I think that'd end up fixing things, but it seems awfully fragile to me.

I do see a HTSU in pgrowlocks.c - that's not really safe during recovery < 9.3, given it accesses multixacts. I guess it needs to throw an error.

I wonder if we shouldn't put an Assert() in HTSV/HTSU to prevent such problems.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
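The "just too late" remark about index_getnext() is about ordering, which a small self-contained sketch can illustrate. Nothing below is PostgreSQL code: `SketchScan`, `sketch_tuple_is_dead`, `sketch_check_late` and `sketch_check_early` are hypothetical names, and the visibility check is reduced to a counter so the difference in call order is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the scan state relevant to the discussion. */
typedef struct SketchScan
{
    bool started_in_recovery;  /* scan began while in recovery */
    bool kill_prior_tuple;     /* hint: index entry may be marked dead */
} SketchScan;

static int sketch_visibility_calls = 0;

/* Models an HTSV-style check that may need to consult multixacts. */
static bool
sketch_tuple_is_dead(bool dead)
{
    sketch_visibility_calls++;
    return dead;
}

/* Order described as problematic: run the check, then look at the flag. */
static void
sketch_check_late(SketchScan *scan, bool dead)
{
    bool is_dead;

    assert(scan != NULL);
    is_dead = sketch_tuple_is_dead(dead);
    scan->kill_prior_tuple = is_dead && !scan->started_in_recovery;
}

/* Reordered: look at the flag first, so the check is skipped in recovery. */
static void
sketch_check_early(SketchScan *scan, bool dead)
{
    assert(scan != NULL);
    scan->kill_prior_tuple =
        !scan->started_in_recovery && sketch_tuple_is_dead(dead);
}
```

Both variants leave `kill_prior_tuple` false for a scan that started in recovery; the difference is that only the late variant reaches the visibility check on the way there, which in this model is the step that would raise the error.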