Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-24 Thread Alvaro Herrera
After some further testing, I noticed a case that wasn't handled in heap_update, which I also fixed. I reworded some comments here and there, and pushed to all branches. Further testing and analysis is welcome. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-22 Thread Alvaro Herrera
Robert Haas wrote: > I see the patch, but I don't see much explanation of why the patch is > correct, which I think is pretty scary in view of the number of > mistakes we've already made in this area. The comments just say: > > + * A tuple that has HEAP_XMAX_IS_MULTI and HEAP_XMAX_LOCK_ONLY but

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-21 Thread Alvaro Herrera
Alvaro Herrera wrote: > Robert Haas wrote: > > On Fri, Jun 17, 2016 at 9:33 AM, Andrew Gierth > > wrote: > > >> "Robert" == Robert Haas writes: > > > >> Why is the correct rule not "check for and ignore pre-upgrade mxids > > > >> before

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-21 Thread Alvaro Herrera
Robert Haas wrote: > On Fri, Jun 17, 2016 at 9:33 AM, Andrew Gierth > wrote: > >> "Robert" == Robert Haas writes: > > >> Why is the correct rule not "check for and ignore pre-upgrade mxids > > >> before even trying to fetch members"? > >

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-21 Thread Robert Haas
On Fri, Jun 17, 2016 at 9:33 AM, Andrew Gierth wrote: >> "Robert" == Robert Haas writes: > >> Why is the correct rule not "check for and ignore pre-upgrade mxids > >> before even trying to fetch members"? > > Robert> I entirely believe

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-17 Thread Alvaro Herrera
Alvaro Herrera wrote: > Andrew Gierth wrote: > > Why is the correct rule not "check for and ignore pre-upgrade mxids > > before even trying to fetch members"? > > I propose something like the attached patch, which implements that idea. Here's a backpatch of that to 9.3 and 9.4. I tested this
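The preview does not show the patch itself, so the following is only a sketch of how "check for and ignore pre-upgrade mxids before even trying to fetch members" could be expressed. It assumes the test looks at infomask bits alone, modeled on the HEAP_LOCKED_UPGRADED macro in PostgreSQL's htup_details.h; the Python function name is hypothetical.

```python
# Infomask bit values as defined in PostgreSQL 9.3's htup_details.h
# (worth verifying against your own source tree).
HEAP_XMAX_KEYSHR_LOCK = 0x0010
HEAP_XMAX_EXCL_LOCK = 0x0040
HEAP_XMAX_LOCK_ONLY = 0x0080
HEAP_XMAX_IS_MULTI = 0x1000


def looks_locked_by_pre_upgrade_multi(t_infomask: int) -> bool:
    """Hypothetical helper: True when xmax is a multixact that only locks the
    tuple, yet neither 9.3-style lock-strength bit is set. In 9.2 the bit now
    called HEAP_XMAX_LOCK_ONLY meant "shared lock", so this combination is
    assumed to come only from a tuple carried over by pg_upgrade."""
    is_multi = (t_infomask & HEAP_XMAX_IS_MULTI) != 0
    lock_only = (t_infomask & HEAP_XMAX_LOCK_ONLY) != 0
    no_strength = (t_infomask & (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)) == 0
    return is_multi and lock_only and no_strength


# The t_infomask reported elsewhere in this thread (6528 == 0x1980).
print(looks_locked_by_pre_upgrade_multi(6528))
```

A caller would run this test first and treat the tuple as unlocked when it returns True, so the stale multixact is never looked up.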

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-17 Thread Alvaro Herrera
Andrew Gierth wrote: > > "Alvaro" == Alvaro Herrera writes: > > >> (It can, AFAICT, be inside the currently valid range due to > >> wraparound, i.e. without there being a valid pg_multixact entry for > >> it, because AFAICT in 9.2, once the mxid is hinted dead it

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-17 Thread Andrew Gierth
> "Robert" == Robert Haas writes: >> Why is the correct rule not "check for and ignore pre-upgrade mxids >> before even trying to fetch members"? Robert> I entirely believe that's the correct rule, but doesn't Robert> implementing it require a crystal ball? Why

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-17 Thread Robert Haas
On Thu, Jun 16, 2016 at 4:50 AM, Andrew Gierth wrote: > Why is the correct rule not "check for and ignore pre-upgrade mxids > before even trying to fetch members"? I entirely believe that's the correct rule, but doesn't implementing it require a crystal ball? --

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-17 Thread Andrew Gierth
> "Alvaro" == Alvaro Herrera writes: >> (It can, AFAICT, be inside the currently valid range due to >> wraparound, i.e. without there being a valid pg_multixact entry for >> it, because AFAICT in 9.2, once the mxid is hinted dead it is never >> again either

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-16 Thread Alvaro Herrera
Alvaro Herrera wrote: > Andrew Gierth wrote: > > Why is the correct rule not "check for and ignore pre-upgrade mxids > > before even trying to fetch members"? > > I propose something like the attached patch, which implements that idea. Hm, this doesn't apply cleanly to 9.4. I'll need to come

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-16 Thread Alvaro Herrera
Andrew Gierth wrote: > But that leaves an obvious third issue: it's all very well to ignore the > pre-upgrade (pre-9.3) mxid if it's older than the cutoff or it's in the > future, but what if it's actually inside the currently valid range? > Looking it up as though it were a valid mxid in that
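The three cases named in the quoted text (older than the cutoff, in the future, inside the currently valid range) can be illustrated with a small sketch. It assumes multixact IDs are compared as 32-bit counters modulo 2^32, in the style of PostgreSQL's MultiXactIdPrecedes; the function names and the oldest/next values below are hypothetical and not taken from the thread.

```python
def precedes(a: int, b: int) -> bool:
    """Wraparound-aware "a is before b" for 32-bit counters: true when the
    difference a - b, read as a signed 32-bit integer, is negative."""
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


def classify_mxid(mxid: int, oldest: int, next_mxid: int) -> str:
    """Hypothetical helper placing mxid relative to the range [oldest, next_mxid)."""
    if precedes(mxid, oldest):
        return "older than cutoff"
    if not precedes(mxid, next_mxid):
        return "in the future"
    return "inside valid range"


# 6849409 is the value from the reported error; oldest=1 and next_mxid=1000
# are made-up numbers standing in for a freshly upgraded cluster.
print(classify_mxid(6849409, oldest=1, next_mxid=1000))
```

Only the first two cases can be detected from the counters alone; a stale value that happens to land inside the valid range looks like any live multixact, which is the problem the quoted message raises.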

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-16 Thread Andrew Gierth
> "Alvaro" == Alvaro Herrera writes: Alvaro> I think that was a good choice in general so that Alvaro> possibly-data-eating bugs could be reported, but there's a Alvaro> problem in the specific case of tuples carried over by Alvaro> pg_upgrade whose Multixact is

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2016-06-15 Thread Alvaro Herrera
Stephen Frost wrote: > Greetings, > > Looks like we might not be entirely out of the woods yet regarding > MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the > following: > > ERROR: MultiXactId 6849409 has not been created yet -- apparent wraparound > > The table

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Alvaro Herrera
Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought to be a short-circuit. Fortunately, this bug should be pretty

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Bruce Momjian
On Wed, Apr 23, 2014 at 03:01:02PM -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Alvaro Herrera
Bruce Momjian wrote: On Wed, Apr 23, 2014 at 03:01:02PM -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Bruce Momjian
On Wed, Apr 23, 2014 at 03:42:14PM -0300, Alvaro Herrera wrote: I still don't know under what circumstances this situation could arise. This seems most strange to me. I would wonder about this to be just papering over a different bug elsewhere, except that we know this tuple comes

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Alvaro Herrera
Bruce Momjian wrote: On Wed, Apr 23, 2014 at 03:42:14PM -0300, Alvaro Herrera wrote: I still don't know under what circumstances this situation could arise. This seems most strange to me. I would wonder about this to be just papering over a different bug elsewhere, except that we

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Andres Freund
On April 23, 2014 8:51:21 PM CEST, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Bruce Momjian wrote: On Wed, Apr 23, 2014 at 03:42:14PM -0300, Alvaro Herrera wrote: I still don't know under what circumstances this situation could arise. This seems most strange to me. I would wonder

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Alvaro Herrera
Andres Freund wrote: I think this patch is a seriously bad idea. For one, it's not actually doing anything about the problem - the tuple can be accessed without freezing getting involved. Normal access other than freeze is not a problem, because other code paths do check for HEAP_XMAX_INVALID

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-23 Thread Andres Freund
On 2014-04-23 16:30:05 -0300, Alvaro Herrera wrote: Andres Freund wrote: I think this patch is a seriously bad idea. For one, it's not actually doing anything about the problem - the tuple can be accessed without freezing getting involved. Normal access other than freeze is not a

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-22 Thread Bruce Momjian
On Mon, Mar 31, 2014 at 09:36:03AM -0400, Stephen Frost wrote: Andres, * Andres Freund (and...@2ndquadrant.com) wrote: Without having looked at the code, IIRC this looks like some place misses passing allow_old=true where it's actually required. Any chance you can get a backtrace for the

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Andres Freund
Hi, On 2014-03-30 00:00:30 -0400, Stephen Frost wrote: Greetings, Looks like we might not be entirely out of the woods yet regarding MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the following: ERROR: MultiXactId 6849409 has not been created yet -- apparent

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Alvaro Herrera
Stephen Frost wrote: * Stephen Frost (sfr...@snowman.net) wrote: I have the pre-upgrade database and can upgrade/rollback/etc that pretty easily. Note that the table contents weren't changed during the upgrade, of course, and so the 9.2.6 instance has HEAP_XMAX_IS_MULTI set while t_xmax

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Andres Freund
On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought to be a short-circuit. Fortunately, this bug should be pretty harmless. .. and after

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Alvaro Herrera
Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought to be a short-circuit. Fortunately, this bug should be pretty

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Andres Freund
On 2014-03-31 09:19:12 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought to be a

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Alvaro Herrera
Andres Freund wrote: On 2014-03-31 09:19:12 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself.

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: I guess I wasn't expecting that too-old values would last longer than a full wraparound cycle. Maybe the right fix is just to have the second check also conditional on allow_old. I don't believe this was a wraparound case. Anyway, it's not

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Andres Freund
On 2014-03-31 09:09:08 -0400, Stephen Frost wrote: * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: I guess I wasn't expecting that too-old values would last longer than a full wraparound cycle. Maybe the right fix is just to have the second check also conditional on allow_old. I

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Stephen Frost
Andres, * Andres Freund (and...@2ndquadrant.com) wrote: Without having looked at the code, IIRC this looks like some place misses passing allow_old=true where it's actually required. Any chance you can get a backtrace for the error message? I know you said somewhere below that you'd worked

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote: On 2014-03-31 09:09:08 -0400, Stephen Frost wrote: * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: I guess I wasn't expecting that too-old values would last longer than a full wraparound cycle. Maybe the right fix is just to have the

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Alvaro Herrera
Stephen Frost wrote: Further review leads me to notice that both HEAP_XMAX_IS_MULTI and HEAP_XMAX_INVALID are set: t_infomask | 6528. 6528 decimal = 0x1980 = 0001 1001 1000 0000 binary. Which gives us: 0000 0000 1000 0000 - HEAP_XMAX_LOCK_ONLY; 0000 0001 0000 0000 - HEAP_XMIN_COMMITTED
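The bit arithmetic in the message above can be checked with a short sketch. The flag values are the t_infomask definitions from PostgreSQL 9.3's htup_details.h as best recalled here, so verify them against your source tree; the decode_infomask helper is hypothetical.

```python
# t_infomask flag bits (PostgreSQL 9.3, src/include/access/htup_details.h).
INFOMASK_FLAGS = {
    0x0001: "HEAP_HASNULL",
    0x0002: "HEAP_HASVARWIDTH",
    0x0004: "HEAP_HASEXTERNAL",
    0x0008: "HEAP_HASOID",
    0x0010: "HEAP_XMAX_KEYSHR_LOCK",
    0x0020: "HEAP_COMBOCID",
    0x0040: "HEAP_XMAX_EXCL_LOCK",
    0x0080: "HEAP_XMAX_LOCK_ONLY",
    0x0100: "HEAP_XMIN_COMMITTED",
    0x0200: "HEAP_XMIN_INVALID",
    0x0400: "HEAP_XMAX_COMMITTED",
    0x0800: "HEAP_XMAX_INVALID",
    0x1000: "HEAP_XMAX_IS_MULTI",
    0x2000: "HEAP_UPDATED",
    0x4000: "HEAP_MOVED_OFF",
    0x8000: "HEAP_MOVED_IN",
}


def decode_infomask(t_infomask: int) -> list[str]:
    """Return the names of all flag bits set in t_infomask, lowest bit first."""
    return [name for bit, name in sorted(INFOMASK_FLAGS.items()) if t_infomask & bit]


print(hex(6528))              # 0x1980
print(decode_infomask(6528))
```

For 6528 this lists HEAP_XMAX_LOCK_ONLY, HEAP_XMIN_COMMITTED, HEAP_XMAX_INVALID and HEAP_XMAX_IS_MULTI, consistent with the observation that both the multi and invalid bits are set.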

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Alvaro Herrera
Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought to be a short-circuit. Fortunately, this bug should be pretty

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Alvaro Herrera
Alvaro Herrera wrote: Andres Freund wrote: On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote: My conclusion here is that some part of the code is failing to examine XMAX_INVALID before looking at the value stored in xmax itself. There ought to be a short-circuit. Fortunately, this

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-31 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: I think this rule is wrong. I think the rule ought to be something like if the XMAX_INVALID bit is set, then reset whatever is there if there is something; if the bit is not set, proceed as today. Otherwise we risk reading garbage, which
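The rule proposed in the quoted text (if XMAX_INVALID is set, disregard whatever xmax holds; otherwise proceed as before) amounts to a short-circuit on the hint bit. A minimal sketch, assuming a hypothetical helper name and the 9.3 value of HEAP_XMAX_INVALID; it is not the code from any patch in this thread.

```python
HEAP_XMAX_INVALID = 0x0800   # value from PostgreSQL 9.3's htup_details.h
INVALID_TRANSACTION_ID = 0   # PostgreSQL's InvalidTransactionId


def effective_xmax(t_infomask: int, t_xmax: int) -> int:
    """Hypothetical helper: consult the hint bit before trusting xmax.

    When HEAP_XMAX_INVALID is set the stored value may be garbage (for
    example a pre-upgrade multixact), so it is reported as invalid without
    ever being looked up."""
    if t_infomask & HEAP_XMAX_INVALID:
        return INVALID_TRANSACTION_ID
    return t_xmax


# Tuple from the report: t_infomask 6528 with t_xmax 6849409.
print(effective_xmax(6528, 6849409))
```

With the reported tuple the helper returns 0, so a freeze path built this way would never ask pg_multixact about 6849409.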

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-30 Thread Stephen Frost
All, * Stephen Frost (sfr...@snowman.net) wrote: Looks like we might not be entirely out of the woods yet regarding MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the following: ERROR: MultiXactId 6849409 has not been created yet -- apparent wraparound While

Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-03-30 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote: I have the pre-upgrade database and can upgrade/rollback/etc that pretty easily. Note that the table contents weren't changed during the upgrade, of course, and so the 9.2.6 instance has HEAP_XMAX_IS_MULTI set while t_xmax is 6849409 for the tuple