Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-23 Thread Michael Paquier
On Sat, Dec 23, 2017 at 05:26:22PM -0300, Alvaro Herrera wrote:
> I noticed that I'm committer for this patch in the commitfest, though I
> don't remember setting that.  Are you expecting me to commit it?  I
> thought you'd do it, but if you want me to assume the responsibility I
> can do that.

I tend to create CF entries for all patches that we need to track as bug
fixes. No things lost this way. Alvaro, you have been marked as a committer
of this patch as you were the one to work and push the first version that
has finished reverted.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-23 Thread Alvaro Herrera
Andres Freund wrote:

> On December 23, 2017 9:26:22 PM GMT+01:00, Alvaro Herrera 
>  wrote:
> >I noticed that I'm committer for this patch in the commitfest, though I
> >don't remember setting that.  Are you expecting me to commit it?  I
> >thought you'd do it, but if you want me to assume the responsibility I
> >can do that.
> 
> I thought I pushed it to all branches? Do you see anything missing?
> Didn't know there's a CF entry...

Ah, no, looks correct to me in all branches.  Updated the CF now too.

Thanks!

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-23 Thread Andres Freund


On December 23, 2017 9:26:22 PM GMT+01:00, Alvaro Herrera 
 wrote:
>I noticed that I'm committer for this patch in the commitfest, though I
>don't remember setting that.  Are you expecting me to commit it?  I
>thought you'd do it, but if you want me to assume the responsibility I
>can do that.

I thought I pushed it to all branches? Do you see anything missing? Didn't know 
there's a CF entry...

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-23 Thread Alvaro Herrera
I noticed that I'm committer for this patch in the commitfest, though I
don't remember setting that.  Are you expecting me to commit it?  I
thought you'd do it, but if you want me to assume the responsibility I
can do that.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-15 Thread Andres Freund
On 2017-12-15 11:15:47 -0800, Peter Geoghegan wrote:
> On Fri, Dec 15, 2017 at 11:03 AM, Peter Geoghegan  wrote:
> > The elog(), which was itself upgraded from a simple Assert by commit
> > d70cf811, appears in exactly the same form in 9.3+. Things did change
> > there, but they were kept in sync.
> 
> BTW, if you're going to do it, I would target the similar error within
> validate_index_heapscan(), too. That was also added by d70cf811.

Please send a patch for master on a *new* thread.

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-15 Thread Peter Geoghegan
On Fri, Dec 15, 2017 at 11:03 AM, Peter Geoghegan  wrote:
> The elog(), which was itself upgraded from a simple Assert by commit
> d70cf811, appears in exactly the same form in 9.3+. Things did change
> there, but they were kept in sync.

BTW, if you're going to do it, I would target the similar error within
validate_index_heapscan(), too. That was also added by d70cf811.

-- 
Peter Geoghegan



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-15 Thread Peter Geoghegan
On Fri, Dec 15, 2017 at 10:58 AM, Andres Freund  wrote:
>> I have one minor piece of feedback on the upgrading of assertions to
>> ereport()s with ERRCODE_DATA_CORRUPTION: It would be nice if you could
>> upgrade the raw elog() "can't happen" error within
>> IndexBuildHeapRangeScan() to be an ereport() with
>> ERRCODE_DATA_CORRUPTION. I'm referring to the "failed to find parent
>> tuple for heap-only tuple" error, which I think merits being a real
>> user-visible error, just like the relfrozenxid/relminmxid tests are
>> now. As you know, the enhanced amcheck will sometimes detect
>> corruption due to this bug by hitting that error.
>
> I'm not opposed to that, it just seems independent from this thread. Not
> sure I really want to go around and backpatch such a change, that code
> has changed a bit between branches. Happy to do so on master.

The elog(), which was itself upgraded from a simple Assert by commit
d70cf811, appears in exactly the same form in 9.3+. Things did change
there, but they were kept in sync.

-- 
Peter Geoghegan



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-15 Thread Andres Freund
On 2017-12-15 10:46:05 -0800, Peter Geoghegan wrote:
> On Thu, Dec 14, 2017 at 6:30 PM, Andres Freund  wrote:
> > Pushed this way.  Moved some more relfrozenxid/relminmxid tests outside
> > of the cutoff changes, polished some error messages.
> >
> >
> > Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you
> > could have a look at the backported version, just about everything but
> > v10 had conflicts, some of them not insubstantial.
> 
> I have one minor piece of feedback on the upgrading of assertions to
> ereport()s with ERRCODE_DATA_CORRUPTION: It would be nice if you could
> upgrade the raw elog() "can't happen" error within
> IndexBuildHeapRangeScan() to be an ereport() with
> ERRCODE_DATA_CORRUPTION. I'm referring to the "failed to find parent
> tuple for heap-only tuple" error, which I think merits being a real
> user-visible error, just like the relfrozenxid/relminmxid tests are
> now. As you know, the enhanced amcheck will sometimes detect
> corruption due to this bug by hitting that error.

I'm not opposed to that, it just seems independent from this thread. Not
sure I really want to go around and backpatch such a change, that code
has changed a bit between branches. Happy to do so on master.

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-15 Thread Andres Freund
On 2017-12-15 20:25:22 +0900, Michael Paquier wrote:
> On Fri, Dec 15, 2017 at 11:30 AM, Andres Freund  wrote:
> > Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you
> > could have a look at the backported version, just about everything but
> > v10 had conflicts, some of them not insubstantial.
> 
> I have gone through the backpatched versions for the fixes in tuple
> pruning, running some tests on the way and those look good to me.

Thanks.


> I have not taken the time to go through the ones changing the
> assertions to ereport() though...

Those were the ones with a lot of conflicts tho - I'd temporarily broken
freezing for 9.3, but both review and testing caught it...

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-15 Thread Peter Geoghegan
On Thu, Dec 14, 2017 at 6:30 PM, Andres Freund  wrote:
> Pushed this way.  Moved some more relfrozenxid/relminmxid tests outside
> of the cutoff changes, polished some error messages.
>
>
> Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you
> could have a look at the backported version, just about everything but
> v10 had conflicts, some of them not insubstantial.

I have one minor piece of feedback on the upgrading of assertions to
ereport()s with ERRCODE_DATA_CORRUPTION: It would be nice if you could
upgrade the raw elog() "can't happen" error within
IndexBuildHeapRangeScan() to be an ereport() with
ERRCODE_DATA_CORRUPTION. I'm referring to the "failed to find parent
tuple for heap-only tuple" error, which I think merits being a real
user-visible error, just like the relfrozenxid/relminmxid tests are
now. As you know, the enhanced amcheck will sometimes detect
corruption due to this bug by hitting that error.

It would be nice if we could tighten up the number of errcodes that
can be involved in an error that amcheck detects. I know that elog()
implicitly has an errcode, that could be included in the errcodes to
check when verification occurs in an automated fashion across a large
number of databases. However, this is a pretty esoteric point, and I'd
prefer to just try to limit it to ERRCODE_DATA_CORRUPTION and
ERRCODE_INDEX_CORRUPTED, insofar as that's practical. When we ran
amcheck on the Heroku fleet, back when I worked there, there were all
kinds of non-interesting errors that could occur that needed to be
filtered out. I want to try to make that process somewhat less painful.

-- 
Peter Geoghegan



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-14 Thread Andres Freund
On 2017-12-14 17:00:29 -0800, Andres Freund wrote:
> Hi,
> 
> On 2017-11-13 19:03:41 -0800, Andres Freund wrote:
> > diff --git a/src/backend/access/heap/rewriteheap.c 
> > b/src/backend/access/heap/rewriteheap.c
> > index f93c194e182..7d163c91379 100644
> > --- a/src/backend/access/heap/rewriteheap.c
> > +++ b/src/backend/access/heap/rewriteheap.c
> > @@ -407,7 +407,10 @@ rewrite_heap_tuple(RewriteState state,
> >  * While we have our hands on the tuple, we may as well freeze any
> >  * eligible xmin or xmax, so that future VACUUM effort can be saved.
> >  */
> > -   heap_freeze_tuple(new_tuple->t_data, state->rs_freeze_xid,
> > +   heap_freeze_tuple(new_tuple->t_data,
> > + 
> > state->rs_old_rel->rd_rel->relfrozenxid,
> > + state->rs_old_rel->rd_rel->relminmxid,
> > + state->rs_freeze_xid,
> >   state->rs_cutoff_multi);
> 
> Hm. So this requires backpatching the introduction of
> RewriteStateData->rs_old_rel into 9.3, which in turn requires a new
> argument to begin_heap_rewrite().  It originally was added in the
> logical decoding commit (i.e. 9.4).
> 
> I'm fine with that, but it could theoretically cause issues for somebody
> with an extension that calls begin_heap_rewrite() - which seems unlikely
> and I couldn't find any that does so.
> 
> Does anybody have a problem with that?

Pushed this way.  Moved some more relfrozenxid/relminmxid tests outside
of the cutoff changes, polished some error messages.


Alvaro, Michael, Peter, and everyone else I'd greatly appreciate if you
could have a look at the backported version, just about everything but
v10 had conflicts, some of them not insubstantial.

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-14 Thread Andres Freund
Hi,

On 2017-11-13 19:03:41 -0800, Andres Freund wrote:
> diff --git a/src/backend/access/heap/rewriteheap.c 
> b/src/backend/access/heap/rewriteheap.c
> index f93c194e182..7d163c91379 100644
> --- a/src/backend/access/heap/rewriteheap.c
> +++ b/src/backend/access/heap/rewriteheap.c
> @@ -407,7 +407,10 @@ rewrite_heap_tuple(RewriteState state,
>* While we have our hands on the tuple, we may as well freeze any
>* eligible xmin or xmax, so that future VACUUM effort can be saved.
>*/
> - heap_freeze_tuple(new_tuple->t_data, state->rs_freeze_xid,
> + heap_freeze_tuple(new_tuple->t_data,
> +   
> state->rs_old_rel->rd_rel->relfrozenxid,
> +   state->rs_old_rel->rd_rel->relminmxid,
> +   state->rs_freeze_xid,
> state->rs_cutoff_multi);

Hm. So this requires backpatching the introduction of
RewriteStateData->rs_old_rel into 9.3, which in turn requires a new
argument to begin_heap_rewrite().  It originally was added in the
logical decoding commit (i.e. 9.4).

I'm fine with that, but it could theoretically cause issues for somebody
with an extension that calls begin_heap_rewrite() - which seems unlikely
and I couldn't find any that does so.

Does anybody have a problem with that?

Regards,

Andres



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-14 Thread Andres Freund
On 2017-12-07 18:32:51 +0900, Michael Paquier wrote:
> On Thu, Dec 7, 2017 at 5:23 AM, Alvaro Herrera  
> wrote:
> > Looking at 0002: I agree with the stuff being done here.
> 
> The level of details you are providing with a proper error code is an
> improvement over the first version proposed in my opinion.
> 
> > I think a
> > couple of these checks could be moved one block outerwards in term of
> > scope; I don't see any reason why the check should not apply in that
> > case. I didn't catch any place missing additional checks.
> 
> In FreezeMultiXactId() wouldn't it be better to issue an error as well
> for this assertion?
> Assert(!TransactionIdPrecedes(members[i].xid, cutoff_xid));

I'm not really concerned that much about pure lockers, they don't cause
permanent data corruption...


> > Despite these being "shouldn't happen" conditions, I think we should
> > turn these up all the way to ereports with an errcode and all, and also
> > report the XIDs being complained about.  No translation required,
> > though.  Other than those changes and minor copy editing a commit
> > (attached), 0002 looks good to me.

If you want to go around doing that in some more places we can do so in
master only...


> +   if (!(tuple->t_infomask & HEAP_XMAX_LOCK_ONLY) &&
> +   TransactionIdDidCommit(xid))
> +   ereport(ERROR,
> +   (errcode(ERRCODE_DATA_CORRUPTED),
> +errmsg("can't freeze committed xmax %u", xid)));
> The usual wording used in errmsg is not the "can't" but "cannot".
> 
> +   ereport(ERROR,
> +   (errcode(ERRCODE_DATA_CORRUPTED),
> +errmsg_internal("uncommitted Xmin %u from
> before xid cutoff %u needs to be frozen",
> +xid, cutoff_xid)));
> "Xmin" I have never seen, but "xmin" I did.

Changed...

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-07 Thread Andres Freund
On 2017-12-07 17:41:56 -0300, Alvaro Herrera wrote:
> > > Looking at 0002: I agree with the stuff being done here.  I think a
> > > couple of these checks could be moved one block outerwards in term of
> > > scope; I don't see any reason why the check should not apply in that
> > > case.  I didn't catch any place missing additional checks.
> > 
> > I think I largely put them into the inner blocks because they were
> > guaranteed to be reached in those case (the horizon has to be before the
> > cutoff etc), and that way additional branches are avoided.
> 
> Hmm, it should be possible to call vacuum with a very low freeze_min_age
> (which sets a very recent relfrozenxid), then shortly thereafter call it
> with a large one, no?  So it's not really guaranteed ...

Fair point!

- Andres



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-07 Thread Alvaro Herrera
Hello,

Andres Freund wrote:

> On 2017-12-06 17:23:55 -0300, Alvaro Herrera wrote:
> > > I've played around quite some with the attached patch. So far, after
> > > applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
> > > the situation worse for already existing corruption. HOT pruning can
> > > change the exact appearance of existing corruption a bit, but I don't
> > > think it can make the corruption meaningfully worse.  It's a bit
> > > annoying and scary to add so many checks to backbranches but it kinda
> > > seems required.  The error message texts aren't perfect, but these are
> > > "should never be hit" type elog()s so I'm not too worried about that.
> > 
> > Looking at 0002: I agree with the stuff being done here.  I think a
> > couple of these checks could be moved one block outerwards in term of
> > scope; I don't see any reason why the check should not apply in that
> > case.  I didn't catch any place missing additional checks.
> 
> I think I largely put them into the inner blocks because they were
> guaranteed to be reached in those case (the horizon has to be before the
> cutoff etc), and that way additional branches are avoided.

Hmm, it should be possible to call vacuum with a very low freeze_min_age
(which sets a very recent relfrozenxid), then shortly thereafter call it
with a large one, no?  So it's not really guaranteed ...


> > Despite these being "shouldn't happen" conditions, I think we should
> > turn these up all the way to ereports with an errcode and all, and also
> > report the XIDs being complained about.  No translation required,
> > though.  Other than those changes and minor copy editing a commit
> > (attached), 0002 looks good to me.
> 
> Hm, I don't really care one way or another. I do see that you used
> errmsg() in some places, errmsg_internal() in others. Was that
> intentional?

Eh, no, my intention was to make these all errmsg_internal() to avoid
translation (serves no purpose here).  Feel free to update the remaining
ones.


> > I started thinking it'd be good to report block number whenever anything
> > happened while scanning the relation.  The best way to go about this
> > seems to be to add an errcontext callback to lazy_scan_heap, so I attach
> > a WIP untested patch to add that.  (I'm not proposing this for
> > back-patch for now, mostly because I don't have the time/energy to push
> > for it right now.)
> 
> That seems like a good idea. There's some cases where that could
> increase log spam noticeably (unitialized blocks), but that seems
> acceptable.

Yeah, I noticed that and I agree it seems ok.

> > +   if (info->blkno != InvalidBlockNumber)
> > +   errcontext("while scanning page %u of relation %s",
> > +  info->blkno, 
> > RelationGetRelationName(info->relation));
> > +   else
> > +   errcontext("while vacuuming relation %s",
> > +  RelationGetRelationName(info->relation));
> 
> Hm, perhaps rephrase so both messages refer to vacuuming? E.g. just by
> replacing scanning with vacuuming?

Makes sense.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-07 Thread Andres Freund
On 2017-12-07 12:08:38 +0900, Michael Paquier wrote:
> > Your commit message does a poor job of acknowledging prior work on
> > diagnosing the problem starting from Dan's initial test case and patch.
> 
> (Nit: I have extracted from the test case of Dan an isolation test,
> which Andres has reduced to a subset of permutations. Peter G. also
> complained about what is visibly the same bug we are discussing here
> but without a test case.)

The previous versions of the test case didn't actually hit the real
issues, so I don't think that matter much.

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-07 Thread Andres Freund
Hi,

On 2017-12-06 17:23:55 -0300, Alvaro Herrera wrote:
> > I've played around quite some with the attached patch. So far, after
> > applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
> > the situation worse for already existing corruption. HOT pruning can
> > change the exact appearance of existing corruption a bit, but I don't
> > think it can make the corruption meaningfully worse.  It's a bit
> > annoying and scary to add so many checks to backbranches but it kinda
> > seems required.  The error message texts aren't perfect, but these are
> > "should never be hit" type elog()s so I'm not too worried about that.
> 
> Looking at 0002: I agree with the stuff being done here.  I think a
> couple of these checks could be moved one block outerwards in term of
> scope; I don't see any reason why the check should not apply in that
> case.  I didn't catch any place missing additional checks.

I think I largely put them into the inner blocks because they were
guaranteed to be reached in those case (the horizon has to be before the
cutoff etc), and that way additional branches are avoided.


> Despite these being "shouldn't happen" conditions, I think we should
> turn these up all the way to ereports with an errcode and all, and also
> report the XIDs being complained about.  No translation required,
> though.  Other than those changes and minor copy editing a commit
> (attached), 0002 looks good to me.

Hm, I don't really care one way or another. I do see that you used
errmsg() in some places, errmsg_internal() in others. Was that
intentional?

> I started thinking it'd be good to report block number whenever anything
> happened while scanning the relation.  The best way to go about this
> seems to be to add an errcontext callback to lazy_scan_heap, so I attach
> a WIP untested patch to add that.  (I'm not proposing this for
> back-patch for now, mostly because I don't have the time/energy to push
> for it right now.)

That seems like a good idea. There's some cases where that could
increase log spam noticeably (unitialized blocks), but that seems
acceptable.


> +static void
> +lazy_scan_heap_cb(void *arg)
> +{
> + LazyScanHeapInfo *info = (LazyScanHeapInfo *) arg;
> +
> + if (info->blkno != InvalidBlockNumber)
> + errcontext("while scanning page %u of relation %s",
> +info->blkno, 
> RelationGetRelationName(info->relation));
> + else
> + errcontext("while vacuuming relation %s",
> +RelationGetRelationName(info->relation));
> +}

Hm, perhaps rephrase so both messages refer to vacuuming? E.g. just by
replacing scanning with vacuuming?

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-07 Thread Andres Freund
Hi,

On 2017-12-06 13:21:15 -0300, Alvaro Herrera wrote:
> I think you've done a stellar job of identifying what the actual problem
> was.  I like the new (simpler) coding of that portion of
> HeapTupleSatisfiesVacuum.

Thanks!

> freeze-the-dead is not listed in isolation_schedule; an easy fix.

Yea, I'd sent an update about that, stupidly forgot git amend the
commit...


> I confirm that the test crashes with an assertion failure without the
> code fix, and that it doesn't with it.
> 
> I think the comparison to OldestXmin should be reversed:
> 
>   if (!TransactionIdPrecedes(xmax, OldestXmin))
>   return HEAPTUPLE_RECENTLY_DEAD;
> 
>   return HEAPTUPLE_DEAD;
> 
> This way, an xmax that has exactly the OldestXmin value will return
> RECENTLY_DEAD rather DEAD, which seems reasonable to me (since
> OldestXmin value itself is supposed to be still possibly visible to
> somebody).

Yes, I think you're right. That's a bug.


> Your commit message does a poor job of acknowledging prior work on
> diagnosing the problem starting from Dan's initial test case and patch.

Yea, you're right. I was writing it with 14h of jetlag, apparently that
does something to your brain...

Greetings,

Andres Freund



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-07 Thread Michael Paquier
On Thu, Dec 7, 2017 at 5:23 AM, Alvaro Herrera  wrote:
> Looking at 0002: I agree with the stuff being done here.

The level of details you are providing with a proper error code is an
improvement over the first version proposed in my opinion.

> I think a
> couple of these checks could be moved one block outerwards in term of
> scope; I don't see any reason why the check should not apply in that
> case. I didn't catch any place missing additional checks.

In FreezeMultiXactId() wouldn't it be better to issue an error as well
for this assertion?
Assert(!TransactionIdPrecedes(members[i].xid, cutoff_xid));

> Despite these being "shouldn't happen" conditions, I think we should
> turn these up all the way to ereports with an errcode and all, and also
> report the XIDs being complained about.  No translation required,
> though.  Other than those changes and minor copy editing a commit
> (attached), 0002 looks good to me.

+   if (!(tuple->t_infomask & HEAP_XMAX_LOCK_ONLY) &&
+   TransactionIdDidCommit(xid))
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg("can't freeze committed xmax %u", xid)));
The usual wording used in errmsg is not the "can't" but "cannot".

+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("uncommitted Xmin %u from
before xid cutoff %u needs to be frozen",
+xid, cutoff_xid)));
"Xmin" I have never seen, but "xmin" I did.

> I started thinking it'd be good to report block number whenever anything
> happened while scanning the relation.  The best way to go about this
> seems to be to add an errcontext callback to lazy_scan_heap, so I attach
> a WIP untested patch to add that.  (I'm not proposing this for
> back-patch for now, mostly because I don't have the time/energy to push
> for it right now.)

I would recommend to start a new thread and to add that patch to the
next commit fest as you would get more visibility and input from other
folks on -hackers. It looks like a good idea to me.
-- 
Michael



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-06 Thread Michael Paquier
On Thu, Dec 7, 2017 at 1:21 AM, Alvaro Herrera  wrote:
> Put together, I propose the attached delta for 0001.

I have been looking at Andres' 0001 and your tweaks here for some time
since yesterday...

I have also performed sanity checks using all the scripts that have
accumulated on my archives for this stuff. This looks solid to me. I
have not seen failures with broken hot chains, REINDEX, etc.

> This way, an xmax that has exactly the OldestXmin value will return
> RECENTLY_DEAD rather DEAD, which seems reasonable to me (since
> OldestXmin value itself is supposed to be still possibly visible to
> somebody).  Also, this way it is consistent with the other comparison to
> OldestXmin at the bottom of the function.  There is no reason for the
> "else" or the extra braces.

+1. It would be nice to add a comment in the patched portion
mentioning that the new code had better match what is at the bottom of
the function.

+   else if (!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
+   {
+   /*
+* Not in Progress, Not Committed, so either Aborted or crashed.
+* Mark the Xmax as invalid.
+*/
+   SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
}

-   /*
-* Not in Progress, Not Committed, so either Aborted or crashed.
-* Remove the Xmax.
-*/
-   SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
return HEAPTUPLE_LIVE;

I would find cleaner if the last "else if" is put into its own
separate if condition, and that for a multixact still running this
refers to an updating transaction aborted so hint bits are not set.

> Your commit message does a poor job of acknowledging prior work on
> diagnosing the problem starting from Dan's initial test case and patch.

(Nit: I have extracted from the test case of Dan an isolation test,
which Andres has reduced to a subset of permutations. Peter G. also
complained about what is visibly the same bug we are discussing here
but without a test case.)
-- 
Michael



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-06 Thread Alvaro Herrera
Andres Freund wrote:

> I've played around quite some with the attached patch. So far, after
> applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
> the situation worse for already existing corruption. HOT pruning can
> change the exact appearance of existing corruption a bit, but I don't
> think it can make the corruption meaningfully worse.  It's a bit
> annoying and scary to add so many checks to backbranches but it kinda
> seems required.  The error message texts aren't perfect, but these are
> "should never be hit" type elog()s so I'm not too worried about that.

Looking at 0002: I agree with the stuff being done here.  I think a
couple of these checks could be moved one block outerwards in term of
scope; I don't see any reason why the check should not apply in that
case.  I didn't catch any place missing additional checks.

Despite these being "shouldn't happen" conditions, I think we should
turn these up all the way to ereports with an errcode and all, and also
report the XIDs being complained about.  No translation required,
though.  Other than those changes and minor copy editing a commit
(attached), 0002 looks good to me.

I started thinking it'd be good to report block number whenever anything
happened while scanning the relation.  The best way to go about this
seems to be to add an errcontext callback to lazy_scan_heap, so I attach
a WIP untested patch to add that.  (I'm not proposing this for
back-patch for now, mostly because I don't have the time/energy to push
for it right now.)

It appears that you got all the places that seem to reasonably need
additional checks.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 9ac665638c86460f0b767628203f8bf28df35e49 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 6 Dec 2017 16:44:25 -0300
Subject: [PATCH 1/2] tweaks for 0002

---
 src/backend/access/heap/heapam.c  | 67 +++
 src/backend/commands/vacuumlazy.c |  6 ++--
 2 files changed, 50 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index eb93718baa..5c284a4c32 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6387,17 +6387,23 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
else if (MultiXactIdPrecedes(multi, cutoff_multi))
{
if (MultiXactIdPrecedes(multi, relminmxid))
-   elog(ERROR, "encountered multixact from before 
horizon");
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("found multixact %u 
from before relminmxid %u",
+multi, 
relminmxid)));
 
/*
-* This old multi cannot possibly have members still running.  
If it
-* was a locker only, it can be removed without any further
-* consideration; but if it contained an update, we might need 
to
-* preserve it.
+* This old multi cannot possibly have members still running, 
but
+* verify just in case.  If it was a locker only, it can be 
removed
+* without any further consideration; but if it contained an 
update, we
+* might need to preserve it.
 */
if (MultiXactIdIsRunning(multi,
 
HEAP_XMAX_IS_LOCKED_ONLY(t_infomask)))
-   elog(ERROR, "multixact from before cutoff found to be 
still running");
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("multixact %u from 
before cutoff %u found to be still running",
+multi, 
cutoff_multi)));
 
if (HEAP_XMAX_IS_LOCKED_ONLY(t_infomask))
{
@@ -6419,9 +6425,14 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
if (TransactionIdPrecedes(xid, cutoff_xid))
{
if (TransactionIdPrecedes(xid, relfrozenxid))
-   elog(ERROR, "encountered xid from 
before horizon");
+   ereport(ERROR,
+   
(errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("found 
xid %u from before relfrozenxid %u",
+   
 xid, relfrozenxid)));
if 

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-06 Thread Alvaro Herrera
I think you've done a stellar job of identifying what the actual problem
was.  I like the new (simpler) coding of that portion of
HeapTupleSatisfiesVacuum.

freeze-the-dead is not listed in isolation_schedule; an easy fix.
I confirm that the test crashes with an assertion failure without the
code fix, and that it doesn't with it.

I think the comparison to OldestXmin should be reversed:

if (!TransactionIdPrecedes(xmax, OldestXmin))
return HEAPTUPLE_RECENTLY_DEAD;

return HEAPTUPLE_DEAD;

This way, an xmax that has exactly the OldestXmin value will return
RECENTLY_DEAD rather DEAD, which seems reasonable to me (since
OldestXmin value itself is supposed to be still possibly visible to
somebody).  Also, this way it is consistent with the other comparison to
OldestXmin at the bottom of the function.  There is no reason for the
"else" or the extra braces.

Put together, I propose the attached delta for 0001.

Your commit message does a poor job of acknowledging prior work on
diagnosing the problem starting from Dan's initial test case and patch.

I haven't looked at your 0002 yet.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 8be2980116..aab03835d1 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -1324,25 +1324,23 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId 
OldestXmin,
else if (TransactionIdDidCommit(xmax))
{
/*
-* The multixact might still be running due to lockers. 
If the
-* updater is below the horizon we have to return DEAD 
regardless
-* - otherwise we could end up with a tuple where the 
updater has
-* to be removed due to the horizon, but is not pruned 
away. It's
-* not a problem to prune that tuple because all the 
lockers will
-* also be present in the newer tuple version.
+* The multixact might still be running due to lockers. 
 If the
+* updater is below the xid horizon, we have to return 
DEAD
+* regardless -- otherwise we could end up with a tuple 
where the
+* updater has to be removed due to the horizon, but is 
not pruned
+* away.  It's not a problem to prune that tuple, 
because any
+* remaining lockers will also be present in newer 
tuple versions.
 */
-   if (TransactionIdPrecedes(xmax, OldestXmin))
-   {
-   return HEAPTUPLE_DEAD;
-   }
-   else
+   if (!TransactionIdPrecedes(xmax, OldestXmin))
return HEAPTUPLE_RECENTLY_DEAD;
+
+   return HEAPTUPLE_DEAD;
}
else if 
(!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
{
/*
 * Not in Progress, Not Committed, so either Aborted or 
crashed.
-* Remove the Xmax.
+* Mark the Xmax as invalid.
 */
SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, 
InvalidTransactionId);
}
diff --git a/src/test/isolation/isolation_schedule 
b/src/test/isolation/isolation_schedule
index e41b9164cd..eb566ebb6c 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -44,6 +44,7 @@ test: update-locked-tuple
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
+test: freeze-the-dead
 test: nowait
 test: nowait-2
 test: nowait-3


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-12-06 Thread Michael Paquier
On Wed, Dec 6, 2017 at 5:03 PM, Andres Freund  wrote:
> Ping.  I'm a bit surprised that a bug fixing a significant data
> corruption issue has gotten no reviews at all.

Note that I was planning to look at this problem today and tomorrow my
time, getting stuck for CF handling last week and conference this
week. If you think that helps, I'll be happy to help at the extent of
what I can do.
-- 
Michael



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-20 Thread Andres Freund
Hi,

On 2017-11-13 19:03:41 -0800, Andres Freund wrote:
> Hi,
> 
> On 2017-11-03 07:53:30 -0700, Andres Freund wrote:
> > Here's that patch.  I've stared at this some, and Robert did too. Robert
> > mentioned that the commit message might need some polish and I'm not
> > 100% sure about the error message texts yet.
> > 
> > I'm not yet convinced that the new elog in vacuumlazy can never trigger
> > - but I also don't think we want to actually freeze the tuple in that
> > case.
> 
> I'm fairly sure it could be triggered, therefore I've rewritten that.
> 
> I've played around quite some with the attached patch. So far, after
> applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
> the situation worse for already existing corruption. HOT pruning can
> change the exact appearance of existing corruption a bit, but I don't
> think it can make the corruption meaningfully worse.  It's a bit
> annoying and scary to add so many checks to backbranches but it kinda
> seems required.  The error message texts aren't perfect, but these are
> "should never be hit" type elog()s so I'm not too worried about that.
> 
> 
> Please review!

Ping? Alvaro, it'd be good to get some input here.

- Andres



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-13 Thread Andres Freund
Hi,

On 2017-11-03 07:53:30 -0700, Andres Freund wrote:
> Here's that patch.  I've stared at this some, and Robert did too. Robert
> mentioned that the commit message might need some polish and I'm not
> 100% sure about the error message texts yet.
> 
> I'm not yet convinced that the new elog in vacuumlazy can never trigger
> - but I also don't think we want to actually freeze the tuple in that
> case.

I'm fairly sure it could be triggered, therefore I've rewritten that.

I've played around quite some with the attached patch. So far, after
applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
the situation worse for already existing corruption. HOT pruning can
change the exact appearance of existing corruption a bit, but I don't
think it can make the corruption meaningfully worse.  It's a bit
annoying and scary to add so many checks to backbranches but it kinda
seems required.  The error message texts aren't perfect, but these are
"should never be hit" type elog()s so I'm not too worried about that.


Please review!


One thing I'm wondering about is whether we have any way for users to
diagnose whether they need to dump & restore - I can't really think of
anything actually meaningful. Reindexing will find some instances, but
far from all. VACUUM (disable_page_skipping, freeze) pg_class will
also find a bunch.  Not a perfect story.


> Staring at the vacuumlazy hunk I think I might have found a related bug:
> heap_update_tuple() just copies the old xmax to the new tuple's xmax if
> a multixact and still running.  It does so without verifying liveliness
> of members.  Isn't that buggy? Consider what happens if we have three
> blocks: 1 has free space, two is being vacuumed and is locked, three is
> full and has a tuple that's key share locked by a live tuple and is
> updated by a dead xmax from before the xmin horizon. In that case afaict
> the multi will be copied from the third page to the first one.  Which is
> quite bad, because vacuum already processed it, and we'll set
> relfrozenxid accordingly.  I hope I'm missing something here?

Trying to write a testcase for that now.

Greetings,

Andres Freund
>From 6d9f0b60da30aca9bf5eaab814b1d5d7f1ccb8c4 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Fri, 3 Nov 2017 07:52:29 -0700
Subject: [PATCH 1/2] Fix pruning of locked and updated tuples.

Previously it was possible that a tuple was not pruned during vacuum,
even though it's update xmax (i.e. the updating xid in a multixact
with both key share lockers and an updater) was below the cutoff
horizon.

As the freezing code assumed, rightly so, that that's not supposed to
happen, xmax would be preserved (as a member of a new multixact or
xmax directly). That causes two problems: For one the tuple is below
the xmin horizon, which can cause problems if the clog is truncated or
once there's an xid wraparound. The bigger problem is that that will
break HOT chains, which in turn can lead two to breakages: First,
failing index lookups, which in turn can e.g lead to constraints being
violated. Second, future hot prunes / vacuums can end up making
invisible tuples visible again. There's other harmful scenarios.

Fix the problem by recognizing that tuples can be DEAD instead of
RECENTLY_DEAD, even if the multixactid has alive members, if the
update_xid is below the xmin horizon. That's safe because newer
versions of the tuple will contain the locking xids.

This commit also rewrites, and activates, a previously added
regression test to actually showcase corruption without the fix
applied.

A followup commit will harden the code somewhat against future similar
bugs and already corrupted data.

Author: Andres Freund
Per-Discussion: Robert Haas and Freund
Reported-By: Daniel Wood
Discussion:
https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com
https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de
Backpatch: 9.3-
---
 src/backend/utils/time/tqual.c  |  61 ++
 src/test/isolation/expected/freeze-the-dead.out | 107 +---
 src/test/isolation/specs/freeze-the-dead.spec   |  36 +++-
 3 files changed, 82 insertions(+), 122 deletions(-)

diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index a821e2eed10..8be2980116c 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -1311,49 +1311,42 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 
 	if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
 	{
-		TransactionId xmax;
+		TransactionId xmax = HeapTupleGetUpdateXid(tuple);
 
-		if (MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
-		{
-			/* already checked above */
-			Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));
-
-			xmax = HeapTupleGetUpdateXid(tuple);
-
-			/* not LOCKED_ONLY, so it has to have an xmax */
-			Assert(TransactionIdIsValid(xmax));
-
-			if (TransactionIdIsInProgress(xmax))
-return