Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-26 Thread Michael Paquier
On Thu, Apr 27, 2017 at 2:25 AM, Tom Lane  wrote:
> Simon Riggs  writes:
>> I've added code following Michael and Tom's comments to the previous
>> patch. New patch attached.
>
> Couple of minor suggestions:
>
> * Rather than deleting the comment for SubTransSetParent entirely,
> maybe make it say "It's possible that the parent was already recorded.
> However, we should never be asked to change an already-set entry to
> something else."
>
> * In SubTransGetTopmostTransaction, maybe it'd be better to spell
> "TransactionIdFollowsOrEquals" as "!TransactionIdPrecedes" to make
> it look more like the test just above.  Matter of taste though.
>
> * Less a matter of taste is that I think that should be just elog(ERROR);
> there's no good reason to make it FATAL.
>
> * Also, I think there should be a comment there, along the lines of
> "check for reversed linkage to ensure this isn't an infinite loop".

No more comments from here, thanks for working on the patch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-26 Thread Tom Lane
Simon Riggs  writes:
> I've added code following Michael and Tom's comments to the previous
> patch. New patch attached.

Couple of minor suggestions:

* Rather than deleting the comment for SubTransSetParent entirely,
maybe make it say "It's possible that the parent was already recorded.
However, we should never be asked to change an already-set entry to
something else."

* In SubTransGetTopmostTransaction, maybe it'd be better to spell
"TransactionIdFollowsOrEquals" as "!TransactionIdPrecedes" to make
it look more like the test just above.  Matter of taste though.

* Less a matter of taste is that I think that should be just elog(ERROR);
there's no good reason to make it FATAL.

* Also, I think there should be a comment there, along the lines of
"check for reversed linkage to ensure this isn't an infinite loop".

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-26 Thread Simon Riggs
On 26 April 2017 at 15:28, Tom Lane  wrote:
> Nikhil Sontakke  writes:
>> A SELECT query on the newly promoted master on any of the tables involved
>> in the 2PC hangs. The hang is due to a loop in
>> SubTransGetTopmostTransaction(). Due to incorrect linkages, we get a
>> circular reference in parentxid <-> subxid inducing the infinite loop.
>
> I'm inclined to propose that
>
> (1) SubTransSetParent should contain an assert that
> Assert(TransactionIdFollows(xid, parent));
> There is a similar assertion near one of the call sites, but that's
> obviously too far away from the action.
>
> (2) SubTransGetTopmostTransaction ought to contain an actual runtime
> test and ereport (not just Assert) that the parent XID it got from
> pg_subtrans precedes the child XID that was looked up.  This would
> protect us against similar infinite loops in the future.  Even without
> bugs, such a loop could arise due to corrupted data.

Pretty much what I was thinking.


I've added code following Michael and Tom's comments to the previous
patch. New patch attached.

Passes with and without Nikhil's new test.

I plan to apply both patches tomorrow morning my time to give time for
further comments.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


2PC_rework.v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-26 Thread Tom Lane
Nikhil Sontakke  writes:
> A SELECT query on the newly promoted master on any of the tables involved
> in the 2PC hangs. The hang is due to a loop in
> SubTransGetTopmostTransaction(). Due to incorrect linkages, we get a
> circular reference in parentxid <-> subxid inducing the infinite loop.

I'm inclined to propose that

(1) SubTransSetParent should contain an assert that 
Assert(TransactionIdFollows(xid, parent));
There is a similar assertion near one of the call sites, but that's
obviously too far away from the action.

(2) SubTransGetTopmostTransaction ought to contain an actual runtime
test and ereport (not just Assert) that the parent XID it got from
pg_subtrans precedes the child XID that was looked up.  This would
protect us against similar infinite loops in the future.  Even without
bugs, such a loop could arise due to corrupted data.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-26 Thread Nikhil Sontakke
> I'm suggesting we take the approach that if there is a problem we can
> recreate it as a way of exploring what conditions are required and
> therefore work out the impact. Nikhil Sontakke appears to have
> re-created something, but not quite what I had expected. I think he
> will post here tomorrow with an update for us to discuss.
>
>

So, I reverted commit 0874d4f3e183757ba15a4b3f3bf563e0393dd9c2 to go back
to the earlier bad swapped arguments to SubTransSetParent resulting in
incorrect parent linkages and used the attached TAP test patch.

The test prepares a 2PC with more than 64 subtransactions. It then stops
the master and promotes the standby.

A SELECT query on the newly promoted master on any of the tables involved
in the 2PC hangs. The hang is due to a loop in
SubTransGetTopmostTransaction(). Due to incorrect linkages, we get a
circular reference in parentxid <-> subxid inducing the infinite loop.

Any further DML on these objects which will need to check visibility of
these tuples hangs as well. All unrelated objects and new transactions are
ok AFAICS.

I do not see any data loss, which is good. However tables involved in the
2PC are inaccessible till after a hard restart.

The attached TAP test patch can be considered for commit to test handling
2PC with large subtransactions on promoted standby instances.

Regards,
Nikhils
-- 
 Nikhil Sontakke   http://www.2ndQuadrant.com/
 PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services


subxid_bug_with_test_case_v1.0.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-25 Thread Michael Paquier
On Wed, Apr 26, 2017 at 4:53 AM, Simon Riggs  wrote:
> On 25 April 2017 at 16:28, Tom Lane  wrote:
>> Simon Riggs  writes:
>>> I can't see any reason now why overwriteOK should exist at all. I'm
>>> guessing that the whole "overwriteOK" idea was an incorrect response
>>> to xids appearing where they shouldn't have done because of the
>>> mistake you just corrected. So I will now remove the parameter from
>>> the call.
>>
>> Seems reasonable, but I don't like the logic change you made in
>> SubTransSetParent; you broke the former invariant, for non-Assert
>> builds, that the target pg_subtrans entry is guaranteed to have
>> the correct value on exit.  I do like fixing it to not dirty the
>> page unnecessarily, but I'd suggest that we write it like
>>
>> if (*ptr != parent)
>> {
>> Assert(*ptr == InvalidTransactionId);
>> *ptr = parent;
>> SubTransCtl->shared->page_dirty[slotno] = true;
>> }
>
> OK, thanks. I'll commit that tomorrow.

A couple of comments about the last patch of this thread.

Nit: there is a complain about a trailing whitespace:
$ git diff --check
src/backend/access/transam/twophase.c:2011: trailing whitespace.
+ bool fromdisk,

+* It's possible that SubTransSetParent has been set before, if
+* the prepared transaction generated xid assignment records.
 */
for (i = 0; i < hdr->nsubxacts; i++)
-   SubTransSetParent(subxids[i], xid, true);
+   SubTransSetParent(subxids[i], xid);
You can remove this part in RecoverPreparedTransactions() and just
switch ProcessTwoPhaseBuffer()'s setParent to true to do the same
thing. The existing comment block should be moved before
ProcessTwoPhaseBuffer() call. It is critical to mention that
pg_subtrans is not kept after a restart.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-25 Thread Andres Freund
On 2017-04-25 21:22:44 +0100, Simon Riggs wrote:
> On 24 April 2017 at 19:59, Andres Freund  wrote:
> 
> > I don't think that's generally true.
> 
> If you think that, from a risk perspective it is enough for me to
> continue to investigate and I have been doing that.

I'm not sure it's worth digging it all that much - I think we just
shouldn't claim in the release notes that the bug doesn't have any
consequences, but that it's though to only matter in unlikely corner
cases.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-25 Thread Simon Riggs
On 24 April 2017 at 19:59, Andres Freund  wrote:

> I don't think that's generally true.

If you think that, from a risk perspective it is enough for me to
continue to investigate and I have been doing that.

As I said before I thought I had found a problem.

I'm suggesting we take the approach that if there is a problem we can
recreate it as a way of exploring what conditions are required and
therefore work out the impact. Nikhil Sontakke appears to have
re-created something, but not quite what I had expected. I think he
will post here tomorrow with an update for us to discuss.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-25 Thread Simon Riggs
On 25 April 2017 at 16:28, Tom Lane  wrote:
> Simon Riggs  writes:
>> I can't see any reason now why overwriteOK should exist at all. I'm
>> guessing that the whole "overwriteOK" idea was an incorrect response
>> to xids appearing where they shouldn't have done because of the
>> mistake you just corrected. So I will now remove the parameter from
>> the call.
>
> Seems reasonable, but I don't like the logic change you made in
> SubTransSetParent; you broke the former invariant, for non-Assert
> builds, that the target pg_subtrans entry is guaranteed to have
> the correct value on exit.  I do like fixing it to not dirty the
> page unnecessarily, but I'd suggest that we write it like
>
> if (*ptr != parent)
> {
> Assert(*ptr == InvalidTransactionId);
> *ptr = parent;
> SubTransCtl->shared->page_dirty[slotno] = true;
> }

OK, thanks. I'll commit that tomorrow.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-25 Thread Tom Lane
Simon Riggs  writes:
> I can't see any reason now why overwriteOK should exist at all. I'm
> guessing that the whole "overwriteOK" idea was an incorrect response
> to xids appearing where they shouldn't have done because of the
> mistake you just corrected. So I will now remove the parameter from
> the call.

Seems reasonable, but I don't like the logic change you made in
SubTransSetParent; you broke the former invariant, for non-Assert
builds, that the target pg_subtrans entry is guaranteed to have
the correct value on exit.  I do like fixing it to not dirty the
page unnecessarily, but I'd suggest that we write it like

if (*ptr != parent)
{
Assert(*ptr == InvalidTransactionId);
*ptr = parent;
SubTransCtl->shared->page_dirty[slotno] = true;
}

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-24 Thread Simon Riggs
On 23 April 2017 at 17:17, Tom Lane  wrote:
> Simon Riggs  writes:
>>> Also, when I fix that, it gets further but still crashes at the same
>>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>>> it's computing that as "false", but in reality the subtrans link in
>>> question has already been set.
>
>> Not sure about that, investigating.
>
> As a quick hack, I just hotwired RecoverPreparedTransactions to set
> overwriteOK = true always, and with that and the SubTransSetParent
> argument-order fix, HEAD passes the recovery tests.  Maybe we can
> be smarter than that, but this might be a good short-term fix to get
> the buildfarm green again.

So my thoughts are now that this is indeed the long term fix.

I can't think of why it would be necessary to call SubTransSetParent()
twice with two different values. A subtransaction's top level xid
never changes, so pg_subtrans should be either zero or the xid of the
parent and never anything else.

I can't see any reason now why overwriteOK should exist at all. I'm
guessing that the whole "overwriteOK" idea was an incorrect response
to xids appearing where they shouldn't have done because of the
mistake you just corrected. So I will now remove the parameter from
the call.

(I'll address the discussion of the impact of that bug in separate post).

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


2PC_remove_overwriteOK.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-24 Thread Andres Freund
On 2017-04-24 13:29:11 +0100, Simon Riggs wrote:
> On 24 April 2017 at 00:25, Andres Freund  wrote:
> > if the subxid->xid mapping doesn't actually exist - as it's the case
> > with this bug afaics - we'll not get the correct toplevel
> > transaction.
> 
> The nature of the corruption is that in some cases
> * a subxid will point to nothing (even though in most cases it was
> already set correctly)
> * the parent will point to a subxid

Right. Those cases aren't that different from the point of trying to
find the parent of an subxid.


> > Which'll mean the following block:
> > /*
> >  * We now have either a top-level xid higher than xmin or an
> >  * indeterminate xid. We don't know whether it's top level 
> > or subxact
> >  * but it doesn't matter. If it's present, the xid is 
> > visible.
> >  */
> > for (j = 0; j < snapshot->subxcnt; j++)
> > {
> > if (TransactionIdEquals(xid, snapshot->subxip[j]))
> > return true;
> > }
> > won't work correctly if suboverflowed.
> 
> Your example of snapshots taken during recovery is not correct.

Oh?


> Note that SubTransGetTopmostTransaction() returns a valid, running
> xid, even though it is the wrong one.

Sure.


> Snapshots work differently on standbys - we store all known running
> xids, so the test still passes correctly, even when overflowed.

I don't think that's generally true.  Isn't that precisely what
ProcArrayStruct->lastOverflowedXid is about?  If we have a snapshot
that's suboverflowed due to the lastOverflowedXid cutoff, then we the
subxip array does *not* contain all known running xids anymore, we rely
on pg_subtrans to only guarantee that toplevel xids are stored in the
KnownAssignedXids machinery.

See:
 * When we throw away subXIDs from KnownAssignedXids, we need to keep track of
 * that, similarly to tracking overflow of a PGPROC's subxids array.  We do
 * that by remembering the lastOverflowedXID, ie the last thrown-away subXID.
 * As long as that is within the range of interesting XIDs, we have to assume
 * that subXIDs are missing from snapshots.  (Note that subXID overflow occurs
 * on primary when 65th subXID arrives, whereas on standby it occurs when 64th
 * subXID arrives - that is not an error.)

/*
 * Highest subxid that has been removed from KnownAssignedXids array to
 * prevent overflow; or InvalidTransactionId if none.  We track this for
 * similar reasons to tracking overflowing cached subxids in PGXACT
 * entries.  Must hold exclusive ProcArrayLock to change this, and 
shared
 * lock to read it.
 */
TransactionId lastOverflowedXid;


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-24 Thread Nikhil Sontakke
>
>  Also, when I fix that, it gets further but still crashes at the same
>  Assert in SubTransSetParent.  The proximate cause this time seems to
> be
>  that RecoverPreparedTransactions's calculation of overwriteOK is
> wrong:
>  it's computing that as "false", but in reality the subtrans link in
>  question has already been set.
> >>
> >>> Not sure about that, investigating.
> >>
> >> As a quick hack, I just hotwired RecoverPreparedTransactions to set
> >> overwriteOK = true always, and with that and the SubTransSetParent
> >> argument-order fix, HEAD passes the recovery tests.  Maybe we can
> >> be smarter than that, but this might be a good short-term fix to get
> >> the buildfarm green again.
> >
> > That would work. I've been looking into a fix I can explain, but "do
> > it always" may actually be it.
>
>
Here's what's happening:

On Master:

BEGIN;

INSERT INTO t_009_tbl VALUES (42);

SAVEPOINT s1;

INSERT INTO t_009_tbl VALUES (43);

PREPARE TRANSACTION 'xact_009_1';
On Master:

Do a fast shutdown

On Slave:

Restart the slave. This causes StandbyRecoverPreparedTransactions() to be
invoked which causes ProcessTwoPhaseBuffer() to be invoked with setParent
set to true. This sets up parent xid for the above subtransaction.

On Slave:

Promote the slave. This causes RecoverPreparedTransactions() to be invoked
which ends up calling SubTransSetParent() for the above subxid. The
overwriteOK bool remains false since we don't go over the
PGPROC_MAX_CACHED_SUBXIDS limit. Since the parent xid was already set on
restart above, we hit the assert.

---

I am wondering if StandbyRecoverPreparedTransactions() needs the parent
linkage at all? I don't see SubTransGetParent() and related functions being
called on the standby but we need to be careful here. As a quick test, If I
don't call SubTransSetParent() as part of
StandbyRecoverPreparedTransactions()  then all recovery tests pass ok. But
the fact that StandbyRecoverPreparedTransactions() takes overwriteOK as a
parameter means it might not be so simple..

Regards,
Nikhils
-- 
 Nikhil Sontakke   http://www.2ndQuadrant.com/
 PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-24 Thread Simon Riggs
On 24 April 2017 at 00:25, Andres Freund  wrote:

>> >> It's not clear to me how much potential this has to create user data
>> >> corruption, but it doesn't look good at first glance.  Discuss.
>> >
>> > Hm. I think it can cause wrong tqual.c results in some edge cases.
>> > During HS, lastOverflowedXid will be set in some cases, and then
>> > XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in
>> > turn cause lookups snapshot->subxip (where all HS xids reside)
>> > potentially return wrong results.
>> >
>> > I was about to say that I don't see how it could result in persistent
>> > corruption however - the subtrans lookups are only done for
>> > (snapshot->xmin, snapshot->xmax] and subtrans is truncated
>> > regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory
>> > anymore, so that might be delayed.  Hm.
>>
>> I've not found any reason, yet, to believe we return wrong answers in
>> any case, even though the transient data structure pg_subtrans is
>> corrupted by the switched call Tom discovers.
>
> I think I pointed out a danger above. Consider what happens if query on
> a standby has a suboverflowed snapshot:
> Snapshot
> GetSnapshotData(Snapshot snapshot)
> {
> ...
> if (TransactionIdPrecedesOrEquals(xmin, 
> procArray->lastOverflowedXid))
> suboverflowed = true;
> }
> ..
> snapshot->suboverflowed = suboverflowed;
> }
>
> In that case we rely on pg_subtrans for visibility determinations:
> bool
> HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
>Buffer buffer)
> {
> ...
> if (!HeapTupleHeaderXminCommitted(tuple))
> {
> ...
> else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), 
> snapshot))
> return false;
>
> and
> static bool
> XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
> {
> ...
> if (!snapshot->takenDuringRecovery)
> {
> ...
> else
> {
> ...
> if (snapshot->suboverflowed)
> {
> /*
>  * Snapshot overflowed, so convert xid to top-level.  
> This is safe
>  * because we eliminated too-old XIDs above.
>  */
> xid = SubTransGetTopmostTransaction(xid);
>
> if the subxid->xid mapping doesn't actually exist - as it's the case
> with this bug afaics - we'll not get the correct toplevel
> transaction.

The nature of the corruption is that in some cases
* a subxid will point to nothing (even though in most cases it was
already set correctly)
* the parent will point to a subxid

So both wrong.

> Which'll mean the following block:
> /*
>  * We now have either a top-level xid higher than xmin or an
>  * indeterminate xid. We don't know whether it's top level or 
> subxact
>  * but it doesn't matter. If it's present, the xid is visible.
>  */
> for (j = 0; j < snapshot->subxcnt; j++)
> {
> if (TransactionIdEquals(xid, snapshot->subxip[j]))
> return true;
> }
> won't work correctly if suboverflowed.

Your example of snapshots taken during recovery is not correct.

Note that SubTransGetTopmostTransaction() returns a valid, running
xid, even though it is the wrong one.

Snapshots work differently on standbys - we store all known running
xids, so the test still passes correctly, even when overflowed.

I'd call that just plain luck. We behave correctly, but for the wrong
reasons, at least in this case.


> I don't see anything prevent wrong results here?

I've had an even better look around now and I think I've found
something but need to turn it into a repeatable test case so I can
double-check this before reporting in full, so I don't cry wolf.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-23 11:05:44 +0100, Simon Riggs wrote:
> > Yikes.  This is clearly way undertested.  It's also pretty scary that
> > the code has recently been whacked out quite heavily (both 9.6 and
> > master), without hitting anything around this - doesn't seem to bode
> > well for how in-depth the testing was.
> 
> Obviously if there is a bug it's because tests didn't find it and
> therefore it is by definition undertested for that specific bug.

Sure. But there's bugs that are hard to catch, and there's bugs that are
easily testable. This seems to largely fall into the "relatively easy to
test" camp.


> I'm not sure what other conclusion you wish to draw, if any?

That the changes around whacking around twophase.c in several of the
last releases, didn't yield enough verified testing infrastructure.


> >> It's not clear to me how much potential this has to create user data
> >> corruption, but it doesn't look good at first glance.  Discuss.
> >
> > Hm. I think it can cause wrong tqual.c results in some edge cases.
> > During HS, lastOverflowedXid will be set in some cases, and then
> > XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in
> > turn cause lookups snapshot->subxip (where all HS xids reside)
> > potentially return wrong results.
> >
> > I was about to say that I don't see how it could result in persistent
> > corruption however - the subtrans lookups are only done for
> > (snapshot->xmin, snapshot->xmax] and subtrans is truncated
> > regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory
> > anymore, so that might be delayed.  Hm.
> 
> I've not found any reason, yet, to believe we return wrong answers in
> any case, even though the transient data structure pg_subtrans is
> corrupted by the switched call Tom discovers.

I think I pointed out a danger above. Consider what happens if query on
a standby has a suboverflowed snapshot:
Snapshot
GetSnapshotData(Snapshot snapshot)
{
...
if (TransactionIdPrecedesOrEquals(xmin, 
procArray->lastOverflowedXid))
suboverflowed = true;
}
..
snapshot->suboverflowed = suboverflowed;
}

In that case we rely on pg_subtrans for visibility determinations:
bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
   Buffer buffer)
{
...
if (!HeapTupleHeaderXminCommitted(tuple))
{
...
else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), 
snapshot))
return false;

and
static bool
XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
{
...
if (!snapshot->takenDuringRecovery)
{
...
else
{
...
if (snapshot->suboverflowed)
{
/*
 * Snapshot overflowed, so convert xid to top-level.  
This is safe
 * because we eliminated too-old XIDs above.
 */
xid = SubTransGetTopmostTransaction(xid);

if the subxid->xid mapping doesn't actually exist - as it's the case
with this bug afaics - we'll not get the correct toplevel
transaction. Which'll mean the following block:
/*
 * We now have either a top-level xid higher than xmin or an
 * indeterminate xid. We don't know whether it's top level or 
subxact
 * but it doesn't matter. If it's present, the xid is visible.
 */
for (j = 0; j < snapshot->subxcnt; j++)
{
if (TransactionIdEquals(xid, snapshot->subxip[j]))
return true;
}
won't work correctly if suboverflowed.

I don't see anything prevent wrong results here?


Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 18:41, Simon Riggs  wrote:
> On 23 April 2017 at 17:17, Tom Lane  wrote:
>> Simon Riggs  writes:
 Also, when I fix that, it gets further but still crashes at the same
 Assert in SubTransSetParent.  The proximate cause this time seems to be
 that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
 it's computing that as "false", but in reality the subtrans link in
 question has already been set.
>>
>>> Not sure about that, investigating.
>>
>> As a quick hack, I just hotwired RecoverPreparedTransactions to set
>> overwriteOK = true always, and with that and the SubTransSetParent
>> argument-order fix, HEAD passes the recovery tests.  Maybe we can
>> be smarter than that, but this might be a good short-term fix to get
>> the buildfarm green again.
>
> That would work. I've been looking into a fix I can explain, but "do
> it always" may actually be it.
>
> OK, I'll do that.

Done, tests pass again for me now.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 17:17, Tom Lane  wrote:
> Simon Riggs  writes:
>>> Also, when I fix that, it gets further but still crashes at the same
>>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>>> it's computing that as "false", but in reality the subtrans link in
>>> question has already been set.
>
>> Not sure about that, investigating.
>
> As a quick hack, I just hotwired RecoverPreparedTransactions to set
> overwriteOK = true always, and with that and the SubTransSetParent
> argument-order fix, HEAD passes the recovery tests.  Maybe we can
> be smarter than that, but this might be a good short-term fix to get
> the buildfarm green again.

That would work. I've been looking into a fix I can explain, but "do
it always" may actually be it.

OK, I'll do that.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Tom Lane
Simon Riggs  writes:
> On 23 April 2017 at 00:55, Tom Lane  wrote:
>> It's not clear to me how much potential this has to create user data
>> corruption, but it doesn't look good at first glance.  Discuss.

> SubTransSetParent() only matters for the case where we are half-way
> through a commit with a large commit. Since we do atomic updates of
> commit and subcommmit when on same page, this problem would only
> matter when top level xid and other subxids were separated across
> multiple pages and the issue would only affect a read only query
> checking visibility during the commit for that prepared transaction.
> Furthermore, the nature of the corruption is that we set the xid to
> point to the subxid; since we never mark a top-level commit as
> subcommitted, subtrans would never be consulted and so the corruption
> would not lead to any incorrect answer even during the window of
> exposure. So it looks to me like this error shouldn't cause a problem.

Still trying to wrap my head around this argument ... I agree that
incorrectly setting the parent's pg_subtrans entry can't cause a
visible problem, because it will never be consulted.  However, failing
to set the child's entry seems like it could cause transient problems.
There would only be a risk during the eventual commit of the prepared
transaction, and only when the pg_xact entries span multiple pages,
but then there would be a window where the child xact is marked
subcommitted but it has a zero entry in pg_subtrans.  That would
result in a WARNING from TransactionIdDidCommit, and a false result,
which depending on timing might mean that commit of the overall
transaction appears non-atomic to onlookers.  (That is, the parent
xact might already appear committed to them, but the subtransaction
not yet.)

I can't find any indication in the archives that anyone's ever reported
seeing that WARNING, though that might only mean that they'd not noticed
it.  But in any case, it seems like the worst consequence is a narrow
window for seeing non-atomic commit of a prepared transaction on a standby
server.  That seems pretty unlikely to cause real-world problems.

The other point about overwriteOK not being set when it needs to be
also seems like it could not cause any problems in production builds,
since that flag is only examined by an Assert.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Tom Lane
Simon Riggs  writes:
>> Also, when I fix that, it gets further but still crashes at the same
>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>> it's computing that as "false", but in reality the subtrans link in
>> question has already been set.

> Not sure about that, investigating.

As a quick hack, I just hotwired RecoverPreparedTransactions to set
overwriteOK = true always, and with that and the SubTransSetParent
argument-order fix, HEAD passes the recovery tests.  Maybe we can
be smarter than that, but this might be a good short-term fix to get
the buildfarm green again.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 01:19, Andres Freund  wrote:
> On 2017-04-22 19:55:18 -0400, Tom Lane wrote:
>> Now that we've got consistent failure reports about the 009_twophase.pl
>> recovery test, I set out to find out why it's failing.  It looks to me
>> like the reason is that this (twophase.c:2145):
>>
>> SubTransSetParent(xid, subxid, overwriteOK);
>>
>> ought to be this:
>>
>> SubTransSetParent(subxid, xid, overwriteOK);
>>
>> because the definition of SubTransSetParent is
>>
>> void
>> SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK)
>>
>> not the other way 'round.
>>
>> While "git blame" blames this line on the recent commit 728bd991c,
>> that just moved the call from somewhere else.  AFAICS this has actually
>> been wrong since StandbyRecoverPreparedTransactions was written,
>> in 361bd1662 of 2010-04-13.
>
>> Also, when I fix that, it gets further but still crashes at the same
>> Assert in SubTransSetParent.  The proximate cause this time seems to be
>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>> it's computing that as "false", but in reality the subtrans link in
>> question has already been set.
>>
>
> Yikes.  This is clearly way undertested.  It's also pretty scary that
> the code has recently been whacked out quite heavily (both 9.6 and
> master), without hitting anything around this - doesn't seem to bode
> well for how in-depth the testing was.

Obviously if there is a bug it's because tests didn't find it and
therefore it is by definition undertested for that specific bug.

I'm not sure what other conclusion you wish to draw, if any?


>> It's not clear to me how much potential this has to create user data
>> corruption, but it doesn't look good at first glance.  Discuss.
>
> Hm. I think it can cause wrong tqual.c results in some edge cases.
> During HS, lastOverflowedXid will be set in some cases, and then
> XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in
> turn cause lookups snapshot->subxip (where all HS xids reside)
> potentially return wrong results.
>
> I was about to say that I don't see how it could result in persistent
> corruption however - the subtrans lookups are only done for
> (snapshot->xmin, snapshot->xmax] and subtrans is truncated
> regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory
> anymore, so that might be delayed.  Hm.

I've not found any reason, yet, to believe we return wrong answers in
any case, even though the transient data structure pg_subtrans is
corrupted by the switched call Tom discovers.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-23 Thread Simon Riggs
On 23 April 2017 at 00:55, Tom Lane  wrote:
> Now that we've got consistent failure reports about the 009_twophase.pl
> recovery test, I set out to find out why it's failing.  It looks to me
> like the reason is that this (twophase.c:2145):
>
> SubTransSetParent(xid, subxid, overwriteOK);
>
> ought to be this:
>
> SubTransSetParent(subxid, xid, overwriteOK);
>
> because the definition of SubTransSetParent is
>
> void
> SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK)
>
> not the other way 'round.
>
> While "git blame" blames this line on the recent commit 728bd991c,
> that just moved the call from somewhere else.  AFAICS this has actually
> been wrong since StandbyRecoverPreparedTransactions was written,
> in 361bd1662 of 2010-04-13.

Thanks for finding that.

> It's not clear to me how much potential this has to create user data
> corruption, but it doesn't look good at first glance.  Discuss.

Well, strangely for different reason I was looking at that particular
call a couple of days back. I didn't spot that issue, but I was
thinking why we even make that call at that time. My conclusion was
that it could be optimized away mostly, since it isn't needed very
often, but its not really worth optimizing.

SubTransSetParent() only matters for the case where we are half-way
through a commit with a large commit. Since we do atomic updates of
commit and subcommmit when on same page, this problem would only
matter when top level xid and other subxids were separated across
multiple pages and the issue would only affect a read only query
checking visibility during the commit for that prepared transaction.
Furthermore, the nature of the corruption is that we set the xid to
point to the subxid; since we never mark a top-level commit as
subcommitted, subtrans would never be consulted and so the corruption
would not lead to any incorrect answer even during the window of
exposure. So it looks to me like this error shouldn't cause a problem.

We can fix that, but...

> Also, when I fix that, it gets further but still crashes at the same
> Assert in SubTransSetParent.  The proximate cause this time seems to be
> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
> it's computing that as "false", but in reality the subtrans link in
> question has already been set.

Not sure about that, investigating.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

2017-04-22 Thread Andres Freund
On 2017-04-22 19:55:18 -0400, Tom Lane wrote:
> Now that we've got consistent failure reports about the 009_twophase.pl
> recovery test, I set out to find out why it's failing.  It looks to me
> like the reason is that this (twophase.c:2145):
>
> SubTransSetParent(xid, subxid, overwriteOK);
>
> ought to be this:
>
> SubTransSetParent(subxid, xid, overwriteOK);
>
> because the definition of SubTransSetParent is
>
> void
> SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK)
>
> not the other way 'round.
>
> While "git blame" blames this line on the recent commit 728bd991c,
> that just moved the call from somewhere else.  AFAICS this has actually
> been wrong since StandbyRecoverPreparedTransactions was written,
> in 361bd1662 of 2010-04-13.

> Also, when I fix that, it gets further but still crashes at the same
> Assert in SubTransSetParent.  The proximate cause this time seems to be
> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
> it's computing that as "false", but in reality the subtrans link in
> question has already been set.
>

Yikes.  This is clearly way undertested.  It's also pretty scary that
the code has recently been whacked out quite heavily (both 9.6 and
master), without hitting anything around this - doesn't seem to bode
well for how in-depth the testing was.


> It's not clear to me how much potential this has to create user data
> corruption, but it doesn't look good at first glance.  Discuss.

Hm. I think it can cause wrong tqual.c results in some edge cases.
During HS, lastOverflowedXid will be set in some cases, and then
XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in
turn cause lookups snapshot->subxip (where all HS xids reside)
potentially return wrong results.

I was about to say that I don't see how it could result in persistent
corruption however - the subtrans lookups are only done for
(snapshot->xmin, snapshot->xmax] and subtrans is truncated
regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory
anymore, so that might be delayed.  Hm.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers