Re: [HACKERS] Bug in two-phase transaction recovery

2016-09-09 Thread Simon Riggs
On 8 September 2016 at 11:18, Simon Riggs  wrote:
> On 8 September 2016 at 07:43, Michael Paquier  
> wrote:
>> On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich  
>> wrote:
>>> Some time ago two-phase state file format was changed to have variable size 
>>> GID,
>>> but several places that read that files were not updated to use new 
>>> offsets. Problem
>>> exists in master and 9.6 and can be reproduced on prepared transactions with
>>> savepoints.
>>
>> Oops and meh. This meritates an open item, and has better be fixed by
>> 9.6.0. I am glad you noticed that honestly. And we had better take
>> care of this issue as soon as possible.
>
> Looking now.

Committed, thanks.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in two-phase transaction recovery

2016-09-08 Thread Simon Riggs
On 8 September 2016 at 07:43, Michael Paquier  wrote:
> On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich  
> wrote:
>> Some time ago two-phase state file format was changed to have variable size 
>> GID,
>> but several places that read that files were not updated to use new offsets. 
>> Problem
>> exists in master and 9.6 and can be reproduced on prepared transactions with
>> savepoints.
>
> Oops and meh. This meritates an open item, and has better be fixed by
> 9.6.0. I am glad you noticed that honestly. And we had better take
> care of this issue as soon as possible.

Looking now.

>> Also while looking at StandbyRecoverPreparedTransactions() i’ve noticed that 
>> buffer
>> for 2pc file is allocated in TopMemoryContext but never freed. That probably 
>> exists
>> for a long time.
>
> @@ -1886,6 +1886,8 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
> Assert(TransactionIdFollows(subxid, xid));
> SubTransSetParent(xid, subxid, overwriteOK);
> }
> +
> +   pfree(buf);
> }
> This one is a good catch. I have checked also the other callers of
> ReadTwoPhaseFile but I am not seeing any other leak. That's a leak,
> not critical though so applying it only on HEAD would be enough IMO.

Far from critical, but backpatched to 9.6 because it isn't just
executed once at startup, it is executed every shutdown checkpoint.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in two-phase transaction recovery

2016-09-08 Thread Pavan Deolasee
On Thu, Sep 8, 2016 at 12:13 PM, Michael Paquier 
wrote:

> On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich 
> wrote:
> > Some time ago two-phase state file format was changed to have variable
> size GID,
> > but several places that read that files were not updated to use new
> offsets. Problem
> > exists in master and 9.6 and can be reproduced on prepared transactions
> with
> > savepoints.
>
> Oops and meh. This meritates an open item, and has better be fixed by
> 9.6.0. I am glad you noticed that honestly. And we had better take
> care of this issue as soon as possible.
>
>
Indeed, it's a bug. Thanks Stas for tracking it down and Michael for the
review and checking other places. I also looked at the patch and it seems
fine to me. FWIW I looked at all other places where TwoPhaseFileHeader is
referred and they look safe to me.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Bug in two-phase transaction recovery

2016-09-08 Thread Michael Paquier
On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich  wrote:
> Some time ago two-phase state file format was changed to have variable size 
> GID,
> but several places that read that files were not updated to use new offsets. 
> Problem
> exists in master and 9.6 and can be reproduced on prepared transactions with
> savepoints.

Oops and meh. This meritates an open item, and has better be fixed by
9.6.0. I am glad you noticed that honestly. And we had better take
care of this issue as soon as possible.

> For example:
>
> create table t(id int);
> begin;
> insert into t values (42);
> savepoint s1;
> insert into t values (43);
> prepare transaction 'x';
> begin;
> insert into t values (142);
> savepoint s1;
> insert into t values (143);
> prepare transaction 'y’;
>
> and restart the server. Startup process will hang. Fix attached.

In crash recovery this is very likely to fail with an assertion if
those are enabled:
TRAP: FailedAssertion("!(TransactionIdFollows(subxid, xid))", File:
"twophase.c", Line: 1767)
And the culprit is e0694cf9, which makes this open item owned by Simon.

I also had a look at the patch you are proposing, and I think that's a
correct fix. I also looked at the other code paths scanning the 2PC
state file and did not notice extra problems. The good news is that
the 2PC file generation is not impacted, only its scan at recovery is,
so the impact of the bug is limited for existing 9.6 deployments where
both 2PC and subtransactions are involved.

> Also while looking at StandbyRecoverPreparedTransactions() i’ve noticed that 
> buffer
> for 2pc file is allocated in TopMemoryContext but never freed. That probably 
> exists
> for a long time.

@@ -1886,6 +1886,8 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
Assert(TransactionIdFollows(subxid, xid));
SubTransSetParent(xid, subxid, overwriteOK);
}
+
+   pfree(buf);
}
This one is a good catch. I have checked also the other callers of
ReadTwoPhaseFile but I am not seeing any other leak. That's a leak,
not critical though so applying it only on HEAD would be enough IMO.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers