On Tue, Jan 26, 2016 at 7:43 AM, Stas Kelvich wrote:
> Hi,
>
> Thanks for reviews and commit!
I apologize for being clueless here, but was this patch committed?
It's still marked as "Needs Review" in the CommitFest application.
--
Robert Haas
EnterpriseDB:
On Fri, Apr 8, 2016 at 8:49 AM, Jesper Pedersen
wrote:
> On 04/07/2016 02:29 AM, Michael Paquier wrote:
>> So recovery is conflicting here. My guess is that this patch is
>> missing some lock cleanup.
>>
>> With the test case attached in my case the COMMIT PREPARED
> On 08 Apr 2016, at 16:36, Simon Riggs wrote:
>
> On 7 April 2016 at 07:29, Michael Paquier wrote:
>
> With the test case attached in my case the COMMIT PREPARED record does
> not even get replayed.
>
> I was surprised to see this in the
On 7 April 2016 at 07:29, Michael Paquier wrote:
With the test case attached in my case the COMMIT PREPARED record does
> not even get replayed.
>
I was surprised to see this in the test...
sleep 2; # wait for changes to arrive on slave
I think the test framework
> On 07 Apr 2016, at 09:29, Michael Paquier wrote:
>
> relOid=16385) + 358 at standby.c:627
>frame #11: 0x0001023dac6b
> postgres`standby_redo(record=0x7fde50841e38) + 251 at
> standby.c:809
>
> (LOCALLOCK) $1 = {
> tag = {
>lock = {
>
On 04/07/2016 02:29 AM, Michael Paquier wrote:
So recovery is conflicting here. My guess is that this patch is
missing some lock cleanup.
With the test case attached in my case the COMMIT PREPARED record does
not even get replayed.
Should we create an entry for the open item list [0] for
On Wed, Apr 6, 2016 at 6:47 PM, Stas Kelvich wrote:
>> On Apr 2, 2016, at 3:14 AM, Michael Paquier
>> wrote:
>>
>> On Fri, Apr 1, 2016 at 10:53 PM, Stas Kelvich
>> wrote:
>>> I wrote:
While testing the patch,
> On Apr 2, 2016, at 3:14 AM, Michael Paquier wrote:
>
> On Fri, Apr 1, 2016 at 10:53 PM, Stas Kelvich
> wrote:
>> I wrote:
>>> While testing the patch, I found a bug in the recovery conflict code
>>> path. You can do the following to
On Fri, Apr 1, 2016 at 10:53 PM, Stas Kelvich wrote:
> I wrote:
>> While testing the patch, I found a bug in the recovery conflict code
>> path. You can do the following to reproduce it:
>> 1) Start a master with a standby
>> 2) prepare a transaction on master
>> 3) Stop
> On Apr 1, 2016, at 10:04 AM, Michael Paquier
> wrote:
>
> I would suggest the following name modifications, node names have been
> introduced to help tracking of each node's log:
> - Candie => master
> - Django => slave or just standby
> There is no need for
On Wed, Mar 30, 2016 at 10:19 PM, Stas Kelvich wrote:
> Hm, it’s hard to create descriptive names because test changes master/slave
> roles for that nodes several times during test.
Really? the names used in the patch help less then.
> It’s possible to call them
>
On 03/30/2016 09:19 AM, Stas Kelvich wrote:
> +++ b/src/test/recovery/t/006_twophase.pl
> @@ -0,0 +1,226 @@
> +# Checks for recovery_min_apply_delay
> +use strict;
> This description is wrong, this file has been copied from 005.
Yep, done.
>
> +my $node_master = get_new_node("Candie");
> +my
On Mar 29, 2016, at 6:04 PM, David Steele wrote:It looks like you should post a new patch or respond to Michael's comments. Marked as "waiting on author".Yep, here it is.On Mar 22, 2016, at 4:20 PM, Michael Paquier wrote:Looking at this
Hi Stas,
On 3/22/16 9:20 AM, Michael Paquier wrote:
Not taking ProcArrayLock here?
The comment at the top of XlogReadTwoPhaseData is incorrect.
RecoverPreparedFromXLOG and RecoverPreparedFromFiles have a lot of
code in common, having this duplication is not good, and you could
simplify your
On Tue, Mar 22, 2016 at 1:53 AM, Jesper Pedersen
wrote:
> On 03/18/2016 12:50 PM, Stas Kelvich wrote:
>>>
>>> On 11 Mar 2016, at 19:41, Jesper Pedersen
>>> wrote:
>>>
>>
>> Thanks for review, Jesper.
>>
>>> Some comments:
>>>
>>> * The
On 03/18/2016 12:50 PM, Stas Kelvich wrote:
On 11 Mar 2016, at 19:41, Jesper Pedersen wrote:
Thanks for review, Jesper.
Some comments:
* The patch needs a rebase against the latest TwoPhaseFileHeader change
Done.
* Rework the check.sh script into a TAP test
On 1/26/16 3:39 PM, Stas Kelvich wrote:
Agree, I had the same idea in my mind when was writing that script.
I will migrate it to TAP suite and write a review for Michael Paquier's patch.
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 26 Jan
> On 11 Mar 2016, at 19:41, Jesper Pedersen wrote:
>
Thanks for review, Jesper.
> Some comments:
>
> * The patch needs a rebase against the latest TwoPhaseFileHeader change
Done.
> * Rework the check.sh script into a TAP test case (src/test/recovery), as
>
On 01/26/2016 07:43 AM, Stas Kelvich wrote:
Thanks for reviews and commit!
As Simon and Andres already mentioned in this thread replay of twophase
transaction is significantly slower then the same operations in normal mode.
Major reason is that each state file is fsynced during replay and
Stas Kelvich wrote:
> While this patch touches quite sensible part of postgres replay and there is
> some rarely used code paths, I wrote shell script to setup master/slave
> replication and test different failure scenarios that can happened with
> instances. Attaching this file to show test
Agree, I had the same idea in my mind when was writing that script.
I will migrate it to TAP suite and write a review for Michael Paquier's patch.
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> On 26 Jan 2016, at 20:20, Alvaro Herrera
On Wed, Jan 27, 2016 at 5:39 AM, Stas Kelvich wrote:
> Agree, I had the same idea in my mind when was writing that script.
> I will migrate it to TAP suite and write a review for Michael Paquier's patch.
Yeah, please! And you have won a free-hug coupon that I can give
Hi,
Thanks for reviews and commit!
As Simon and Andres already mentioned in this thread replay of twophase
transaction is significantly slower then the same operations in normal mode.
Major reason is that each state file is fsynced during replay and while it is
not a problem for recovery,
On 12 January 2016 at 06:35, Michael Paquier
wrote:
> On Tue, Jan 12, 2016 at 4:57 AM, Simon Riggs
> wrote:
> > Performance tests for me show that the patch is effective; my results
> match
> > Jesper's roughly in relative numbers.
> >
> > My
On 12 January 2016 at 06:41, Michael Paquier
wrote:
>
> + if (log_checkpoints && n > 0)
> + ereport(LOG,
> + (errmsg("%u two-phase state files were
> written "
> + "for
On Tue, Jan 12, 2016 at 5:21 PM, Simon Riggs wrote:
> Should we just move the code somewhere just to imply it is generic? Seems
> pointless refactoring to me.
Er, why not xlogutils.c? Having the 2PC code depending directly on
something that is within logicalfuncs.c is
Michael Paquier wrote:
> On Tue, Jan 12, 2016 at 5:21 PM, Simon Riggs wrote:
> > Should we just move the code somewhere just to imply it is generic? Seems
> > pointless refactoring to me.
>
> Er, why not xlogutils.c? Having the 2PC code depending directly on
> something
Hi,
On 2016-01-11 19:39:14 +, Simon Riggs wrote:
> Currently, the patch reuses all of the code related to reading/write state
> files, so it is the minimal patch that can implement the important things
> for performance. The current patch succeeds in its goal to improve
> performance, so I
On Tue, Jan 12, 2016 at 5:26 PM, Simon Riggs wrote:
> On 12 January 2016 at 06:41, Michael Paquier
> wrote:
>>
>>
>> + if (log_checkpoints && n > 0)
>> + ereport(LOG,
>> + (errmsg("%u two-phase
On 12 January 2016 at 18:14, Andres Freund wrote:
> Hi,
Thank you for the additional review.
> On 2016-01-11 19:39:14 +, Simon Riggs wrote:
> > Currently, the patch reuses all of the code related to reading/write
> state
> > files, so it is the minimal patch that can
On 12 January 2016 at 12:53, Michael Paquier
wrote:
> On Tue, Jan 12, 2016 at 5:21 PM, Simon Riggs
> wrote:
> > Should we just move the code somewhere just to imply it is generic? Seems
> > pointless refactoring to me.
>
> Er, why not
My +1 for moving function to xlogutils.c too.
Now call to this function goes through series of callbacks so it is hard to
find it.
Personally I found it only after I have implemented same function by myself
(based on code in pg_xlogdump).
Stas Kelvich
Postgres Professional:
>
> On 11 Jan 2016, at 21:40, Jesper Pedersen wrote:
>
> I have done a run with the patch and it looks really great.
>
> Attached is the TPS graph - with a 1pc run too - and the perf profile as a
> flame graph (28C/56T w/ 256Gb mem, 2 x RAID10 SSD).
>
Thanks for
Hi,
On January 11, 2016 10:46:01 PM GMT+01:00, Simon Riggs
wrote:
>On 11 January 2016 at 20:10, Andres Freund wrote:
>
>> On January 11, 2016 8:57:58 PM GMT+01:00, Simon Riggs
>> wrote:
>> >On 11 January 2016 at 18:43, Simon
On 11 January 2016 at 20:10, Andres Freund wrote:
> On January 11, 2016 8:57:58 PM GMT+01:00, Simon Riggs
> wrote:
> >On 11 January 2016 at 18:43, Simon Riggs wrote:
>
> >It's clear there are various additional tuning
On 11 January 2016 at 22:24, Andres Freund wrote:
> Please wait till at least tomorrow evening, so I can have a meaningful
> look.
>
No problem, make sure to look at 2pc_optimize.v4.patch
--
Simon Riggshttp://www.2ndQuadrant.com/
On 11 January 2016 at 23:11, Stas Kelvich wrote:
> >
>
> On 11 Jan 2016, at 21:43, Simon Riggs wrote:
> >
> > Have you measured lwlocking as a problem?
> >
>
> Yes. GXACT locks that wasn’t even in perf top 10 on dual Xeon moves to the
> first
On Tue, Jan 12, 2016 at 4:57 AM, Simon Riggs wrote:
> Performance tests for me show that the patch is effective; my results match
> Jesper's roughly in relative numbers.
>
> My robustness review is that the approach and implementation are safe.
>
> It's clear there are
On Tue, Jan 12, 2016 at 3:35 PM, Michael Paquier
wrote:
> On Tue, Jan 12, 2016 at 4:57 AM, Simon Riggs wrote:
>> Performance tests for me show that the patch is effective; my results match
>> Jesper's roughly in relative numbers.
>>
>> My
> On 10 Jan 2016, at 12:15, Simon Riggs wrote:
>
> So we've only optimized half the usage? We're still going to cause
> replication delays.
Yes, replica will go through old procedures of moving data to and from file.
> We can either
>
> 1) Skip fsyncing the
On 11 January 2016 at 12:58, Stas Kelvich wrote:
>
> > On 10 Jan 2016, at 12:15, Simon Riggs wrote:
> >
> > So we've only optimized half the usage? We're still going to cause
> replication delays.
>
> Yes, replica will go through old procedures
Hi,
On 2016-01-09 15:29:11 +, Simon Riggs wrote:
> Hmm, I was just preparing this for commit.
Just read downthread that you want to commit this soon. Please hold of
for a while, this doesn't really look ready to me. I don't have time for
a real review right now, but I'll try to get to it
On 2016-01-11 20:03:18 +0100, Andres Freund wrote:
> More generally, I'm doubtful that the approach of reading data from WAL
> as proposed here is a very good idea. It seems better to "just" dump the
> entire 2pc state into *one* file at checkpoint time.
Or better: After determining the
On 11 January 2016 at 19:07, Andres Freund wrote:
> On 2016-01-11 20:03:18 +0100, Andres Freund wrote:
> > More generally, I'm doubtful that the approach of reading data from WAL
> > as proposed here is a very good idea. It seems better to "just" dump the
> > entire 2pc state
On 11 January 2016 at 18:51, Jesper Pedersen
wrote:
> On 01/10/2016 04:15 AM, Simon Riggs wrote:
>
>> One concern that come into my mind while reading updated
>>> patch is about creating extra bool field in GlobalTransactionData
>>> structure. While this improves
On 2016-01-11 19:15:23 +, Simon Riggs wrote:
> On 11 January 2016 at 19:03, Andres Freund wrote:
>
> > Hi,
> >
> > On 2016-01-09 15:29:11 +, Simon Riggs wrote:
> > > Hmm, I was just preparing this for commit.
> >
> > Just read downthread that you want to commit this
On 11 January 2016 at 18:43, Simon Riggs wrote:
> I'm looking to commit what we have now.
>
Here is the patch in its "final" state after my minor additions, edits and
review.
Performance tests for me show that the patch is effective; my results match
Jesper's roughly in
On January 11, 2016 8:57:58 PM GMT+01:00, Simon Riggs
wrote:
>On 11 January 2016 at 18:43, Simon Riggs wrote:
>It's clear there are various additional tuning opportunities, but the
>objective of the current patch to improve performance is very,
On 11 January 2016 at 19:03, Andres Freund wrote:
> Hi,
>
> On 2016-01-09 15:29:11 +, Simon Riggs wrote:
> > Hmm, I was just preparing this for commit.
>
> Just read downthread that you want to commit this soon. Please hold of
> for a while, this doesn't really look ready
On 9 January 2016 at 20:28, Stas Kelvich wrote:
> Thanks a lot for your edits, now that patch is much more cleaner.
>
> > Your comments say
> >
> > "In case of crash replay will move data from xlog to files, if that
> hasn't happened before."
> >
> > but I don't see
On 9 January 2016 at 12:26, Stas Kelvich wrote:
> I’ve updated patch and wrote description of thighs that happens
> with 2PC state data in the beginning of the file. I think now this patch
> is well documented,
> but if somebody points me to places that probably
Hi.
I’ve updated patch and wrote description of thighs that happens
with 2PC state data in the beginning of the file. I think now this patch is
well documented,
but if somebody points me to places that probably requires more detailed
description I’m ready
to extend that.
2pc_xlog.diff
On 9 January 2016 at 12:26, Stas Kelvich wrote:
> I’ve updated patch and wrote description of thighs that happens
> with 2PC state data in the beginning of the file. I think now this patch
> is well documented,
> but if somebody points me to places that probably
Simon Riggs wrote:
> I think we could do better still, but this looks like the easiest 80% and
> actually removes code.
>
> The lack of substantial comments on the patch is a problem though - the
> details above should go in the patch. I'll have a go at reworking this for
> you, this time.
Is
On 9 December 2015 at 18:44, Stas Kelvich wrote:
> In this patch I’ve changed this procedures to following:
> * on prepare backend writes data only to xlog and store pointer to the
> start of the xlog record
> * if commit occurs before checkpoint then backend reads
Michael, Jeff thanks for reviewing and testing.
> On 10 Dec 2015, at 02:16, Michael Paquier wrote:
>
> This has better be InvalidXLogRecPtr if unused.
Yes, that’s better. Changed.
> On 10 Dec 2015, at 02:16, Michael Paquier wrote:
> +
Hello.
While working with cluster stuff (DTM, tsDTM) we noted that postgres 2pc
transactions is approximately two times slower than an ordinary commit on
workload with fast transactions — few single-row updates and COMMIT or
PREPARE/COMMIT. Perf top showed that a lot of time is spent in kernel
On Wed, Dec 9, 2015 at 12:44 PM, Stas Kelvich wrote:
> Now 2PC in postgres does following:
> * on prepare 2pc data (subxacts, commitrels, abortrels, invalmsgs) saved to
> xlog and to file, but file not is not fsynced
> * on commit backend reads data from file
> * if
Thanks, Kevin.
> I assume that last one should have been *Patched master with 2PC”?
Yes, this list should look like this:
Current master without 2PC: ~42 ktps
Current master with 2PC: ~22 ktps
Patched master with 2PC: ~36 ktps
And created CommitFest entry for this patch.
--
Stas Kelvich
On Thu, Dec 10, 2015 at 3:44 AM, Stas Kelvich wrote:
> Most of that ideas was already mentioned in 2009 thread by Michael Paquier
> http://www.postgresql.org/message-id/c64c5f8b0908062031k3ff48428j824a9a46f2818...@mail.gmail.com
> where he suggested to store 2pc data
On Wed, Dec 9, 2015 at 10:44 AM, Stas Kelvich wrote:
> Hello.
>
> While working with cluster stuff (DTM, tsDTM) we noted that postgres 2pc
> transactions is approximately two times slower than an ordinary commit on
> workload with fast transactions — few single-row
101 - 161 of 161 matches
Mail list logo