While working on cluster features (DTM, tsDTM) we noticed that postgres 2PC 
transactions are approximately two times slower than an ordinary commit on a 
workload with fast transactions: a few single-row updates followed by COMMIT or 
PREPARE/COMMIT. perf top showed that a lot of time is spent in the kernel on 
fopen/fclose, so it is worth trying to reduce file operations for 2PC transactions.

Currently 2PC in postgres does the following:
* on prepare, 2PC data (subxacts, commitrels, abortrels, invalmsgs) is saved to 
xlog and to a file, but the file is not fsynced
* on commit, the backend reads the data from the file
* if a checkpoint occurs before commit, the files are fsynced during the checkpoint
* in case of a crash, replay will move the data from xlog to files

In this patch I've changed this procedure to the following:
* on prepare, the backend writes data only to xlog and stores a pointer to the 
start of the xlog record
* if commit occurs before a checkpoint, the backend reads the data from xlog by 
this pointer
* on checkpoint, 2PC data is copied to files and fsynced
* if commit happens after a checkpoint, the backend reads the files
* in case of a crash, replay will move the data from xlog to files (as it was 
before the patch)
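
To make the two read paths easier to follow, here is a toy model of the 
procedure described above. It is only a sketch: the class, method and field 
names are hypothetical and do not correspond to anything in the patch or in 
the postgres source, the xlog is just a Python list, and fsync and crash 
replay are not modeled. It only shows when a file operation happens.

```python
# Toy model of the 2PC flow described in this mail. All names are
# hypothetical; this is not PostgreSQL code.

class TwoPhaseModel:
    def __init__(self):
        self.xlog = []       # append-only list standing in for the xlog
        self.files = {}      # gid -> state data, stands in for on-disk 2PC files
        self.prepared = {}   # gid -> position of the prepare record in xlog
        self.file_ops = 0    # number of file writes and reads performed

    def prepare(self, gid, data):
        # On prepare: write the state data to xlog only and remember
        # where the record starts.
        self.xlog.append(data)
        self.prepared[gid] = len(self.xlog) - 1

    def checkpoint(self):
        # On checkpoint: copy state of still-prepared transactions to files.
        for gid, pos in self.prepared.items():
            if gid not in self.files:
                self.files[gid] = self.xlog[pos]
                self.file_ops += 1

    def commit_prepared(self, gid):
        pos = self.prepared.pop(gid)
        if gid in self.files:
            # Commit after a checkpoint: read the state from the file.
            self.file_ops += 1
            return self.files.pop(gid)
        # Commit before a checkpoint: read the state from xlog by pointer.
        return self.xlog[pos]


model = TwoPhaseModel()

# Fast path: prepare and commit with no checkpoint in between.
model.prepare("tx1", b"state-1")
fast = model.commit_prepared("tx1")
ops_after_fast = model.file_ops

# Slow path: a checkpoint happens between prepare and commit.
model.prepare("tx2", b"state-2")
model.checkpoint()
slow = model.commit_prepared("tx2")
ops_after_slow = model.file_ops
```

In this model the fast path touches no files at all, which is the case the 
benchmark below exercises; only transactions that stay prepared across a 
checkpoint pay for a file write and a file read.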

Most of these ideas were already mentioned in a 2009 thread by Michael Paquier, 
where he suggested storing 2PC data in shared memory. 
At that time the patch was declined because no significant speedup was observed. 
Now I see a performance improvement of about 60% with my patch. Probably the 
overall tps of the old benchmark was lower, so it was harder to hit the 
filesystem fopen/fclose limits.

The benchmark results are as follows (dual 6-core Xeon server):

Current master without 2PC: ~42 ktps
Current master with 2PC: ~22 ktps
Patched master with 2PC: ~36 ktps
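
For reference, the ratios behind the "about 60%" figure can be recomputed 
from the three numbers above. This assumes the ~36 ktps result is the one 
measured with the patch applied; the variable names are only illustrative.

```python
# Throughput figures from the benchmark above, in ktps.
no_2pc = 42.0        # master, ordinary commit
master_2pc = 22.0    # master, PREPARE/COMMIT PREPARED
patched_2pc = 36.0   # assumed: with the patch, PREPARE/COMMIT PREPARED

# How much slower 2PC is than an ordinary commit on master.
slowdown_master = no_2pc / master_2pc

# Relative improvement of 2PC throughput with the patch.
speedup = patched_2pc / master_2pc - 1

# Remaining gap between patched 2PC and an ordinary commit.
remaining_gap = 1 - patched_2pc / no_2pc

print(f"master slowdown: {slowdown_master:.2f}x")   # 1.91x
print(f"patch speedup:   {speedup:.0%}")            # 64%
print(f"remaining gap:   {remaining_gap:.0%}")      # 14%
```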

The benchmark was done with the following pgbench script:

\set naccounts 100000 * :scale
\setrandom from_aid 1 :naccounts
\setrandom to_aid 1 :naccounts
\setrandom delta 1 100
\set scale :scale+1
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance - :delta WHERE aid = :from_aid;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :to_aid;
PREPARE TRANSACTION ':client_id.:scale';
COMMIT PREPARED ':client_id.:scale';

Attachment: 2pc_xlog.diff

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
