Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-03-15 Thread Andres Freund
On 2015-03-02 18:45:18 +0100, Andres Freund wrote:
 On 2015-03-02 19:23:56 +0200, Heikki Linnakangas wrote:
  Pass the prepared XID as yet another argument to XactEmitCommitRecord, and
  have XactEmitCommitRecord emit the xl_xact_commit_prepared part of the
  record too. It might even make sense to handle the prepared XID like all the
  other optional fields and add an xinfo flag for it.
 
 That's what I mean with non simple. Not a fan of teaching xact.c even
 more about twophase's dealings than it already knows.

I made an effort to show how horrible it would look like. Turns out it's
not that bad ;). Far from pretty, especially in xlog.c, but
acceptable. I think treating them more similar actually might open the
road for reducing the difference further at some point.

The attached patch does pretty much what you suggested above and tries
to address all the point previously made. I plan to push this fairly
soon; unless somebody has further input.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 7557d0afa860c37a1992240b54e2bf3710296fc7 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Sun, 15 Mar 2015 14:42:09 +0100
Subject: [PATCH] Merge the various forms of transaction commit  abort
 records.

Since 465883b0a two versions of commit records have existed. A compact
version that was used when no cache invalidations, smgr unlinks and
similar were needed, and a full version that could deal with all
that. Additionally the full version was embedded into twophase commit
records.

That resulted in a measurable reduction in the size of the logged WAL in
some workloads. But more recently additions like logical decoding, which
e.g. needs information about the database something was executed on,
made it applicable in fewer situations. The static split generally made
it hard to expand the commit record, because concerns over the size made
it hard to add anything to the compact version.

Additionally it's not particularly pretty to have twophase.c insert
RM_XACT records.

Rejigger things so that the commit and abort records only have one form
each, including the twophase equivalents. The presence of the various
optional (in the sense of not being in every record) pieces is indicated
by a bits in the 'xinfo' flag.  That flag previously was not included in
compact commit records. To prevent an increase in size due to its
presence, it's only included if necessary; signalled by a bit in the
xl_info bits available for xact.c, similar to heapam.c's
XLOG_HEAP_OPMASK/XLOG_HEAP_INIT_PAGE.

Twophase commit/aborts are now the same as their normal
counterparts. The original transaction's xid is included in an optional
data field.

This means that commit records generally are smaller, except in the case
of a transaction with subtransactions, but no other special cases; the
increase there is four bytes, which seems acceptable given that the more
common case of not having subtransactions shrank.  The savings are
especially measurable for twophase commits, which previously always used
the full version; but will in practice only infrequently have required
that.

The motivation for this work are not the space savings and and
deduplication though; it's that it makes it easier to extend commit
records with additional information. That's just a few lines of code
now; without impacting the common case where that information is not
needed.

Discussion: 20150220152150.gd4...@awork2.anarazel.de,
235610.92468.qm%40web29004.mail.ird.yahoo.com

Reviewed-By: Heikki Linnakangas, Simon Riggs
---
 src/backend/access/rmgrdesc/xactdesc.c   | 245 +++-
 src/backend/access/transam/twophase.c|  59 +---
 src/backend/access/transam/xact.c| 476 +++
 src/backend/access/transam/xlog.c| 126 
 src/backend/replication/logical/decode.c | 133 +++--
 src/include/access/xact.h| 215 ++
 src/include/access/xlog_internal.h   |   2 +-
 7 files changed, 749 insertions(+), 507 deletions(-)

diff --git a/src/backend/access/rmgrdesc/xactdesc.c b/src/backend/access/rmgrdesc/xactdesc.c
index 3e87978..b036b6d 100644
--- a/src/backend/access/rmgrdesc/xactdesc.c
+++ b/src/backend/access/rmgrdesc/xactdesc.c
@@ -14,53 +14,188 @@
  */
 #include postgres.h
 
+#include access/transam.h
 #include access/xact.h
 #include catalog/catalog.h
 #include storage/sinval.h
 #include utils/timestamp.h
 
+/*
+ * Parse the WAL format of a xact commit and abort records into a easier to
+ * understand format.
+ *
+ * This routines are in xactdesc.c because they're accessed in backend (when
+ * replaying WAL) and frontend (pg_xlogdump) code. This file is the only xact
+ * specific one shared between both. They're complicated enough that
+ * duplication would be bothersome.
+ */
+
+void
+ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit 

Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-03-15 Thread Andres Freund
On 2015-03-15 16:30:16 +0100, Andres Freund wrote:
 The attached patch does pretty much what you suggested above and tries
 to address all the point previously made. I plan to push this fairly
 soon; unless somebody has further input.

Pushed, after fixing two typos in decode.c I introduced while cleaning
up. This might not yet be perfect, but there seems little point in
waiting further.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-03-02 Thread Andres Freund
On 2015-03-02 19:11:15 +0200, Heikki Linnakangas wrote:
 Come to think of it, it would be cleaner anyway to move the
 XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. Then
 those structs don't need to be static either.

That was my first thought as well - but it doesn't easily work because
of the way commit/abort records are embedded into twophase.c.  I
couldn't come with a simple way to change that, and anythign non simple
imo defeats the purpose.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-03-02 Thread Andres Freund
On 2015-02-25 12:10:42 +0100, Andres Freund wrote:
 On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
  Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you
  could pass an xl_xact_parsed/abort_commit struct to them, instead of the
  individual fields? You could then also avoid the static variables inside it,
  passing pointers to that struct to XLogRegisterData() instead.
 
 Hm, that's an idea. And rename it to xaxt_commit/abort_data?

So, I just tried this, and it doesn't really seem to come out as a net
positive. More code at the callsites and more complex code in the *Emit*
routines. It's impossible to use [FLEXIBLE_ARRAY_MEMBER] employing
datatypes in xact_commit_data while emitting because there obviously are
several chunks needing it. And avoid using it would make for a slightly
less clear format.

So unless you feel strongly about it, don't think so, I'll keep the
statics, even if they're not particularly pretty.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-03-02 Thread Heikki Linnakangas

On 03/02/2015 06:51 PM, Andres Freund wrote:

On 2015-02-25 12:10:42 +0100, Andres Freund wrote:

On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:

Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you
could pass an xl_xact_parsed/abort_commit struct to them, instead of the
individual fields? You could then also avoid the static variables inside it,
passing pointers to that struct to XLogRegisterData() instead.


Hm, that's an idea. And rename it to xaxt_commit/abort_data?


So, I just tried this, and it doesn't really seem to come out as a net
positive. More code at the callsites and more complex code in the *Emit*
routines. It's impossible to use [FLEXIBLE_ARRAY_MEMBER] employing
datatypes in xact_commit_data while emitting because there obviously are
several chunks needing it. And avoid using it would make for a slightly
less clear format.

So unless you feel strongly about it, don't think so, I'll keep the
statics, even if they're not particularly pretty.


Come to think of it, it would be cleaner anyway to move the 
XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. 
Then those structs don't need to be static either.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-03-02 Thread Heikki Linnakangas

On 03/02/2015 07:14 PM, Andres Freund wrote:

On 2015-03-02 19:11:15 +0200, Heikki Linnakangas wrote:

Come to think of it, it would be cleaner anyway to move the
XLogBeginInsert() and XLogInsert() calls inside XactEmitCommitRecord. Then
those structs don't need to be static either.


That was my first thought as well - but it doesn't easily work because
of the way commit/abort records are embedded into twophase.c.  I
couldn't come with a simple way to change that, and anythign non simple
imo defeats the purpose.


Pass the prepared XID as yet another argument to XactEmitCommitRecord, 
and have XactEmitCommitRecord emit the xl_xact_commit_prepared part of 
the record too. It might even make sense to handle the prepared XID like 
all the other optional fields and add an xinfo flag for it.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-28 Thread Robert Haas
On Fri, Feb 27, 2015 at 7:10 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 No, no. I meant that it is good the way your patch does it in
 xactdesc.c, where both frontend and backend can reach it.

Agreed, that seems much better than duplicating it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-27 Thread Andres Freund
On 2015-02-27 16:26:08 +0900, Michael Paquier wrote:
 On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
  On 02/20/2015 05:21 PM, Andres Freund wrote:
  There's one bit that I'm not so sure about though: To avoid duplication
  I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
  available both in front and backend code - so it's currently living in
  xactdesc.c. I think we can live with that, but it's certainly not
  pretty.
 
  Yeah, that's ugly. Why does frontend code need that? The old format
  isn't exactly trivial for frontend code to decode either.
 
  pg_xlogdump outputs subxacts and such; I don't forsee other
  usages. Sure, we could copy the code around, but I think that's worse
  than having it in xactdesc.c. Needs a comment explaining why it's there
  if I haven't added one already.
 
 FWIW, I think they would live better in frontend code for client applications.

What do you mean with that? You mean you'd rather see a copy in
pg_xlogdump somewhere? How would you trigger that being used instead of
the normal description routine?


 +/* free opcode 0x70 */
 +
 +#define XLOG_XACT_OPMASK   0x70
 There is a contradiction here, 0x70 is not free.

The above is a mask, not an opcode - so I don't think it's a
contraction. I'll expand on the commentary on the next version.

Thanks for having a look!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-27 Thread Michael Paquier
On Fri, Feb 27, 2015 at 6:49 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-02-27 16:26:08 +0900, Michael Paquier wrote:
 On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
  On 02/20/2015 05:21 PM, Andres Freund wrote:
  There's one bit that I'm not so sure about though: To avoid duplication
  I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
  available both in front and backend code - so it's currently living in
  xactdesc.c. I think we can live with that, but it's certainly not
  pretty.
 
  Yeah, that's ugly. Why does frontend code need that? The old format
  isn't exactly trivial for frontend code to decode either.
 
  pg_xlogdump outputs subxacts and such; I don't forsee other
  usages. Sure, we could copy the code around, but I think that's worse
  than having it in xactdesc.c. Needs a comment explaining why it's there
  if I haven't added one already.

 FWIW, I think they would live better in frontend code for client 
 applications.

 What do you mean with that? You mean you'd rather see a copy in
 pg_xlogdump somewhere? How would you trigger that being used instead of
 the normal description routine?

No, no. I meant that it is good the way your patch does it in
xactdesc.c, where both frontend and backend can reach it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-26 Thread Michael Paquier
On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
 On 02/20/2015 05:21 PM, Andres Freund wrote:
 There's one bit that I'm not so sure about though: To avoid duplication
 I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
 available both in front and backend code - so it's currently living in
 xactdesc.c. I think we can live with that, but it's certainly not
 pretty.

 Yeah, that's ugly. Why does frontend code need that? The old format
 isn't exactly trivial for frontend code to decode either.

 pg_xlogdump outputs subxacts and such; I don't forsee other
 usages. Sure, we could copy the code around, but I think that's worse
 than having it in xactdesc.c. Needs a comment explaining why it's there
 if I haven't added one already.

FWIW, I think they would live better in frontend code for client applications.

That's a nice patch. +1 for merging them. Here are a couple of comments:

+/* Parse the WAL format of a xact abort into a easier to understand format. */
+void
+ParseCommitRecord(uint8 info, xl_xact_commit *xlrec,
xl_xact_parsed_commit *parsed)
I think that you mean here of an xact commit, not abort.

+ * Emit, but don't insert, a abort record.
s/a abort/an abort/
XactEmitAbortRecord has some problems with tabs replaced by 4 spaces
at a couple of places.

+/* free opcode 0x70 */
+
+#define XLOG_XACT_OPMASK   0x70
There is a contradiction here, 0x70 is not free.

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-25 Thread Simon Riggs
On 24 February 2015 at 18:51, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 02/20/2015 05:21 PM, Andres Freund wrote:

 In the attached patch I've merged compact/noncompact code, made aborts
 use similar logic to avoid including useless bytes and used both for the
 2pc equivalents.


 +1 for this approach in general.

+1 also

The compact commit record idea wasn't useful enough to stick with,
especially in the light of logical rep, possible future extensions to
the commit record and the prevalence of subtransactions.

XactEmitCommitRecord and XactEmitAbortRecord are somewhat hard to
understand. More comments or clearer design please.

The record format itself needs more explanation than I see in the
patch. Might be worth documenting when and how those fields would be
populated also (my bad).

It might be useful to have the optional field descriptions be named
with a _opt suffix to make it clearer.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-25 Thread Andres Freund
Hi,

On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
 On 02/20/2015 05:21 PM, Andres Freund wrote:
 To avoid using more space in the compact case the 'xinfo' field
 indicating the presence of further data is only included when a byte in
 the xl_info flag is set (similar to what heap rmgr does). That means
 that transactions without subtransactions and other things are four
 bytes smaller than before, but ones with a subtransaction but no other
 complications 4 byte larger. database info, nsubxacts, nrels, nmsgs et
 al are also only included if required. I think that's a overall win,
 even disregarding wal_level = logical.
 
 xinfo is a bit underdocumented now.

Yea, the whole thing needs a bit more comments. I really wanted some
feedback before sinking even more time into this...

 The comment in xl_xact_commit should mention that it's a uint32. But
 actually, why is it 32 bits wide?

The first reason is that it's essentially just the old 'xinfo' field
moved to a different place ;). I think I'll add a xl_xact_flags or so,
and document it there.

 Only 6 bits are used, so it would in a single byte. I guess that's
 because of alignment: now that it's 32 bits wide, that ensures that
 all the other structs are 4 byte aligned. That should be
 documented. Alternatively, store them unaligned to save the few bytes
 and use memcpy.

But yes, I didn't think about changing it because of alignment. Given
that commit and abort records can be fairly large, it seems better to
waste three bytes in a few scenarios (not the minimal one!) than
allocate another large chunk of memory for the copied data.


Even in the current code there's some rather underdocumented assumptions
about alignment; namely that RelFileNode+, TransactionId+ always result
in a acceptable alignment for the data after them. That happens to be
the case, but is worth a comment and/or a MAXALIGN.

 There's one bit that I'm not so sure about though: To avoid duplication
 I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
 available both in front and backend code - so it's currently living in
 xactdesc.c. I think we can live with that, but it's certainly not
 pretty.
 
 Yeah, that's ugly. Why does frontend code need that? The old format
 isn't exactly trivial for frontend code to decode either.

pg_xlogdump outputs subxacts and such; I don't forsee other
usages. Sure, we could copy the code around, but I think that's worse
than having it in xactdesc.c. Needs a comment explaining why it's there
if I haven't added one already.

 Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you
 could pass an xl_xact_parsed/abort_commit struct to them, instead of the
 individual fields? You could then also avoid the static variables inside it,
 passing pointers to that struct to XLogRegisterData() instead.

Hm, that's an idea. And rename it to xaxt_commit/abort_data?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized

2015-02-24 Thread Heikki Linnakangas

On 02/20/2015 05:21 PM, Andres Freund wrote:

In the attached patch I've merged compact/noncompact code, made aborts
use similar logic to avoid including useless bytes and used both for the
2pc equivalents.


+1 for this approach in general.


To avoid using more space in the compact case the 'xinfo' field
indicating the presence of further data is only included when a byte in
the xl_info flag is set (similar to what heap rmgr does). That means
that transactions without subtransactions and other things are four
bytes smaller than before, but ones with a subtransaction but no other
complications 4 byte larger. database info, nsubxacts, nrels, nmsgs et
al are also only included if required. I think that's a overall win,
even disregarding wal_level = logical.


xinfo is a bit underdocumented now. The comment in xl_xact_commit should 
mention that it's a uint32. But actually, why is it 32 bits wide? Only 6 
bits are used, so it would in a single byte. I guess that's because of 
alignment: now that it's 32 bits wide, that ensures that all the other 
structs are 4 byte aligned. That should be documented. Alternatively, 
store them unaligned to save the few bytes and use memcpy.



There's one bit that I'm not so sure about though: To avoid duplication
I've added Parse(Commit/Abort)Record(), but unfortunately that has to be
available both in front and backend code - so it's currently living in
xactdesc.c. I think we can live with that, but it's certainly not
pretty.


Yeah, that's ugly. Why does frontend code need that? The old format 
isn't exactly trivial for frontend code to decode either.


Regarding XactEmitCommitRecord and XactEmitAbortRecord, I wonder if you 
could pass an xl_xact_parsed/abort_commit struct to them, instead of the 
individual fields? You could then also avoid the static variables inside 
it, passing pointers to that struct to XLogRegisterData() instead.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers