Re: [HACKERS] Reducing size of WAL record headers

2013-08-23 Thread Jim Nasby

On 1/10/13 6:14 PM, Simon Riggs wrote:
> On 10 January 2013 20:13, Tom Lane t...@sss.pgh.pa.us wrote:
>> Bruce Momjian br...@momjian.us writes:
>>> On Wed, Jan  9, 2013 at 05:06:49PM -0500, Tom Lane wrote:
>>>> Let's wait till we see where the logical rep stuff ends up before we
>>>> worry about saving 4 bytes per WAL record.
>>>
>>> Well, we have wal_level to control the amount of WAL traffic.
>>
>> That's entirely irrelevant.  The point here is that we'll need more bits
>> to identify what any particular record is, unless we make a decision
>> that we'll have physically separate streams for logical replication
>> info, which doesn't sound terribly attractive; and in any case no such
>> decision has been made yet, AFAIK.
>
> You were right to say that this is less important than logical
> replication. I don't need any more reason than that to stop talking
> about it.
>
> I have a patch for this, but as yet no way to submit it while at the
> same time saying put this at the back of the queue.

Anything ever come of this?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing size of WAL record headers

2013-01-10 Thread Bruce Momjian
On Wed, Jan  9, 2013 at 05:06:49PM -0500, Tom Lane wrote:
> Simon Riggs si...@2ndquadrant.com writes:
>> Overall, the WAL record is MAXALIGN'd, so with 8 byte alignment we
>> waste 4 bytes per record. Or put another way, if we could reduce
>> record header by 4 bytes, we would actually reduce it by 8 bytes per
>> record. So looking for ways to do that seems like a good idea.
>
> I think this is extremely premature, in view of the ongoing discussions
> about shoehorning logical replication and other kinds of data into the
> WAL stream.  It seems quite likely that we'll end up eating some of
> that padding space to support those features.  So whacking a lot of code
> around in service of squeezing the existing padding out could very
> easily end up being wasted work, in fact counterproductive if it
> degrades either code readability or robustness.
>
> Let's wait till we see where the logical rep stuff ends up before we
> worry about saving 4 bytes per WAL record.

Well, we have wal_level to control the amount of WAL traffic.  It is
hard to imagine we are going to want to ship logical WAL information by
default, so most people will not be using logical WAL and would see a
benefit from an optimized WAL stream?  

What percentage is 8-bytes in a typical WAL record?
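
A rough way to frame that question, as a sketch only: the helper below computes what share of a record an 8-byte saving represents. The name `sketch_saving_percent` and any record sizes passed to it are hypothetical; the thread gives no measured record sizes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Percentage of a record's total length that saved_bytes represents.
 * record_len is whatever size one assumes for a "typical" record;
 * the thread itself does not supply a figure.
 */
static double
sketch_saving_percent(size_t record_len, size_t saved_bytes)
{
    assert(record_len > 0);
    return 100.0 * (double) saved_bytes / (double) record_len;
}
```

For example, under the assumption of a 64-byte record, `sketch_saving_percent(64, 8)` gives 12.5, while for an assumed 200-byte record it gives 4.0; the benefit shrinks as records get larger.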

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Reducing size of WAL record headers

2013-01-10 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
> On Wed, Jan  9, 2013 at 05:06:49PM -0500, Tom Lane wrote:
>> Let's wait till we see where the logical rep stuff ends up before we
>> worry about saving 4 bytes per WAL record.
>
> Well, we have wal_level to control the amount of WAL traffic.

That's entirely irrelevant.  The point here is that we'll need more bits
to identify what any particular record is, unless we make a decision
that we'll have physically separate streams for logical replication
info, which doesn't sound terribly attractive; and in any case no such
decision has been made yet, AFAIK.

regards, tom lane




Re: [HACKERS] Reducing size of WAL record headers

2013-01-10 Thread Simon Riggs
On 10 January 2013 20:13, Tom Lane t...@sss.pgh.pa.us wrote:
> Bruce Momjian br...@momjian.us writes:
>> On Wed, Jan  9, 2013 at 05:06:49PM -0500, Tom Lane wrote:
>>> Let's wait till we see where the logical rep stuff ends up before we
>>> worry about saving 4 bytes per WAL record.
>>
>> Well, we have wal_level to control the amount of WAL traffic.
>
> That's entirely irrelevant.  The point here is that we'll need more bits
> to identify what any particular record is, unless we make a decision
> that we'll have physically separate streams for logical replication
> info, which doesn't sound terribly attractive; and in any case no such
> decision has been made yet, AFAIK.

You were right to say that this is less important than logical
replication. I don't need any more reason than that to stop talking
about it.

I have a patch for this, but as yet no way to submit it while at the
same time saying put this at the back of the queue.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Heikki Linnakangas

On 09.01.2013 22:36, Simon Riggs wrote:

> Overall, the WAL record is MAXALIGN'd, so with 8 byte alignment we
> waste 4 bytes per record. Or put another way, if we could reduce
> record header by 4 bytes, we would actually reduce it by 8 bytes per
> record. So looking for ways to do that seems like a good idea.


Agreed.
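
The "4 bytes saves 8" arithmetic can be sketched as follows. This is an illustration under stated assumptions, not PostgreSQL's code: `sketch_maxalign` is a hypothetical stand-in for MAXALIGN with 8-byte alignment, and the 28-byte raw header size is an assumption chosen to match the "waste 4 bytes" remark.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed maximum alignment; 8 on typical 64-bit platforms. */
#define SKETCH_MAX_ALIGN 8

/* Round len up to the next multiple of SKETCH_MAX_ALIGN. */
static size_t
sketch_maxalign(size_t len)
{
    /* The bit trick below only works for power-of-two alignments. */
    assert((SKETCH_MAX_ALIGN & (SKETCH_MAX_ALIGN - 1)) == 0);
    return (len + (SKETCH_MAX_ALIGN - 1)) & ~((size_t) (SKETCH_MAX_ALIGN - 1));
}

/* Padding bytes added when a header of raw_len bytes is aligned. */
static size_t
sketch_padding(size_t raw_len)
{
    return sketch_maxalign(raw_len) - raw_len;
}
```

With an assumed 28-byte header, `sketch_maxalign(28)` is 32 (4 bytes of padding). Trimming the header to 24 bytes gives `sketch_maxalign(24)` of 24, so the aligned size drops by 8 bytes even though only 4 were removed.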


> The WAL record header starts with xl_tot_len, a 4 byte field. There is
> also another field, xl_len. The difference is that xl_tot_len includes
> the header, xl_len and any backup blocks. Since the header is fixed,
> the only time xl_tot_len != SizeOfXLogRecord + xl_len is when we have
> backup blocks.
>
> We can re-arrange the record layout so that we remove xl_tot_len and
> add another (maxaligned) 4 byte field (--> 8 bytes) after the record
> header, xl_bkpblock_len that only exists if we have backup blocks.
> This will then save 8 bytes from every record that doesn't have backup
> blocks, and be the same as now with backup blocks.


Here's a better idea:

Let's keep xl_tot_len as it is, but move xl_len at the very end of the 
WAL record, after all the backup blocks. If there are no backup blocks, 
xl_len is omitted. Seems more robust to keep xl_tot_len, so that you 
require less math to figure out where one record ends and where the next 
one begins.
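
One possible reading of this layout, sketched under assumptions: the header keeps xl_tot_len, and xl_len is stored as the last 4 bytes of the record only when backup blocks are present. `SKETCH_HDR_SIZE`, `sketch_get_xl_len`, and the choice to count the trailing field inside xl_tot_len are all hypothetical; the thread does not specify these details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HDR_SIZE 24  /* hypothetical header size once xl_len is removed */

/*
 * Recover the rmgr data length of a serialized record.
 * rec points at the start of the record, tot_len is its xl_tot_len, and
 * has_bkp_blocks says whether backup blocks follow the rmgr data.
 */
static uint32_t
sketch_get_xl_len(const unsigned char *rec, uint32_t tot_len, bool has_bkp_blocks)
{
    uint32_t xl_len;

    assert(tot_len >= SKETCH_HDR_SIZE);
    if (!has_bkp_blocks)
        return tot_len - SKETCH_HDR_SIZE;   /* xl_len is implied, not stored */

    /* Assumption: xl_len occupies the final 4 bytes of the record. */
    assert(tot_len >= SKETCH_HDR_SIZE + sizeof(xl_len));
    memcpy(&xl_len, rec + tot_len - sizeof(xl_len), sizeof(xl_len));
    return xl_len;
}
```

Finding the next record still needs only xl_tot_len, which is the robustness point made above; xl_len costs an extra read only for records that carry backup blocks.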



> Forcing the XLogRecord header to be all on one page makes the format
> more robust and simplifies the code that copes with header wrapping.


-1 on that. That would essentially revert the changes I made earlier. 
The purpose of allowing the header to be wrapped was that you could 
easily calculate ahead of time exactly how much space a WAL record 
takes. My motivation for that was the XLogInsert scaling patch. Now, I 
admit I haven't had a chance to work further on that patch, so we're not 
gaining much from the format change at the moment. Nevertheless, I don't 
want us to get back to the situation that you sometimes need to add 
padding to the end of a WAL page.
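
To illustrate why "no padding at page end" makes the space calculation easy, here is a minimal sketch. The page size, the per-page header size, and the function name are assumptions for illustration; the real page header sizes and XLogInsert logic are not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 8192u  /* assumed WAL page size */
#define SKETCH_PAGE_HDR    24u  /* assumed header at the start of each page */

/*
 * Bytes of WAL consumed by a record of rec_len bytes that starts at byte
 * offset start_off within a page, when any part of the record (including
 * its header) may wrap onto following pages and no padding is inserted.
 */
static uint64_t
sketch_wal_bytes_consumed(uint32_t start_off, uint32_t rec_len)
{
    uint32_t left, remaining, payload, full_pages, tail;

    assert(start_off >= SKETCH_PAGE_HDR && start_off < SKETCH_PAGE_SIZE);

    left = SKETCH_PAGE_SIZE - start_off;    /* room left on the first page */
    if (rec_len <= left)
        return rec_len;                     /* fits without wrapping */

    remaining = rec_len - left;
    payload = SKETCH_PAGE_SIZE - SKETCH_PAGE_HDR;  /* usable bytes per page */
    full_pages = remaining / payload;
    tail = remaining % payload;

    /* Each page touched after the first also costs one page header. */
    return (uint64_t) left
        + (uint64_t) full_pages * SKETCH_PAGE_SIZE
        + (tail ? SKETCH_PAGE_HDR + tail : 0);
}
```

The result depends only on the start offset and the record length, so it can be computed before any bytes are copied. If the header had to stay on one page, the answer would also depend on whether padding was needed at the end of the current page.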


My suggestion above to keep xl_tot_len and remove xl_len from XLogRecord 
doesn't have a problem with crossing page boundaries.


- Heikki




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Bruce Momjian
On Wed, Jan  9, 2013 at 10:54:33PM +0200, Heikki Linnakangas wrote:
> On 09.01.2013 22:36, Simon Riggs wrote:
>> Overall, the WAL record is MAXALIGN'd, so with 8 byte alignment we
>> waste 4 bytes per record. Or put another way, if we could reduce
>> record header by 4 bytes, we would actually reduce it by 8 bytes per
>> record. So looking for ways to do that seems like a good idea.
>
> Agreed.
>
>> The WAL record header starts with xl_tot_len, a 4 byte field. There is
>> also another field, xl_len. The difference is that xl_tot_len includes
>> the header, xl_len and any backup blocks. Since the header is fixed,
>> the only time xl_tot_len != SizeOfXLogRecord + xl_len is when we have
>> backup blocks.
>>
>> We can re-arrange the record layout so that we remove xl_tot_len and
>> add another (maxaligned) 4 byte field (--> 8 bytes) after the record
>> header, xl_bkpblock_len that only exists if we have backup blocks.
>> This will then save 8 bytes from every record that doesn't have backup
>> blocks, and be the same as now with backup blocks.
>
> Here's a better idea:
>
> Let's keep xl_tot_len as it is, but move xl_len at the very end of
> the WAL record, after all the backup blocks. If there are no backup
> blocks, xl_len is omitted. Seems more robust to keep xl_tot_len, so
> that you require less math to figure out where one record ends and
> where the next one begins.

OK, crazy idea, but can we just record xl_len as a difference against
xl_tot_len, and shorten the xl_len field?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Heikki Linnakangas

On 09.01.2013 22:59, Bruce Momjian wrote:

> On Wed, Jan  9, 2013 at 10:54:33PM +0200, Heikki Linnakangas wrote:
>> On 09.01.2013 22:36, Simon Riggs wrote:
>>> The WAL record header starts with xl_tot_len, a 4 byte field. There is
>>> also another field, xl_len. The difference is that xl_tot_len includes
>>> the header, xl_len and any backup blocks. Since the header is fixed,
>>> the only time xl_tot_len != SizeOfXLogRecord + xl_len is when we have
>>> backup blocks.
>>>
>>> We can re-arrange the record layout so that we remove xl_tot_len and
>>> add another (maxaligned) 4 byte field (--> 8 bytes) after the record
>>> header, xl_bkpblock_len that only exists if we have backup blocks.
>>> This will then save 8 bytes from every record that doesn't have backup
>>> blocks, and be the same as now with backup blocks.
>>
>> Here's a better idea:
>>
>> Let's keep xl_tot_len as it is, but move xl_len at the very end of
>> the WAL record, after all the backup blocks. If there are no backup
>> blocks, xl_len is omitted. Seems more robust to keep xl_tot_len, so
>> that you require less math to figure out where one record ends and
>> where the next one begins.
>
> OK, crazy idea, but can we just record xl_len as a difference against
> xl_tot_len, and shorten the xl_len field?


Hmm, so it would essentially be the length of all the backup blocks.
Perhaps rename it to xl_bkpblk_len.


However, that would cap the total size of backup blocks to 64k. Which 
would not be enough with 32k BLCKSZ.
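
The 64k cap can be checked with a small sketch. It assumes the shortened field is an unsigned 16-bit integer and ignores any per-block headers; the function name and the block counts used with it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Would the combined size of nblocks full backup blocks of blcksz bytes
 * fit in a 16-bit length field? Per-block headers are ignored here.
 */
static bool
sketch_bkp_len_fits_uint16(uint32_t blcksz, uint32_t nblocks)
{
    uint64_t total = (uint64_t) blcksz * nblocks;

    return total <= UINT16_MAX;     /* UINT16_MAX is 65535 */
}
```

Under these assumptions, `sketch_bkp_len_fits_uint16(32768, 2)` is false because two 32k blocks total 65536 bytes, one more than a 16-bit field can hold, while a single 32k block still fits.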


- Heikki




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Simon Riggs
On 9 January 2013 21:02, Heikki Linnakangas hlinnakan...@vmware.com wrote:

>> OK, crazy idea, but can we just record xl_len as a difference against
>> xl_tot_len, and shorten the xl_len field?
>
> Hmm, so it would essentially be the length of all the backup blocks. Perhaps
> rename it to xl_bkpblk_len.
>
> However, that would cap the total size of backup blocks to 64k. Which would
> not be enough with 32k BLCKSZ.

Since that requires a recompile anyway, why not make XLogRecord
smaller only for 16k BLCKSZ or less?

Problem if we do that is that xl_len is used extensively in _redo
routines, so it's a much more invasive patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Simon Riggs
On 9 January 2013 20:54, Heikki Linnakangas hlinnakan...@vmware.com wrote:

> Here's a better idea:
>
> Let's keep xl_tot_len as it is, but move xl_len at the very end of the WAL
> record, after all the backup blocks. If there are no backup blocks, xl_len
> is omitted. Seems more robust to keep xl_tot_len, so that you require less
> math to figure out where one record ends and where the next one begins.

OK, I avoided tampering with xl_len because it's so widely used. Will look.

>> Forcing the XLogRecord header to be all on one page makes the format
>> more robust and simplifies the code that copes with header wrapping.
>
> -1 on that. That would essentially revert the changes I made earlier.

OK, idea dropped.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Bruce Momjian
On Wed, Jan  9, 2013 at 09:15:16PM +0000, Simon Riggs wrote:
> On 9 January 2013 21:02, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>
>>> OK, crazy idea, but can we just record xl_len as a difference against
>>> xl_tot_len, and shorten the xl_len field?
>>
>> Hmm, so it would essentially be the length of all the backup blocks. Perhaps
>> rename it to xl_bkpblk_len.
>>
>> However, that would cap the total size of backup blocks to 64k. Which would
>> not be enough with 32k BLCKSZ.
>
> Since that requires a recompile anyway, why not make XLogRecord
> smaller only for 16k BLCKSZ or less?
>
> Problem if we do that is that xl_len is used extensively in _redo
> routines, so it's a much more invasive patch.

I would just make it int16 on <=16k block size, and int32 on >16k
blocks.
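
A compile-time switch along these lines might look like the sketch below. The macro `SKETCH_BLCKSZ` and the type name `sketch_xl_len_t` are hypothetical, and unsigned fixed-width types are used here rather than the int16/int32 named above; this is not how PostgreSQL defines the field.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the compile-time block size; 8192 is assumed as the default. */
#ifndef SKETCH_BLCKSZ
#define SKETCH_BLCKSZ 8192
#endif

/* Pick the narrow length type only when the block size is 16k or less. */
#if SKETCH_BLCKSZ <= 16384
typedef uint16_t sketch_xl_len_t;
#else
typedef uint32_t sketch_xl_len_t;
#endif
```

Because the block size is already fixed at build time, the type choice costs nothing at run time, but every piece of code that touches the field has to use the typedef rather than a hard-coded width.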

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Reducing size of WAL record headers

2013-01-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
> Overall, the WAL record is MAXALIGN'd, so with 8 byte alignment we
> waste 4 bytes per record. Or put another way, if we could reduce
> record header by 4 bytes, we would actually reduce it by 8 bytes per
> record. So looking for ways to do that seems like a good idea.

I think this is extremely premature, in view of the ongoing discussions
about shoehorning logical replication and other kinds of data into the
WAL stream.  It seems quite likely that we'll end up eating some of
that padding space to support those features.  So whacking a lot of code
around in service of squeezing the existing padding out could very
easily end up being wasted work, in fact counterproductive if it
degrades either code readability or robustness.

Let's wait till we see where the logical rep stuff ends up before we
worry about saving 4 bytes per WAL record.

regards, tom lane

