Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Zeugswetter Andreas OSB sIT
You don't want to just modify pg_standby to accept small files, because then you've made it harder to make absolutely sure when the file is ready to be processed if a non-atomic copy is being done. It is hard, but I think it is the right way forward. Anyway I think the size is not robust

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Isn't this redundant given the existence of pglesslog? It does the same as pglesslog, but is simpler to use because it is automatic. Which also means that everyone pays the performance penalty whether they get

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: %p provides the path, %f the file name, and now %l the length. That makes an example archive
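
A minimal sketch of how an archive script might use the length parameter proposed above. It assumes a `%l` placeholder that the thread only proposes (existing `archive_command` substitution covers `%p` and `%f`); the script name, the argument order, the destination directory and the function name `archive_segment` are all hypothetical. The script compresses only the written part of the segment and renames the result into place once it is complete, so a reader never sees a partially copied file.

```python
import gzip
import os
import sys


def archive_segment(path: str, name: str, length: int, dest_dir: str) -> str:
    """Compress the first `length` bytes of a WAL segment into dest_dir."""
    final = os.path.join(dest_dir, name + ".gz")
    partial = final + ".tmp"
    with open(path, "rb") as src:
        data = src.read(length)  # skip the unused tail entirely
    if len(data) != length:
        raise ValueError("segment is shorter than the reported length")
    with gzip.open(partial, "wb") as dst:
        dst.write(data)
    # Rename only after the write completes; the final name appears atomically.
    os.replace(partial, final)
    return final


# Hypothetical invocation: archive_wal.py %p %f %l /mnt/archive
# (%l is the proposed length placeholder, not an existing one.)
if __name__ == "__main__" and len(sys.argv) == 5:
    wal_path, wal_name, wal_length, archive_dir = sys.argv[1:5]
    archive_segment(wal_path, wal_name, int(wal_length), archive_dir)
```

A restore script built on this idea would have to pad the decompressed data back to the full segment size before handing it to recovery.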

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Which also means that everyone pays the performance penalty whether they get any benefit or not. The point of the external solution is to do the work only in installations that get some benefit. We've been over this ground before... If

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: Hard to beat for performance. I thought

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
All that is useless until we get a %l in archive_command... *I* didn't see an easy way to get at the written size later on in the chain (i.e. in the actual archiving), so I took the path of least resistance. The reason *I* shy away from pg_lesslog and pg_clearxlogtail is that they seem to

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Simon Riggs
On Fri, 2009-01-09 at 09:31 -0500, Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Isn't this redundant given the existence of pglesslog? It does the same as pglesslog, but is simpler to use because it is automatic. Which also

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
* Simon Riggs si...@2ndquadrant.com [090109 11:33]: The patch as stands is IMHO not acceptable because the work to zero the file is performed by the unlucky backend that hits EOF on the current WAL file, which is bad enough, but it is also performed while holding WALWriteLock. Agreed,

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
Aidan Van Dyk ai...@highrise.ca 01/09/09 10:22 AM The reason *I* shy away from pg_lesslog and pg_clearxlogtail is that they seem to possibly be frail... I'm just scared of something changing in PG some time, and my pg_clearxlogtail not knowing, me forgetting to upgrade, and me not doing

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Richard Huxton
Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: Hard to beat for

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
* Richard Huxton d...@archonet.com [090109 12:22]: Yeah: the archiver process doesn't have that information available. Am I being really dim here - why isn't the first record in the WAL file a fixed-length record containing e.g. txid_start, time_start, txid_end, time_end, length? Write it

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Bruce Momjian
Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: Hard to beat for

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Richard Huxton
Aidan Van Dyk wrote: * Richard Huxton d...@archonet.com [090109 12:22]: Yeah: the archiver process doesn't have that information available. Am I being really dim here - why isn't the first record in the WAL file a fixed-length record containing e.g. txid_start, time_start, txid_end,

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: Yes, we could make the archiver do this, but I see no big advantage over having it done externally. It's not faster, safer, easier. Not easier because we would want a parameter to turn it off when not wanted. And the other question to ask is how much

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: AFAICS, file-at-a-time WAL shipping is a stopgap implementation that will be dead as a doornail once the current efforts towards realtime replication are finished. As long as there is a way to rsync log data to multiple targets not running replicas, with

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Simon Riggs
On Fri, 2009-01-09 at 13:22 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Yes, we could make the archiver do this, but I see no big advantage over having it done externally. It's not faster, safer, easier. Not easier because we would want a parameter to turn it off when

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Greg Smith
On Fri, 9 Jan 2009, Simon Riggs wrote: Half-filled WAL files were necessary to honour archive_timeout. With continuous streaming all WAL files will be 100% full before we switch, for most purposes. The main use case I'm concerned about losing support for is: 1) Two systems connected by a WAN

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Kevin Grittner
Greg Smith gsm...@gregsmith.com wrote: The main use case I'm concerned about losing support for is: 1) Two systems connected by a WAN with significant transmit latency 2) The secondary system runs a warm standby aimed at disaster recovery 3) Business requirements want the standby to never

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Greg Smith
On Fri, 9 Jan 2009, Aidan Van Dyk wrote: *I* didn't see an easy way to get at the written size later on in the chain (i.e. in the actual archiving), so I took the path of least resistance. I was hoping it might fall out of the other work being done in that area, given how much that code is

Re: [HACKERS] Improving compressibility of WAL files

2009-01-09 Thread Aidan Van Dyk
* Greg Smith gsm...@gregsmith.com [090109 18:39]: I was hoping it might fall out of the other work being done in that area, given how much that code is still being poked at right now. As Hannu pointed out, from a conceptual level you just need to carry along the same information that

[HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Bruce Momjian
The attached patch from Aidan Van Dyk zeros out the end of WAL files to improve their compressibility. (The patch was originally sent to 'general' which explains why it was lost until now.) Would someone please eyeball it? It is useful for compressing PITR logs even if we find a better
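
To illustrate why a zeroed tail compresses better, here is a minimal sketch; it is not the patch itself. It builds a fake segment whose unused tail either still holds stale bytes (as in a recycled file) or is zeroed, and compares the compressed sizes. The 16 MB segment size is the PostgreSQL default; the 1 MB of valid data, the random bytes standing in for WAL records, and the function names are assumptions for illustration.

```python
import os
import zlib

SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size
VALID_BYTES = 1 * 1024 * 1024    # assumed: segment switched after 1 MB was written


def make_segment(zero_tail: bool) -> bytes:
    """Build a fake segment: valid data followed by a stale or zeroed tail."""
    valid = os.urandom(VALID_BYTES)  # stand-in for real WAL records
    tail_len = SEGMENT_SIZE - VALID_BYTES
    # A recycled segment keeps old record data in its tail. Random bytes model
    # the worst case, in which that leftover data does not compress at all.
    tail = bytes(tail_len) if zero_tail else os.urandom(tail_len)
    return valid + tail


def compressed_size(segment: bytes) -> int:
    """Return the zlib-compressed size of the segment in bytes."""
    return len(zlib.compress(segment, 6))


if __name__ == "__main__":
    stale = compressed_size(make_segment(zero_tail=False))
    zeroed = compressed_size(make_segment(zero_tail=True))
    print(f"stale tail:  {stale} bytes compressed")
    print(f"zeroed tail: {zeroed} bytes compressed")
```

Real leftover WAL data compresses better than random bytes, so the gap on a production system would be smaller than this worst case suggests.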

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: The attached patch from Aidan Van Dyk zeros out the end of WAL files to improve their compressibility. (The patch was originally sent to 'general' which explains why it was lost until now.) Isn't this redundant given the existence of pglesslog?

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Aidan Van Dyk
* Bruce Momjian br...@momjian.us [090108 16:43]: The attached patch from Aidan Van Dyk zeros out the end of WAL files to improve their compressibility. (The patch was originally sent to 'general' which explains why it was lost until now.) Would someone please eyeball it? It is useful for

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Kevin Grittner
Aidan Van Dyk ai...@highrise.ca 01/08/09 5:02 PM *I* would really like this WAL zeroing... pg_clearxlogtail (in pgfoundry) does exactly the same zeroing of the tail as a filter. If you pipe through it on the way to gzip, there is no increase in disk I/O over a straight gzip, and often an

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Hannu Krosing
On Thu, 2009-01-08 at 18:02 -0500, Aidan Van Dyk wrote: * Bruce Momjian br...@momjian.us [090108 16:43]: The attached patch from Aidan Van Dyk zeros out the end of WAL files to improve their compressibility. (The patch was originally sent to 'general' which explains why it was lost until

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Hannu Krosing
On Fri, 2009-01-09 at 01:29 +0200, Hannu Krosing wrote: On Thu, 2009-01-08 at 18:02 -0500, Aidan Van Dyk wrote: ... There are possibly a few other ways to do it, such as zeroing the WAL on recycling (but not fsyncing it), and hopefully most of the zeros get trickled out by the OS before it

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Greg Smith
On Fri, 9 Jan 2009, Hannu Krosing wrote: won't it still be easier/less intrusive on inline core functionality and more flexible to just record end-of-valid-wal somewhere and then let the compressor discard the invalid part when compressing and recreate it with zeros on decompression? I
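
The suggestion quoted above (record the end of valid WAL, drop the rest when compressing, recreate it with zeros when decompressing) can be sketched roughly as follows. This is an illustration under stated assumptions, not an existing tool: the two-field header layout and the names `pack_segment` and `unpack_segment` are hypothetical, and the valid length is assumed to be supplied by the caller.

```python
import struct
import zlib

# Hypothetical header: (valid_length, total_length) as two unsigned 64-bit ints.
HEADER = struct.Struct("<QQ")


def pack_segment(segment: bytes, valid_length: int) -> bytes:
    """Compress only the valid prefix and remember the original size."""
    if not 0 <= valid_length <= len(segment):
        raise ValueError("valid_length is outside the segment")
    body = zlib.compress(segment[:valid_length])
    return HEADER.pack(valid_length, len(segment)) + body


def unpack_segment(packed: bytes) -> bytes:
    """Restore a full-size segment, padding the discarded tail with zeros."""
    valid_length, total_length = HEADER.unpack_from(packed)
    valid = zlib.decompress(packed[HEADER.size:])
    if len(valid) != valid_length:
        raise ValueError("corrupt archive: length mismatch")
    return valid + bytes(total_length - valid_length)
```

The restored file has the original size, but its tail holds zeros rather than the stale bytes that were discarded.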

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: The attached patch from Aidan Van Dyk zeros out the end of WAL files to improve their compressibility. (The patch was originally sent to 'general' which explains why it was lost until now.) Isn't this redundant given the existence of

Re: [HACKERS] Improving compressibility of WAL files

2009-01-08 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Isn't this redundant given the existence of pglesslog? It does the same as pglesslog, but is simpler to use because it is automatic. Which also means that everyone pays the performance penalty whether they get any benefit or not. The