On Tue, 2007-04-10 at 16:23 +0900, Koichi Suzuki wrote:
Here're two patches for
1) lesslog_core.patch, patch for core, to set a mark to the log record
to be removed in archiving,
2) lesslog_contrib.patch, patch for contrib/lesslog, pg_compresslog and
pg_decompresslog,
respectively,
I really appreciate the modification.
I also believe XLOG_NOOP is a cool way to keep the XLOG format consistent.
I'll continue to write code to produce incremental log records from
the full page writes as well as to maintain CRC, XLOG_NOOP and
other XLOG locations. I also found that you've
Koichi Suzuki [EMAIL PROTECTED] writes:
As replied to Tom's patch queue triage, here's a simplified patch to
mark WAL records as compressible, with no increase in WAL itself.
Compression/decompression commands will be posted separately to
pgFoundry for further review.
Applied with some minor
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.
---
Hi,
As replied to Tom's patch queue triage, here's a simplified patch to
mark WAL records as compressible, with no increase in WAL itself.
Compression/decompression commands will be posted separately to
pgFoundry for further review.
---
As suggested by Tom, I agree
Josh,
Josh Berkus wrote:
Koichi, Andreas,
1) To deal with partial/inconsistent write to the data file at crash
recovery, we need full page writes at the first modification to pages
after each checkpoint. It consumes much of WAL space.
We need to find a way around this someday. Other DBs
1) To deal with partial/inconsistent write to the data file at crash
recovery, we need full page writes at the first modification to pages
after each checkpoint. It consumes much of WAL space.
We need to find a way around this someday. Other DBs don't
do this; it may be because
On Wed, Apr 25, 2007 at 10:00:16AM +0200, Zeugswetter Andreas ADI SD wrote:
1) To deal with partial/inconsistent write to the data file at crash
recovery, we need full page writes at the first modification to pages
after each checkpoint. It consumes much of WAL space.
We need
Andreas,
Writing to a different area was considered in pg, but there were more
negative issues than positive.
So imho pg_compresslog is the correct path forward. The current
discussion is only about whether we want a more complex pg_compresslog
and no change to current WAL, or an increased
Josh Berkus [EMAIL PROTECTED] writes:
Andreas,
So imho pg_compresslog is the correct path forward. The current
discussion is only about whether we want a more complex pg_compresslog
and no change to current WAL, or an increased WAL size for a less
complex implementation.
Both would be able
Hi,
Zeugswetter Andreas ADI SD wrote:
I don't insist on the name or the default of the GUC parameter.
I'm afraid wal_fullpage_optimization = on (default) causes
some confusion because the default behavior of WAL itself
becomes a bit different.
Seems my wal_fullpage_optimization is not a good
3) To maintain crash recovery chance and reduce the amount of
archive log, removal of unnecessary full page writes from
archive logs is a good choice.
Definitely, yes. pg_compresslog could even move the full pages written
during backup out of WAL and put them in a different file that needs
Koichi, Andreas,
1) To deal with partial/inconsistent write to the data file at crash
recovery, we need full page writes at the first modification to pages
after each checkpoint. It consumes much of WAL space.
We need to find a way around this someday. Other DBs don't do this; it may be
Josh Berkus [EMAIL PROTECTED] writes:
Well, as a PG hacker I find the name wal_fullpage_optimization quite
baffling and I think our general user base will find it even more so.
Now that I have Koichi's explanation of the problem, I vote for simply
slaving this to the PITR settings and not
I don't insist on the name or the default of the GUC parameter.
I'm afraid wal_fullpage_optimization = on (default) causes
some confusion because the default behavior of WAL itself
becomes a bit different.
Seems my wal_fullpage_optimization is not a good name if it caused
misinterpretation
Hackers,
Writing lots of additional code simply to remove a parameter that
*might* be mis-interpreted doesn't sound useful to me, especially when
bugs may leak in that way. My take is that this is simple and useful
*and* we have it now; other ways don't yet exist, nor will they in time
for
Hi,
Sorry, because of so many comments/questions, I'll write inline
Josh Berkus wrote:
Hackers,
Writing lots of additional code simply to remove a parameter that
*might* be mis-interpreted doesn't sound useful to me, especially when
bugs may leak in that way. My take is that this is
Hi,
I don't insist on the name or the default of the GUC parameter. I'm
afraid wal_fullpage_optimization = on (default) causes some confusion
because the default behavior of WAL itself becomes a bit different.
I'd like to have some more opinion on this.
Zeugswetter Andreas ADI SD wrote:
With
Sorry I was very late to find this.
With DBT-2 benchmark, I've already compared the amount of WAL. The
result was as follows:
Amount of WAL after 60min. run of DBT-2 benchmark
wal_add_optimization_info = off (default) 3.13GB
wal_add_optimization_info = on (new case) 3.17GB - can be
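To put the two DBT-2 figures above in proportion, here is a minimal sketch that computes the relative increase in WAL volume. Only the two sizes come from the message; the variable names are illustrative.

```python
# WAL volume after a 60-minute DBT-2 run, as reported in the message above.
wal_off_gb = 3.13  # wal_add_optimization_info = off (default)
wal_on_gb = 3.17   # wal_add_optimization_info = on (new case)

# Relative increase caused by writing the extra optimization info.
increase_pct = (wal_on_gb - wal_off_gb) / wal_off_gb * 100

print(f"WAL increase: {increase_pct:.2f}%")
```

For this run the increase comes out at roughly 1.3 percent, which is smaller than the "five percent or so" estimated elsewhere in the thread.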
With DBT-2 benchmark, I've already compared the amount of WAL. The
result was as follows:
Amount of WAL after 60min. run of DBT-2 benchmark
wal_add_optimization_info = off (default) 3.13GB
how about wal_fullpage_optimization = on (default)
wal_add_optimization_info = on (new case)
Hi,
I agree that pg_compresslog should be aware of all the WAL records'
details so that it can optimize archive log safely. In my patch, I've
examined 8.2's WAL records to make pg_compresslog/pg_decompresslog safe.
Also I agree further pg_compresslog maintenance needs to examine changes
Here's only part of the reply I owe, but as to I/O error
checking ...
Here's a list of system calls and other external function/library calls
used in the pg_lesslog patch series, together with how the current patch
checks each error and how the current PostgreSQL source handles similar calls:
On Fri, 2007-04-20 at 10:16 +0200, Zeugswetter Andreas ADI SD wrote:
Your work in this area is extremely valuable and I hope my comments are
not discouraging.
I think it's too late in the day to make the changes suggested by
yourself and Tom. They make the patch more invasive and more likely to
Yup, this is a good summary.
You say you need to remove the optimization that avoids the logging of
a new tuple because the full page image exists.
I think we must already have the info in WAL which tuple inside the
full page image is new (the one for which we avoided the WAL entry
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
But you also turn off the optimization that avoids writing regular
WAL records when the info is already contained in a full-page image
(increasing the uncompressed size of WAL).
It was that part I questioned.
That's what bothers me about
On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote:
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
But you also turn off the optimization that avoids writing regular
WAL records when the info is already contained in a full-page image
(increasing the uncompressed size of WAL).
It was
Simon Riggs [EMAIL PROTECTED] writes:
On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote:
That's what bothers me about this patch, too. It will be increasing
the cost of writing WAL (more data - more CRC computation and more
I/O, not to mention more contention for the WAL locks) which
On Fri, 2007-04-13 at 11:47 -0400, Tom Lane wrote:
Simon Riggs [EMAIL PROTECTED] writes:
On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote:
That's what bothers me about this patch, too. It will be increasing
the cost of writing WAL (more data - more CRC computation and more
I/O, not to
Simon Riggs [EMAIL PROTECTED] writes:
Writing lots of additional code simply to remove a parameter that
*might* be mis-interpreted doesn't sound useful to me, especially when
bugs may leak in that way. My take is that this is simple and useful
*and* we have it now; other ways don't yet exist,
I don't fully understand what transaction log means. If it means
archived WAL, the current (8.2) code handles WAL as follows:
Probably we can define transaction log to be the part of WAL that is
not full pages.
1) If full_page_writes=off, then no full page writes will be
written to WAL,
Hi,
Sorry, inline reply.
Zeugswetter Andreas ADI SD wrote:
Yup, this is a good summary.
You say you need to remove the optimization that avoids
the logging of a new tuple because the full page image exists.
I think we must already have the info in WAL which tuple inside the full
page
Koichi Suzuki [EMAIL PROTECTED] writes:
For more information, when checkpoint interval is one hour, the amount
of the archived log size was as follows:
cp: 3.1GB
gzip: 1.5GB
pg_compresslog: 0.3GB
The notion that 90% of the WAL could be backup blocks even at very long
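The three archive sizes quoted above can be turned into reduction percentages relative to a plain cp. The following is a minimal sketch; the sizes are from the message, and the helper name reduction_pct is illustrative.

```python
# Archived log sizes (GB) for a one-hour checkpoint interval, as quoted above.
sizes_gb = {"cp": 3.1, "gzip": 1.5, "pg_compresslog": 0.3}


def reduction_pct(size_gb: float, baseline_gb: float) -> float:
    """Percentage by which size_gb is smaller than baseline_gb."""
    return (1 - size_gb / baseline_gb) * 100


baseline = sizes_gb["cp"]
reductions = {
    name: reduction_pct(size, baseline)
    for name, size in sizes_gb.items()
    if name != "cp"
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% smaller than cp")
```

With these figures gzip removes about 52 percent and pg_compresslog about 90 percent of the archived volume, which is where the "90%" in the reply comes from.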
The score below was taken based on 8.2 code, not 8.3 code. So I don't
think the below measure is introduced only in 8.3 code.
Tom Lane wrote:
Koichi Suzuki [EMAIL PROTECTED] writes:
For more information, when checkpoint interval is one hour, the amount
of the archived log size was as follows:
I don't fully understand what transaction log means. If it means
archived WAL, the current (8.2) code handles WAL as follows:
1) If full_page_writes=off, then no full page writes will be written to
WAL, except for those during online backup (between pg_start_backup and
pg_stop_backup). The
Koichi Suzuki [EMAIL PROTECTED] writes:
My proposal is to remove unnecessary full page writes (they are needed
in crash recovery from inconsistent or partial writes) when we copy WAL
to archive log and rebuild them as dummies when we restore from archive
log.
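The idea in the paragraph above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the pg_compresslog implementation: the Record type, the removable mark, STUB_SIZE, and the compress/decompress functions are all hypothetical, and real WAL records, CRCs, and XLOG_NOOP handling are not modelled.

```python
from dataclasses import dataclass
from typing import List

STUB_SIZE = 16  # hypothetical size of the placeholder kept in the archive


@dataclass(frozen=True)
class Record:
    kind: str                # "full_page", "logical", "stub" or "dummy"
    size: int                # bytes this record occupies in its stream
    removable: bool = False  # hypothetical mark set when the record is written
    orig_size: int = 0       # stubs only: size of the record they replaced


def compress(wal: List[Record]) -> List[Record]:
    """Archive side: swap each removable full-page record for a small stub."""
    out = []
    for rec in wal:
        if rec.kind == "full_page" and rec.removable:
            out.append(Record("stub", STUB_SIZE, orig_size=rec.size))
        else:
            out.append(rec)
    return out


def decompress(archive: List[Record]) -> List[Record]:
    """Restore side: expand each stub into a dummy record of the original size,
    so that the total stream length is the same as before archiving."""
    out = []
    for rec in archive:
        if rec.kind == "stub":
            out.append(Record("dummy", rec.orig_size))
        else:
            out.append(rec)
    return out


def total_size(records: List[Record]) -> int:
    return sum(r.size for r in records)


wal = [
    Record("full_page", 8192, removable=True),   # only needed for crash recovery
    Record("logical", 120),
    Record("full_page", 8192, removable=False),  # e.g. written during online backup
    Record("logical", 96),
]
archive = compress(wal)
restored = decompress(archive)

print(total_size(wal), total_size(archive), total_size(restored))
```

In this model the archive is smaller than the original stream, while the restored stream has the original length again because each dummy takes the place of the page image that was removed.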
...
Benchmark: DBT-2
Hi,
In the case below, we run DBT-2 benchmark for one hour to get the
measure. Checkpoint occurred three times (checkpoint interval was 20min).
For more information, when checkpoint interval is one hour, the amount
of the archived log size was as follows:
cp: 3.1GB
gzip:
In terms of idle time for gzip and the other commands used to archive WAL
offline, the environment was the same except for the command used to
archive. My guess is that because the user time is very large in gzip, the
scheduler has more chances to give resources to other processes. In
the
One fine day, Tue, 2007-04-10 at 18:17, Joshua D. Drake wrote:
In terms of idle time for gzip and the other commands used to archive WAL
offline, the environment was the same except for the command used to
archive. My guess is that because the user time is very large in gzip, it
has
On Tue, 2007-04-03 at 19:45 +0900, Koichi Suzuki wrote:
Bruce Momjian wrote:
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
Thank you very much for including. Attached is an update of the patch
according to
Hi,
I agree to put the patch to core and the others (pg_compresslog and
pg_decompresslog) to contrib/lesslog.
I will make separate materials to go to core and contrib.
As for patches, we have tested against pgbench, DBT-2 and our
proprietary benchmarks and it looked to run correctly.
Here's third revision of WAL archival optimization patch. GUC
parameter name was changed to wal_add_optimization_info.
Regards;
--
Koichi Suzuki
20070403_pg_lesslog.tar.gz
Description: application/gzip
Bruce Momjian wrote:
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
Thank you very much for including. Attached is an update of the patch
according to Simon Riggs's comment about GUC name.
Regards;
--
Koichi
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.
---
Tom Lane wrote:
Simon Riggs [EMAIL PROTECTED] writes:
Any page written during a backup has a backup block that would not be
removable by Koichi's tool, so yes, you'd still be safe.
How does it know not to do that?
regards, tom lane
Without a switch, because both full page writes and
corresponding logical log is included in WAL, this will
increase WAL size slightly (maybe about five percent or so).
If everybody is happy with this, we don't need a switch.
Sorry, I still don't understand that. What is the
On Fri, 2007-03-30 at 10:22 +0200, Zeugswetter Andreas ADI SD wrote:
Without a switch, because both full page writes and
corresponding logical log is included in WAL, this will
increase WAL size slightly (maybe about five percent or so).
If everybody is happy with this, we don't
Archive recovery needs the
normal xlog record, which in some cases has been optimised
away because the backup block is present, since the full
block already contains the changes.
Aah, I didn't know that optimization exists.
I agree that removing that optimization is good/ok.
Andreas
Simon Riggs wrote:
On Fri, 2007-03-30 at 10:22 +0200, Zeugswetter Andreas ADI SD wrote:
Without a switch, because both full page writes and
corresponding logical log is included in WAL, this will
increase WAL size slightly (maybe about five percent or so).
If everybody is happy with this,
On Fri, 2007-03-30 at 11:27 +0100, Richard Huxton wrote:
Is that always true? Could the backup not pick up a partially-written
page? Assuming it's being written to as the backup is in progress. (We
are talking about when disk blocks are smaller than PG blocks here, so
can't guarantee an
Simon Riggs wrote:
On Fri, 2007-03-30 at 11:27 +0100, Richard Huxton wrote:
Is that always true? Could the backup not pick up a partially-written
page? Assuming it's being written to as the backup is in progress. (We
are talking about when disk blocks are smaller than PG blocks here, so
Simon Riggs [EMAIL PROTECTED] writes:
Any page written during a backup has a backup block that would not be
removable by Koichi's tool, so yes, you'd still be safe.
How does it know not to do that?
regards, tom lane
On Fri, 2007-03-30 at 16:35 -0400, Tom Lane wrote:
Simon Riggs [EMAIL PROTECTED] writes:
Any page written during a backup has a backup block that would not be
removable by Koichi's tool, so yes, you'd still be safe.
How does it know not to do that?
Not sure what you mean, but I'll take a
Hi, here is some feedback on the comments:
Simon Riggs wrote:
On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote:
As written below, full page write can be
categorized as follows:
1) Needed for crash recovery: first page update after each checkpoint.
This has to be kept in WAL.
2) Needed
On Thu, 2007-03-29 at 17:50 +0900, Koichi Suzuki wrote:
Not only full-page-writes are written as WAL record. In my proposal,
both full-page-writes and logical log are written in a WAL record, which
will make WAL size slightly bigger (five percent or so). If
full_page_compress = off,
Simon,
OK, different question:
Why would anyone ever set full_page_compress = off?
The only reason I can see is if compression costs us CPU but gains RAM
I/O. I can think of a lot of applications ... benchmarks included ...
which are CPU-bound but not RAM or I/O bound. For those
On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:
OK, different question:
Why would anyone ever set full_page_compress = off?
The only reason I can see is if compression costs us CPU but gains RAM
I/O. I can think of a lot of applications ... benchmarks included ...
which are
Josh;
I'd like to explain what the term compression in my proposal means
again and would like to show the resource consumption comparison with
cp and gzip.
My proposal is to remove unnecessary full page writes (they are needed
in crash recovery from inconsistent or partial writes) when we
Hi,
Here's a patch reflecting some of Simon's comments.
1) Removed an elog call in a critical section.
2) Changed the name of the commands, pg_compresslog and pg_decompresslog.
3) Changed diff option to make a patch.
--
Koichi Suzuki
pg_lesslog.tgz
Description: Binary data
On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote:
As written below, full page write can be
categorized as follows:
1) Needed for crash recovery: first page update after each checkpoint.
This has to be kept in WAL.
2) Needed for archive recovery: page update between pg_start_backup
Simon;
Thanks a lot for your comments and advice. I'd like to write some feedback.
Simon Riggs wrote:
On Tue, 2007-03-27 at 11:52 +0900, Koichi Suzuki wrote:
Here's an update of a code to improve full page writes as proposed in
http://archives.postgresql.org/pgsql-hackers/2007-01/msg01491.php