Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-09-25 Thread Simon Riggs
On Tue, 2007-04-10 at 16:23 +0900, Koichi Suzuki wrote: Here're two patches for 1) lesslog_core.patch, patch for core, to set a mark to the log record to be removed in archiving, 2) lesslog_contrib.patch, patch for contrib/lesslog, pg_compresslog and pg_decompresslog, respectively,

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-05-21 Thread Koichi Suzuki
I really appreciate the modification. I also believe XLOG_NOOP is a cool way to keep the XLOG format consistent. I'll continue to write code to produce incremental log records from the full page writes, as well as to maintain the CRC, XLOG_NOOP and other XLOG locations. I also found that you've

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-05-20 Thread Tom Lane
Koichi Suzuki [EMAIL PROTECTED] writes: As replied to Patch queue triage by Tom, here's a simplified patch to mark WAL records as compressible, with no increase in WAL itself. Compression/decompression commands will be posted separately to PgFoundry for further review. Applied with some minor

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-05-16 Thread Bruce Momjian
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-05-07 Thread Koichi Suzuki
Hi, As replied to Patch queue triage by Tom, here's a simplified patch to mark WAL records as compressible, with no increase in WAL itself. Compression/decompression commands will be posted separately to PgFoundry for further review. --- As suggested by Tom, I agree

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-26 Thread Koichi Suzuki
Josh, Josh Berkus wrote: Koichi, Andreas, 1) To deal with partial/inconsistent writes to the data file at crash recovery, we need full page writes at the first modification to pages after each checkpoint. It consumes much of WAL space. We need to find a way around this someday. Other DBs

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-25 Thread Zeugswetter Andreas ADI SD
1) To deal with partial/inconsistent writes to the data file at crash recovery, we need full page writes at the first modification to pages after each checkpoint. It consumes much of WAL space. We need to find a way around this someday. Other DBs don't do this; it may be because

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-25 Thread Kenneth Marshall
On Wed, Apr 25, 2007 at 10:00:16AM +0200, Zeugswetter Andreas ADI SD wrote: 1) To deal with partial/inconsistent writes to the data file at crash recovery, we need full page writes at the first modification to pages after each checkpoint. It consumes much of WAL space. We need

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-25 Thread Josh Berkus
Andreas, Writing to a different area was considered in pg, but there were more negative issues than positive. So imho pg_compresslog is the correct path forward. The current discussion is only about whether we want a more complex pg_compresslog and no change to current WAL, or an increased

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-25 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: Andreas, So imho pg_compresslog is the correct path forward. The current discussion is only about whether we want a more complex pg_compresslog and no change to current WAL, or an increased WAL size for a less complex implementation. Both would be able

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-25 Thread Koichi Suzuki
Hi, Zeugswetter Andreas ADI SD wrote: I don't insist on the name and the default of the GUC parameter. I'm afraid wal_fullpage_optimization = on (default) causes some confusion because the default behavior becomes a bit different on WAL itself. Seems my wal_fullpage_optimization is not a good

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-24 Thread Zeugswetter Andreas ADI SD
3) To maintain crash recovery chance and reduce the amount of archive log, removal of unnecessary full page writes from archive logs is a good choice. Definitely, yes. pg_compresslog could even move the full pages written during backup out of WAL and put them in a different file that needs

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-24 Thread Josh Berkus
Koichi, Andreas, 1) To deal with partial/inconsistent writes to the data file at crash recovery, we need full page writes at the first modification to pages after each checkpoint. It consumes much of WAL space. We need to find a way around this someday. Other DBs don't do this; it may be

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-24 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: Well, as a PG hacker I find the name wal_fullpage_optimization quite baffling and I think our general user base will find it even more so. Now that I have Koichi's explanation of the problem, I vote for simply slaving this to the PITR settings and not

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-23 Thread Zeugswetter Andreas ADI SD
I don't insist on the name and the default of the GUC parameter. I'm afraid wal_fullpage_optimization = on (default) causes some confusion because the default behavior becomes a bit different on WAL itself. Seems my wal_fullpage_optimization is not a good name if it caused misinterpretation

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-23 Thread Josh Berkus
Hackers, Writing lots of additional code simply to remove a parameter that *might* be mis-interpreted doesn't sound useful to me, especially when bugs may leak in that way. My take is that this is simple and useful *and* we have it now; other ways don't yet exist, nor will they in time for

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-23 Thread Koichi Suzuki
Hi, Sorry, because of so many comments/questions, I'll write inline Josh Berkus wrote: Hackers, Writing lots of additional code simply to remove a parameter that *might* be mis-interpreted doesn't sound useful to me, especially when bugs may leak in that way. My take is that this is

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-22 Thread Koichi Suzuki
Hi, I don't insist on the name and the default of the GUC parameter. I'm afraid wal_fullpage_optimization = on (default) causes some confusion because the default behavior becomes a bit different on WAL itself. I'd like to have some more opinions on this. Zeugswetter Andreas ADI SD wrote: With

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-20 Thread Koichi Suzuki
Sorry I was very late to find this. With DBT-2 benchmark, I've already compared the amount of WAL. The result was as follows:
Amount of WAL after 60min. run of DBT-2 benchmark
  wal_add_optimization_info = off (default)  3.13GB
  wal_add_optimization_info = on (new case)  3.17GB - can be
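To put the two figures quoted above in proportion, the relative WAL growth can be worked out directly. This is only an illustrative calculation from the numbers in the message; the variable and function names are made up for the example and are not part of the patch.

```python
# WAL volume after a 60-minute DBT-2 run, in GB, as reported in the message.
wal_off_gb = 3.13  # wal_add_optimization_info = off (default)
wal_on_gb = 3.17   # wal_add_optimization_info = on (new case)


def relative_increase(base, new):
    """Return the growth of `new` over `base` as a fraction of `base`."""
    return (new - base) / base


overhead = relative_increase(wal_off_gb, wal_on_gb)
print(f"Extra WAL with the option on: {overhead:.1%}")
```

On these numbers the extra WAL is a little over one percent, which is lower than the "five percent or so" estimate mentioned elsewhere in the thread.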

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-20 Thread Zeugswetter Andreas ADI SD
With DBT-2 benchmark, I've already compared the amount of WAL. The result was as follows: Amount of WAL after 60min. run of DBT-2 benchmark wal_add_optimization_info = off (default) 3.13GB how about wal_fullpage_optimization = on (default) wal_add_optimization_info = on (new case)

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-20 Thread Koichi Suzuki
Hi, I agree that pg_compresslog should be aware of all the WAL records' details so that it can optimize archive log safely. In my patch, I've examined 8.2's WAL records to make pg_compresslog/pg_decompresslog safe. Also I agree further pg_compresslog maintenance needs to examine changes

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-20 Thread Koichi Suzuki
Here's only a part of the reply I should give, but as to I/O error checking ... Here's a list of system calls and other external function/library calls used in the pg_lesslog patch series, together with how the current patch checks each error and how the current PostgreSQL source handles similar calls:

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-20 Thread Simon Riggs
On Fri, 2007-04-20 at 10:16 +0200, Zeugswetter Andreas ADI SD wrote: Your work in this area is extremely valuable and I hope my comments are not discouraging. I think it's too late in the day to make the changes suggested by yourself and Tom. They make the patch more invasive and more likely to

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Zeugswetter Andreas ADI SD
Yup, this is a good summary. You say you need to remove the optimization that avoids the logging of a new tuple because the full page image exists. I think we must already have the info in WAL which tuple inside the full page image is new (the one for which we avoided the WAL entry

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Tom Lane
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: But you also turn off the optimization that avoids writing regular WAL records when the info is already contained in a full-page image (increasing the uncompressed size of WAL). It was that part I questioned. That's what bothers me about

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Simon Riggs
On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote: Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: But you also turn off the optimization that avoids writing regular WAL records when the info is already contained in a full-page image (increasing the uncompressed size of WAL). It was

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote: That's what bothers me about this patch, too. It will be increasing the cost of writing WAL (more data - more CRC computation and more I/O, not to mention more contention for the WAL locks) which

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Simon Riggs
On Fri, 2007-04-13 at 11:47 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote: That's what bothers me about this patch, too. It will be increasing the cost of writing WAL (more data - more CRC computation and more I/O, not to

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: Writing lots of additional code simply to remove a parameter that *might* be mis-interpreted doesn't sound useful to me, especially when bugs may leak in that way. My take is that this is simple and useful *and* we have it now; other ways don't yet exist,

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-12 Thread Zeugswetter Andreas ADI SD
I don't fully understand what transaction log means. If it means archived WAL, the current (8.2) code handles WAL as follows: Probably we can define transaction log to be the part of WAL that is not full pages. 1) If full_page_writes=off, then no full page writes will be written to WAL,

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-12 Thread Koichi Suzuki
Hi, Sorry, inline reply. Zeugswetter Andreas ADI SD wrote: Yup, this is a good summary. You say you need to remove the optimization that avoids the logging of a new tuple because the full page image exists. I think we must already have the info in WAL which tuple inside the full page

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-11 Thread Tom Lane
Koichi Suzuki [EMAIL PROTECTED] writes: For more information, when checkpoint interval is one hour, the amount of the archived log size was as follows: cp: 3.1GB gzip: 1.5GB pg_compresslog: 0.3GB The notion that 90% of the WAL could be backup blocks even at very long
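The "90%" figure discussed here can be checked against the archive sizes quoted above. The following is only a sketch of that arithmetic using the reported numbers; the dictionary and names are illustrative and do not come from the patch.

```python
# Archived log sizes in GB with a one-hour checkpoint interval,
# as quoted in the message, keyed by the archive command used.
archive_gb = {"cp": 3.1, "gzip": 1.5, "pg_compresslog": 0.3}

baseline = archive_gb["cp"]  # plain copy, i.e. the unmodified WAL volume

# Fraction of the plain-copy volume that each command removes.
reduction = {cmd: (baseline - size) / baseline for cmd, size in archive_gb.items()}

for cmd, frac in reduction.items():
    print(f"{cmd}: {frac:.0%} smaller than cp")
```

If pg_compresslog removes only full page images, the pg_compresslog row gives the share of the archived WAL that those images account for in this run.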

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-11 Thread Koichi Suzuki
The score below was taken based on 8.2 code, not 8.3 code. So I don't think the below measure is introduced only in 8.3 code. Tom Lane wrote: Koichi Suzuki [EMAIL PROTECTED] writes: For more information, when checkpoint interval is one hour, the amount of the archived log size was as follows:

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-11 Thread Koichi Suzuki
I don't fully understand what transaction log means. If it means archived WAL, the current (8.2) code handles WAL as follows: 1) If full_page_writes=off, then no full page writes will be written to WAL, except for those during online backup (between pg_start_backup and pg_stop_backup). The

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-10 Thread Tom Lane
Koichi Suzuki [EMAIL PROTECTED] writes: My proposal is to remove unnecessary full page writes (they are needed in crash recovery from inconsistent or partial writes) when we copy WAL to archive log and rebuild them as a dummy when we restore from archive log. ... Benchmark: DBT-2

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-10 Thread Koichi Suzuki
Hi, In the case below, we ran the DBT-2 benchmark for one hour to get the measure. Checkpoint occurred three times (checkpoint interval was 20min). For more information, when checkpoint interval is one hour, the amount of the archived log size was as follows: cp: 3.1GB gzip:

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-10 Thread Joshua D. Drake
In terms of idle time for gzip and other command to archive WAL offline, no difference in the environment was given other than the command to archive. My guess is because the user time is very large in gzip, it has more chance for scheduler to give resource to other processes. In the

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-10 Thread Hannu Krosing
On Tue, 2007-04-10 at 18:17, Joshua D. Drake wrote: In terms of idle time for gzip and other command to archive WAL offline, no difference in the environment was given other than the command to archive. My guess is because the user time is very large in gzip, it has

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-05 Thread Simon Riggs
On Tue, 2007-04-03 at 19:45 +0900, Koichi Suzuki wrote: Bruce Momjian wrote: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches Thank you very much for including. Attached is an update of the patch according to

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-05 Thread Koichi Suzuki
Hi, I agree to put the patch to core and the others (pg_compresslog and pg_decompresslog) to contrib/lesslog. I will make separate materials to go to core and contrib. As for patches, we have tested against pgbench, DBT-2 and our proprietary benchmarks and it looked to run correctly.

Re: [PATCHES] [HACKERS] Full page writes improvement, code update again.

2007-04-03 Thread Koichi Suzuki
Here's the third revision of the WAL archival optimization patch. GUC parameter name was changed to wal_add_optimization_info. Regards; -- Koichi Suzuki 20070403_pg_lesslog.tar.gz Description: application/gzip

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-03 Thread Koichi Suzuki
Bruce Momjian wrote: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches Thank you very much for including. Attached is an update of the patch according to Simon Riggs's comment about GUC name. Regards; -- Koichi

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-03 Thread Bruce Momjian
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-02 Thread Bruce Momjian
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it.

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-01 Thread Koichi Suzuki
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Any page written during a backup has a backup block that would not be removable by Koichi's tool, so yes, you'd still be safe. How does it know not to do that? regards, tom lane

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Zeugswetter Andreas ADI SD
Without a switch, because both full page writes and the corresponding logical log are included in WAL, this will increase WAL size slightly (maybe about five percent or so). If everybody is happy with this, we don't need a switch. Sorry, I still don't understand that. What is the

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Simon Riggs
On Fri, 2007-03-30 at 10:22 +0200, Zeugswetter Andreas ADI SD wrote: Without a switch, because both full page writes and the corresponding logical log are included in WAL, this will increase WAL size slightly (maybe about five percent or so). If everybody is happy with this, we don't

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Zeugswetter Andreas ADI SD
Archive recovery needs the normal xlog record, which in some cases has been optimised away because the backup block is present, since the full block already contains the changes. Aah, I didn't know that optimization exists. I agree that removing that optimization is good/ok. Andreas

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Richard Huxton
Simon Riggs wrote: On Fri, 2007-03-30 at 10:22 +0200, Zeugswetter Andreas ADI SD wrote: Without a switch, because both full page writes and the corresponding logical log are included in WAL, this will increase WAL size slightly (maybe about five percent or so). If everybody is happy with this,

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Simon Riggs
On Fri, 2007-03-30 at 11:27 +0100, Richard Huxton wrote: Is that always true? Could the backup not pick up a partially-written page? Assuming it's being written to as the backup is in progress. (We are talking about when disk blocks are smaller than PG blocks here, so can't guarantee an

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Richard Huxton
Simon Riggs wrote: On Fri, 2007-03-30 at 11:27 +0100, Richard Huxton wrote: Is that always true? Could the backup not pick up a partially-written page? Assuming it's being written to as the backup is in progress. (We are talking about when disk blocks are smaller than PG blocks here, so

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes: Any page written during a backup has a backup block that would not be removable by Koichi's tool, so yes, you'd still be safe. How does it know not to do that? regards, tom lane

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Simon Riggs
On Fri, 2007-03-30 at 16:35 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Any page written during a backup has a backup block that would not be removable by Koichi's tool, so yes, you'd still be safe. How does it know not to do that? Not sure what you mean, but I'll take a

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Koichi Suzuki
Hi, Here's some feedback on the comments: Simon Riggs wrote: On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote: As written below, full page writes can be categorized as follows: 1) Needed for crash recovery: first page update after each checkpoint. This has to be kept in WAL. 2) Needed

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 17:50 +0900, Koichi Suzuki wrote: Not only full-page-writes are written as WAL record. In my proposal, both full-page-writes and logical log are written in a WAL record, which will make WAL size slightly bigger (five percent or so). If full_page_compress = off,

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Josh Berkus
Simon, OK, different question: Why would anyone ever set full_page_compress = off? The only reason I can see is if compression costs us CPU but gains RAM and I/O. I can think of a lot of applications ... benchmarks included ... which are CPU-bound but not RAM or I/O bound. For those

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Simon Riggs
On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote: OK, different question: Why would anyone ever set full_page_compress = off? The only reason I can see is if compression costs us CPU but gains RAM and I/O. I can think of a lot of applications ... benchmarks included ... which are

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Koichi Suzuki
Josh; I'd like to explain again what the term compression in my proposal means, and would like to show the resource consumption comparison with cp and gzip. My proposal is to remove unnecessary full page writes (they are needed in crash recovery from inconsistent or partial writes) when we

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-29 Thread Koichi Suzuki
Hi, Here's a patch reflecting some of Simon's comments. 1) Removed an elog call in a critical section. 2) Changed the name of the commands, pg_compresslog and pg_decompresslog. 3) Changed diff option to make a patch. -- Koichi Suzuki pg_lesslog.tgz Description: Binary data

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote: As written below, full page writes can be categorized as follows: 1) Needed for crash recovery: first page update after each checkpoint. This has to be kept in WAL. 2) Needed for archive recovery: page update between pg_start_backup

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-27 Thread Koichi Suzuki
Simon; Thanks a lot for your comments/advice. I'd like to write some feedback. Simon Riggs wrote: On Tue, 2007-03-27 at 11:52 +0900, Koichi Suzuki wrote: Here's an update of the code to improve full page writes as proposed in http://archives.postgresql.org/pgsql-hackers/2007-01/msg01491.php