Sorry, because of so many comments/questions, I'll write inline....

Josh Berkus wrote:

Writing lots of additional code simply to remove a parameter that
*might* be mis-interpreted doesn't sound useful to me, especially when
bugs may leak in that way. My take is that this is simple and useful
*and* we have it now; other ways don't yet exist, nor will they in time
for 8.3.

How about naming the parameter wal_compressable? That would indicate pretty clearly that the parameter is intended to be used with wal_compress and nothing else.

Hmm, it sounds nicer.

However, I do agree with Andreas that anything which adds to WAL volume, even 3%, seems like going in the wrong direction. We already have higher log output than any comparable database (higher than InnoDB by 3x) and we should be looking for output to trim as well as compression.

So the relevant question is whether the patch in its current form provides enough benefit to make it worthwhile for 8.3, or whether we should wait for 8.4. Questions:

Before answering questions below, I'd like to say that archive log optimization has to be address different point of views to the current (upto 8.2) settings.

1) To deal with partial/inconsisitent write to the data file at crash recovery, we need full page writes at the first modification to pages after each checkpoint. It consumes much of WAL space.

2) 1) is not necessary for archive recovery (PITR) and full page writes can be removed for this purpose. However, we need full page writes during hot backup to deal with partial writes by backup commands. This is implemented in 8.2.

3) To maintain crash recovery chance and reduce the amount of archive log, removal of unnecessary full page writes from archive logs is a good choice. To do this, we need both logical log and full page writes in WAL.

I don't think there should be only one setting. It depend on how database is operated. Leaving wal_add_optiomization_info = off default does not bring any change in WAL and archive log handling. I understand some people may not be happy with additional 3% or so increase in WAL size, especially people who dosn't need archive log at all. So I prefer to leave the default off.

For users, I think this is simple enough:

1) For people happy with 8.2 settings:
   No change is needed to move to 8.3 and there's really no change.

2) For people who need to reduce archive log size but like to leave full page writes to WAL (to maintain crash recovery chance):
   a) Add GUC parameter: wal_add_optiomization_info=on
   b) Change archive command from "cp" to "pg_compresslog"
   c) Change restore command from "cp" to "pg_decompresslog"

Archive log can be stored and restored as done in older releases.

1) is there any throughput benefit for platforms with fast CPU but contrained I/O (e.g. 2-drive webservers)? Any penalty for servers with plentiful I/O?

I've only run benchmarks with archive process running, because wal_add_optimization_info=on does not make sense if we don't archive WAL. In this situation, total I/O decreases because writes to archive log decreases. Because of 3% or so increase in WAL size, there will be increase in WAL write, but decrease in archive writes makes it up.

2) Will this patch make attempts to reduce WAL volume in the future significantly harder?

Yes, I'd like to continue to work to reduce the WAL size. It's still an issue when database size becomes several handreds of gigabytes in size. Anyway, I think WAL size reduction has to be done in XLogInsert() or XLogWrite(). We need much more discussion for this. The issue will be how to maintain crash recovery chance by inconsistent writes (by full_page_writes=off, we have to give it up). On the other hand we have to keep examining each WAL record.

3) How is this better than command-line compression for log-shipping? e.g. why do we need it in the database?

I don't fully understand what command-line compression means. Simon suggested that this patch can be used with log-shipping and I agree. If we compare compression with gzip or other general purpose compression, compression ratio, CPU usage and I/O by pg_compresslog are all quite better than those in gzip.

Please let me know if you intended defferently.


Koichi Suzuki

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not

Reply via email to