Re: [HACKERS] Incremental checkpoints

Greg Smith Fri, 29 Jul 2011 12:03:35 -0700

On 07/29/2011 11:04 AM, [email protected] wrote:

> I think that current implementation of checkpoints is not good for huge
> shared buffer cache and for many WAL segments. If there is more buffers
> and if buffers can be written rarely more updates of buffers can be
> combined so total number of writes to disk will be significantly less. I
> think that incremental checkpoints can achieve this goal (maybe more) and
> price is additional memory (about 1/1000 of size of buffer cache).

The current code optimizes for buffers that are written frequently. Those will sit in shared_buffers and, in the hoped-for case, only be written once at checkpoint time.

There are two issues with adopting incremental checkpoints instead, one fundamental, the other solvable but not started on yet:

1) Postponing writes as long as possible always improves the resulting throughput of those writes. Any incremental checkpoint approach will detune throughput by some amount. If you make writes go out more often, they will be less efficient; that's just how things work if you benchmark anything that allows write combining. Any incremental checkpoint approach is likely to improve latency in some cases if it works well, while decreasing throughput in most cases.

2) The incremental checkpoint approach used by other databases, such as the MySQL implementation, works by tracking what transaction IDs were associated with a buffer update. The current way PostgreSQL saves buffer sync information for the checkpoint to process things doesn't store enough information to do that. As you say, the main price there is some additional memory.
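
To make the trade-off in the two points above concrete, here is a toy sketch. It is not PostgreSQL or MySQL code, and the names count_page_writes, first_dirty and max_age are made up for illustration. Each dirty page remembers the position in the update stream where it was first dirtied, standing in for the transaction ID tracking mentioned in (2), and a page is written out once that oldest unflushed change is max_age updates old. A small max_age behaves like an aggressive incremental flusher; a very large one behaves like writing everything once at checkpoint time.

```python
from collections import OrderedDict


def count_page_writes(updates, max_age):
    """Replay a stream of page updates and count page writes.

    updates: iterable of page ids, one per update (hypothetical workload).
    max_age: a dirty page is flushed once its oldest unflushed update is
             at least this many updates old (assumed flushing policy).
    """
    # page id -> sequence number of its first unflushed update.
    # Insertion order equals age order, so the first entry is the oldest.
    first_dirty = OrderedDict()
    writes = 0
    for seq, page in enumerate(updates):
        # Flush every page whose oldest unflushed change has aged out.
        while first_dirty:
            oldest_page, oldest_seq = next(iter(first_dirty.items()))
            if seq - oldest_seq < max_age:
                break
            del first_dirty[oldest_page]
            writes += 1
        # Only the first update of a clean page is recorded; further
        # updates to an already dirty page are combined into one write.
        first_dirty.setdefault(page, seq)
    # Final flush of whatever is still dirty at the end of the run.
    return writes + len(first_dirty)


# Page 0 is hot; pages 1, 2 and 3 are touched less often.
updates = [0, 1, 0, 2, 0, 1, 0, 3]
for max_age in (1, 3, len(updates)):
    print(max_age, count_page_writes(updates, max_age))
```

On this tiny workload the write count drops from 8 to 6 to 4 as max_age grows from 1 to 3 to 8: the longer a page is allowed to stay dirty, the more updates each write absorbs. The extra memory is the one sequence number kept per dirty page.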

From my perspective, the main problem with plans to tweak the checkpoint code is that we don't have a really good benchmark that tracks both throughput and latency to test proposed changes against. Mark Wong has been working to get his TPC-E clone DBT-5 running regularly for that purpose, and last I heard that was basically done at this point--he's running daily tests now. There's already a small pile of patches that adjust checkpoint behavior that were postponed from being included in 9.1, mainly because it was hard to prove they were useful given the benchmark used to test them, pgbench. I have higher hopes for DBT-5 as being a test that gives informative data in this area. I would want to go back and revisit the existing patches (sorted checkpoints, spread sync) before launching into this whole new area. I don't think any of those has even been proven not to work; they just didn't help the slightly unrealistic pgbench write-heavy workload.


--
Greg Smith   2ndQuadrant US    [email protected]   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


