Re: [PATCHES] Load Distributed Checkpoints, revised patch

Simon Riggs Sun, 17 Jun 2007 01:44:06 -0700

On Sun, 2007-06-17 at 08:51 +0100, Heikki Linnakangas wrote:

> > We don't really care about units because
> > the way you use it is to nudge it up a little and see if that works
> > etc.
> 
> Not necessarily. If it's given in KB/s, you might very well have an idea 
> of how much I/O your hardware is capable of, and set aside a fraction of 
> that for checkpoints.


I'm worried that people will think they can calculate the setting
without testing.

I guess with the right caveats in the docs about the need for testing to
ensure the values are suitable for your situation, I can accept KB/s. 

> > Can we avoid having another parameter? There must be some protection in
> > there to check that a checkpoint lasts for no longer than
> > checkpoint_timeout, so it makes most sense to vary the checkpoint in
> > relation to that parameter.
> 
> Sure, that's what checkpoint_write_percent is for. 

Yeah, I didn't understand the name.

> checkpoint_rate can 
> be used to finish the checkpoint faster, if there's not much work to do. 
> For example, if there's only 10 pages to flush in a checkpoint, 
> checkpoint_timeout is 30 minutes and checkpoint_write_percent = 50%, you 
> don't want to spread out those 10 writes over 15 minutes, that would be 
> just silly. checkpoint_rate sets the *minimum* rate used to write. If 
> writing at that minimum rate isn't enough to finish the checkpoint in 
> time, as defined by checkpoint interval * checkpoint_write_percent, 
> we write more aggressively.
> 
> I'm more interested in checkpoint_write_percent myself as well, but Greg 
> Smith said he wanted the checkpoint to use a constant I/O rate and let 
> the length of the checkpoint vary.

Having both parameters is good.
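
To make the interaction between the two settings concrete, here is a minimal
sketch of the rule as Heikki describes it: checkpoint_rate is a floor, and the
deadline derived from checkpoint_timeout * checkpoint_write_percent can only
push the rate higher. This is not code from the patch. The function name, its
arguments, and the choice of pages per second as the unit are assumptions made
for illustration only.

```python
import math


def pages_per_second(pages_left, elapsed_s, checkpoint_timeout_s,
                     checkpoint_write_percent, checkpoint_rate_pages_s):
    """Return the write rate (pages/s) for the rest of a checkpoint.

    Hypothetical helper: the names mirror the settings discussed in the
    thread, but the units and signature are assumptions, not the patch.
    """
    if pages_left <= 0:
        return 0.0  # nothing left to write

    # Time budget for the write phase: interval * write percent.
    budget_s = checkpoint_timeout_s * checkpoint_write_percent / 100.0
    remaining_s = budget_s - elapsed_s
    if remaining_s <= 0:
        # Budget already used up: stop throttling altogether.
        return math.inf

    # Rate needed to finish exactly at the deadline.
    needed = pages_left / remaining_s

    # checkpoint_rate is a minimum; the deadline can only raise the rate.
    return max(checkpoint_rate_pages_s, needed)


# Heikki's example: 10 pages, 30 minute timeout, 50% write percent.
# The deadline alone would allow 15 minutes, but the floor applies.
small = pages_per_second(10, 0, 1800, 50, 100.0)

# A large checkpoint: the floor is too slow, so the deadline wins.
large = pages_per_second(900000, 0, 1800, 50, 100.0)
```

With an assumed floor of 100 pages/s, the 10-page checkpoint is written at the
floor and finishes almost at once, while the 900000-page checkpoint needs
1000 pages/s to finish inside its 900 second budget.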

I'm really impressed with the results on the response time graphs.

> >> - The signaling between RequestCheckpoint and bgwriter is a bit tricky. 
> >> Bgwriter now needs to deal with immediate checkpoint requests, like those 
> >> coming from explicit CHECKPOINT or CREATE DATABASE commands, differently 
> >> from those triggered by checkpoint_segments. I'm afraid there might be 
> >> race conditions when a CHECKPOINT is issued at the same instant as 
> >> checkpoint_segments triggers one. What might happen then is that the 
> >> checkpoint is performed lazily, spreading the writes, and the CHECKPOINT 
> >> command has to wait for that to finish which might take a long time. I 
> >> have not been able to convince myself either that the race condition 
> >> exists or that it doesn't.
> > 
> > Is there a mechanism for requesting immediate/non-immediate checkpoints?
> 
> No, CHECKPOINT requests an immediate one. Is there a use case for 
> CHECKPOINT LAZY?

I meant via the CreateCheckpoint API etc.
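
One way such a request interface could look, sketched here only to illustrate
the question: each requester passes an "immediate" flag, and the flag is sticky,
so a lazy request that coincides with an explicit CHECKPOINT is treated as
immediate. The class and method names below are hypothetical, and nothing here
is claimed to match how the patch or bgwriter actually signals requests.

```python
import threading


class CheckpointRequests:
    """Hypothetical request slot shared by backends and the writer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = False
        self._immediate = False

    def request(self, immediate):
        """Called by a backend to ask for a checkpoint."""
        with self._lock:
            self._pending = True
            # Sticky: once anyone asks for immediate, it stays immediate
            # until the writer consumes the request.
            self._immediate = self._immediate or immediate

    def take(self):
        """Called by the writer when it starts a checkpoint.

        Returns (pending, immediate) and clears both flags.
        """
        with self._lock:
            result = (self._pending, self._immediate)
            self._pending = False
            self._immediate = False
            return result

    def immediate_requested(self):
        """Polled by the writer between write batches of a lazy checkpoint."""
        with self._lock:
            return self._immediate


# A segment-triggered (lazy) request and an explicit CHECKPOINT arrive
# before the writer looks: the merged request is immediate.
reqs = CheckpointRequests()
reqs.request(immediate=False)
reqs.request(immediate=True)
merged = reqs.take()
```

If the immediate request arrives after the writer has already started a lazy
checkpoint, polling immediate_requested() between batches would let the writer
drop its throttling instead of making the CHECKPOINT command wait out the full
spread.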

> > pg_start_backup() should be a normal checkpoint I think. No need for
> > backup to be an intrusive process.
> 
> Good point. A spread out checkpoint can take a long time to finish, 
> though. Is there a risk of running into a timeout or something if it 
> takes say 10 minutes for a call to pg_start_backup to finish?

That would be annoying, but the alternative is for backups to seriously
affect performance, which would defeat the object of the HOT backup.
It's not like it's immediate right now, so we'd probably be moving from
2-3 mins to 10 mins in your example. Most people are expecting their
backups to take a long time anyway, so that's OK.

> > Do we need to know it so exactly that we look
> > at WALInsertLock? Maybe use info_lck to request the latest page, since
> > that is less heavily contended and we need never wait across I/O.
> 
> Is there such a value available, that's protected by just info_lck? I 
> can't see one.

XLogCtl->LogwrtRqst.Write

-- 
  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com


