Re: [HACKERS] Controlling Load Distributed Checkpoints

Heikki Linnakangas Fri, 08 Jun 2007 01:54:55 -0700

Greg Smith wrote:

On Thu, 7 Jun 2007, Heikki Linnakangas wrote:
So there's two extreme ways you can use LDC:
1. Finish the checkpoint as soon as possible, without disturbing other activity too much
2. Disturb other activity as little as possible, as long as the checkpoint finishes in a reasonable time.

Are both interesting use cases, or is it enough to cater for just one of them? I think 2 is easier to tune.

The motivation for the (1) case is that you've got a system that's dirtying the buffer cache very fast in normal use, where even the background writer is hard pressed to keep the buffer pool clean. The checkpoint is the most powerful and efficient way to clean up many dirty buffers out of such a buffer cache in a short period of time so that you're back to having room to work in again. In that situation, since there are many buffers to write out, you'll also be suffering greatly from fsync pauses. Being able to synchronize writes a little better with the underlying OS to smooth those out is a huge help.

ISTM the bgwriter just isn't working hard enough in that scenario. Assuming we get the lru autotuning patch in 8.3, do you think there's still merit in using the checkpoints that way?

I'm completely biased because of the workloads I've been dealing with recently, but I consider (2) so much easier to tune for that it's barely worth worrying about. If your system is so underloaded that you can let the checkpoints take their own sweet time, I'd ask if you have enough going on that you're suffering very much from checkpoint performance issues anyway. I'm used to being in a situation where if you don't push out checkpoint data as fast as physically possible, you end up fighting with the client backends for write bandwidth once the LRU point moves past where the checkpoint has written out to already. I'm not sure how much always running the LRU background writer will improve that situation.

I'd think it eliminates the problem. Assuming we keep the LRU cleaning running as usual, I don't see how writing faster during checkpoints could ever be beneficial for concurrent activity. The more you write, the less bandwidth there's available for others.

Doing the checkpoint as quickly as possible might be slightly better for average throughput, but that's a different matter.

On every system I've ever played with Postgres write performance on, I discovered that the memory-based parameters like dirty_background_ratio were really driving write behavior, and I almost ignore the expire timeout now. Plotting the "Dirty:" value in /proc/meminfo as you're running tests is extremely informative for figuring out what Linux is really doing underneath the database writes.
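
For anyone who wants to try the "Dirty:" plotting suggested above, here is a minimal sketch in Python. The function names parse_dirty_kb and sample_dirty are made up for illustration and are not from any patch or tool in this thread; it only assumes the usual Linux /proc/meminfo layout, where the line looks like "Dirty:   1234 kB".

```python
import time

def parse_dirty_kb(meminfo_text):
    """Return the 'Dirty:' value in kB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("Dirty:"):
            # Line format: "Dirty:     1234 kB"; the second field is the value.
            return int(line.split()[1])
    return None

def sample_dirty(interval_s=1.0, samples=10, path="/proc/meminfo"):
    """Yield (elapsed seconds, dirty kB) pairs. Linux only (hypothetical helper)."""
    start = time.time()
    for _ in range(samples):
        with open(path) as f:
            yield time.time() - start, parse_dirty_kb(f.read())
        time.sleep(interval_s)
```

Looping over sample_dirty() during a test run and printing each pair gives two columns that can be fed to any plotting tool.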

Interesting. I haven't touched any of the kernel parameters yet in my tests. It seems we need to try different parameters and see how the dynamics change. But we must also keep in mind that the average DBA doesn't change any settings, and might not even be able or allowed to. That means the defaults should work reasonably well without tweaking the OS settings.

The influence of the congestion code is why I made the comment about watching how long writes are taking to gauge how fast you can dump data onto the disks. When you're suffering from one of the congestion mechanisms, the initial writes start blocking, even before the fsync. That behavior is almost undocumented outside of the relevant kernel source code.

Yeah, that's controlled by dirty_ratio, if I've understood the parameters correctly. If we spread out the writes enough, we shouldn't hit that limit or congestion. That's the point of the patch.
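
To make the "spread out the writes" idea concrete, here is a rough arithmetic sketch in Python. It is not the patch's actual algorithm; pacing_delay and its parameters are hypothetical, and it simply assumes the dirty buffers are written one at a time with an even sleep between them so that the whole pass takes about a chosen target duration.

```python
def pacing_delay(dirty_buffers, target_seconds, write_cost_seconds=0.0):
    """Sleep per buffer so that writing all buffers takes about target_seconds.

    Hypothetical helper for illustration only; write_cost_seconds is the
    assumed time one buffer write takes on its own.
    """
    if dirty_buffers <= 0:
        return 0.0
    per_buffer_budget = target_seconds / dirty_buffers
    # If a write alone already uses up the budget, there is no room to sleep.
    return max(0.0, per_buffer_budget - write_cost_seconds)
```

With 1000 dirty buffers and a 100 second target this gives a sleep of about 0.1 seconds per buffer; a longer target lowers the write rate further, which is what should keep the OS below its dirty-page limits.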

Do you have time / resources to do testing? You've clearly spent a lot of time on this, and I'd be very interested to see some actual numbers from your tests with various settings.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

