Re: [HACKERS] Load Distributed Checkpoints test results

Greg Smith Mon, 18 Jun 2007 06:42:04 -0700

On Mon, 18 Jun 2007, Simon Riggs wrote:

Smoother checkpoints mean smaller resource queues when a burst coincides with a checkpoint, so anybody with throughput-maximised or bursty apps should want longer, smooth checkpoints.


True as long as two conditions hold:

1) Buffers needed to fill allocation requests are still being written fast enough. The buffer allocation code starts burning a lot of CPU+lock resources when many clients are all searching the pool looking for a buffer and there aren't many clean ones to be found. The way the current checkpoint code starts at the LRU point and writes everything dirty, as fast as possible and in the order new buffers will be allocated, means it's doing the optimal procedure to keep this from happening. It's being presumed that making the LRU writer active will mitigate this issue; my experience suggests that may not be as effective as hoped--unless it gets changed so that it's allowed to decrement usage_count.

To pick one example of a direction I'm a little concerned about related to this, Itagaki's sorted writes results look very interesting. But as his test system is such that the actual pgbench TPS numbers are 1/10 of the ones I was seeing when I started having ugly buffer allocation issues, I'm real sure the particular test he's running isn't sensitive to issues in this area at all; there's just not enough buffer cache churn if you're only doing a couple of hundred TPS for this to happen.
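
To make the allocation concern in point 1 concrete, here is a minimal Python sketch of a clock-sweep victim search. It is a simplified model under stated assumptions, not PostgreSQL's actual buffer manager code: the names Buffer and clock_sweep are hypothetical, pinning and locking are ignored, and a backend that lands on a dirty victim is assumed to have to write it out itself. The point is only that the search examines more buffers, and does more of its own writes, when few buffers are both clean and at usage_count 0.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    usage_count: int = 0   # bumped on access, decremented by the sweep
    dirty: bool = False    # True if the page must be written before reuse


def clock_sweep(pool, hand):
    """Find a victim buffer starting at position `hand`.

    Returns (victim_index, new_hand, buffers_examined, backend_had_to_write).
    Simplified model: a buffer is reusable once its usage_count reaches 0;
    if that buffer is dirty, the allocating backend pays for the write.
    """
    examined = 0
    n = len(pool)
    while True:
        idx = hand
        buf = pool[idx]
        hand = (hand + 1) % n
        examined += 1
        if buf.usage_count == 0:
            backend_had_to_write = buf.dirty
            buf.dirty = False  # the backend flushed it (assumption of this model)
            return idx, hand, examined, backend_had_to_write
        # Not reusable yet: age it and keep sweeping.
        buf.usage_count -= 1


# Hypothetical four-buffer pool; only the third buffer is at usage_count 0,
# and it is dirty, so the backend sweeps past two buffers and then writes.
pool = [Buffer(2), Buffer(1), Buffer(0, dirty=True), Buffer(3)]
victim, hand, examined, wrote = clock_sweep(pool, 0)
print(victim, hand, examined, wrote)
```

In this model, a background writer that cleans buffers ahead of the hand without being allowed to decrement usage_count would remove the write but not the sweep: backends would still have to step over every buffer whose count is above zero.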


2) The checkpoint still finishes in time.

The thing you can't forget about when dealing with an overloaded system is that there's no such thing as lowering the load of the checkpoint such that it doesn't have a bad impact. Assume new transactions are being generated by an upstream source such that the database itself is the bottleneck, and you're always filling 100% of I/O capacity. All I'm trying to get everyone to consider is that if you have a large pool of dirty buffers to deal with in this situation, it's possible (albeit difficult) to get into a situation where if the checkpoint doesn't write out the dirty buffers fast enough, the client backends will evacuate them instead in a way that makes the whole process less efficient than the current behavior.
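
The "finishes in time" condition in point 2 is back-of-the-envelope arithmetic, sketched below in Python. Every number and the function name checkpoint_shortfall are hypothetical, chosen only for illustration; none of them come from the tests discussed in this thread. The function returns how many dirty buffers a rate-limited checkpoint would leave unwritten at its deadline, which in the scenario above are the buffers the client backends would end up writing themselves.

```python
def checkpoint_shortfall(dirty_buffers, write_rate, deadline_s):
    """Dirty buffers left unwritten when the checkpoint deadline arrives.

    dirty_buffers: buffers dirty at checkpoint start (hypothetical input)
    write_rate:    buffers/second the smoothed checkpoint is allowed to write
    deadline_s:    seconds available before the checkpoint must be done
    """
    written = min(dirty_buffers, int(write_rate * deadline_s))
    return dirty_buffers - written


# Hypothetical workload: 100,000 dirty buffers and a 270 second budget.
print(checkpoint_shortfall(100_000, 500, 270))  # fast enough: nothing left over
print(checkpoint_shortfall(100_000, 200, 270))  # too slow: backlog remains
```

With these made-up figures the slower rate leaves 46,000 buffers unwritten at the deadline, which is the kind of backlog that shifts write work onto the backends.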


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
