On Mon, 18 Jun 2007, Simon Riggs wrote:

> Smoother checkpoints mean smaller resource queues when a burst coincides with a checkpoint, so anybody with throughput-maximised or bursty apps should want longer, smooth checkpoints.

True as long as two conditions hold:

1) Buffers needed to fill allocation requests are still being written fast enough. The buffer allocation code starts burning a lot of CPU+lock resources when many clients are all searching the pool looking for a buffer and there aren't many clean ones to be found. The current checkpoint code starts at the LRU point and writes everything dirty, as fast as possible, in the order new buffers will be allocated in, which means it's doing the optimal procedure to keep this from happening. It's being presumed that making the LRU writer active will mitigate this issue; my experience suggests that may not be as effective as hoped, unless it gets changed so that it's allowed to decrement usage_count.

To pick one example of a direction I'm a little concerned about related to this, Itagaki's sorted writes results look very interesting. But his test system is such that the actual pgbench TPS numbers are 1/10 of the ones I was seeing when I started having ugly buffer allocation issues, so I'm pretty sure the particular test he's running isn't sensitive to issues in this area at all. There's just not enough buffer cache churn for this to happen if you're only doing a couple of hundred TPS.

2) The checkpoint still finishes in time.

The thing you can't forget about when dealing with an overloaded system is that there's no such thing as lowering the load of the checkpoint such that it doesn't have a bad impact. Assume new transactions are being generated by an upstream source such that the database itself is the bottleneck, and you're always filling 100% of I/O capacity. All I'm trying to get everyone to consider is that if you have a large pool of dirty buffers to deal with in this situation, it's possible (albeit difficult) to end up in a state where the checkpoint doesn't write out the dirty buffers fast enough, and the client backends end up evicting and writing them instead, in a way that makes the whole process less efficient than the current behavior.
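
To make the two conditions above a bit more concrete, here is a rough Python sketch of a clock-sweep style allocator. It is a toy model under stated assumptions, not the actual PostgreSQL code: the names Buffer, Pool and allocate are made up for illustration, and pinning, locking and the background writer are left out entirely. It only shows how the number of buffers a client has to inspect grows when usage counts are high, and how a client ends up doing the write itself when the victim it finds is still dirty.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    usage_count: int = 0   # decremented each time the sweep passes over it
    dirty: bool = False    # True if the page still has to be written out


class Pool:
    """Hypothetical buffer pool with a single clock hand (toy model)."""

    def __init__(self, buffers):
        self.buffers = buffers
        self.hand = 0  # next buffer the sweep will inspect

    def allocate(self):
        """Find a victim buffer; return (buffers_inspected, backend_wrote)."""
        inspected = 0
        while True:
            buf = self.buffers[self.hand]
            self.hand = (self.hand + 1) % len(self.buffers)
            inspected += 1
            if buf.usage_count > 0:
                # Recently used: give it another lap and keep searching.
                buf.usage_count -= 1
                continue
            # Victim found. If it is dirty, the client pays for the write.
            backend_wrote = buf.dirty
            buf.dirty = False
            buf.usage_count = 1
            return inspected, backend_wrote


# Pool of clean, unused buffers: the first buffer inspected is usable.
print(Pool([Buffer() for _ in range(8)]).allocate())         # (1, False)

# Pool where every buffer is popular and dirty: three full laps just to
# decrement usage counts, then the client writes the victim itself.
print(Pool([Buffer(3, True) for _ in range(8)]).allocate())  # (25, True)
```

In this model the second case is the one to worry about: nothing except the allocating client ever decrements usage_count, so each client does the long search on its own, and whatever the checkpoint or LRU writer has not cleaned in time gets written by the client instead.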

* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
