Re: [HACKERS] Redesigning checkpoint_segments

Greg Smith Thu, 06 Jun 2013 19:44:40 -0700

On 6/6/13 4:41 AM, Heikki Linnakangas wrote:

> I was thinking of letting the estimate
> decrease like a moving average, but react to any increases immediately.
> Same thing we do in bgwriter to track buffer allocations:

Combine what your submitted patch does and this idea, and you'll have something I prototyped a few years ago. I took the logic and tested it out in user space by parsing the output from log_checkpoints to see how many segments were being used. That approach coughed out a value about as good for checkpoint_segments as I picked by hand.
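For anyone who wants to play with the idea in user space, here is a minimal sketch of that kind of estimator: jump up immediately when a checkpoint cycle uses more segments than the current estimate, and otherwise decay toward the lower value as a moving average. The function names and the smoothing factor of 16 are assumptions for illustration; this is not the code from the patch or from my prototype.

```python
def update_estimate(estimate, observed, smoothing_samples=16):
    """Return the new estimate of WAL segments used per checkpoint cycle.

    smoothing_samples is a hypothetical tuning knob chosen for this sketch.
    """
    if observed >= estimate:
        # React to any increase immediately.
        return float(observed)
    # Decrease slowly, as an exponential moving average.
    return estimate + (observed - estimate) / smoothing_samples


def estimate_segments(samples, smoothing_samples=16):
    """Fold a series of per-checkpoint segment counts into one estimate."""
    estimate = 0.0
    for observed in samples:
        estimate = update_estimate(estimate, observed, smoothing_samples)
    return estimate
```

Feeding it the segment counts parsed from log_checkpoints output, one per checkpoint, gives a running estimate. A single burst keeps the estimate high for many cycles afterward: estimate_segments([10, 100, 10]) is still above 94.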

The main problem was it liked to over-tune the segments based on small bursts of activity, leaving a value higher than you might want to use the rest of the time. The background writer didn't worry about this very much because the cost of making a mistake for one 200ms cycle was pretty low. Setting checkpoint_segments high is a more expensive issue. When I set these by hand, I'll aim more to cover a 99th percentile of the maximum segments number rather than every worst case seen.
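A rough sketch of the percentile approach, assuming you already have a list of per-checkpoint segment counts (for example, collected from log_checkpoints output). The function name and the nearest-rank method are choices made for this illustration, not a description of any existing tool.

```python
import math


def percentile_segments(samples, pct=99.0):
    """Nearest-rank percentile of per-checkpoint segment counts."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least pct percent of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]
```

With 99 checkpoints that used 10 segments and one that used 500, the 99th percentile is still 10, while sizing for the worst case would give 500.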

I don't think that improvement is worth spending very much effort on though. The moving average approach is more than good enough in most cases. I've wanted checkpoint_segments to go away in exactly this fashion for a while.

The general complaint the last time I suggested a change in this area, to make checkpoint_segments larger for the average user, was that some people had seen workloads where that was counterproductive. Pretty sure Kevin Grittner said he'd seen that happen. That's how I remember this general idea dying the last time, and I still don't have enough data to refute it.

As far as the UI, if it's a soft limit I'd suggest wal_size_target for the name. What I would like to see is a single number here in memory units that replaces both checkpoint_segments and wal_keep_segments. If you're willing to use a large chunk of disk space to handle either one of activity spikes or the class of replication issues wal_keep_segments targets, I don't see why you'd want to ban using that space for the other one too.

To put some perspective on how far we've been able to push this in the field with minimal gripes, the repmgr tool requires wal_keep_segments be >= 5000, which works out to 78GB. I still see some people use 73GB SAS drives in production servers for their WAL files, but that's the only time I've seen that number become scary when deploying repmgr. Meanwhile, the highest value for checkpoint_segments I've set based on real activity levels was 1024, on a server where checkpoint_timeout is 15 minutes (and can be no shorter without checkpoint spikes). At no point during that fairly difficult bit of tuning work did checkpoint_segments do anything but get in the way.
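The arithmetic behind those sizes, as a small sketch. It assumes the default 16MB WAL segment size, and the helper name is made up for this example.

```python
# Default WAL segment size; assumed here, it can differ on custom builds.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def segments_to_gib(segments, segment_bytes=WAL_SEGMENT_BYTES):
    """Convert a count of WAL segments into GiB of disk space."""
    return segments * segment_bytes / 1024 ** 3


print(segments_to_gib(5000))  # 78.125, the wal_keep_segments figure above
print(segments_to_gib(1024))  # 16.0
```

This is only the raw size of that many segment files; the space pg_xlog actually uses at a given checkpoint_segments setting can be larger.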


--
Greg Smith   2ndQuadrant US    [email protected]   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

