Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

Gregory Smith Fri, 17 Jan 2014 12:25:50 -0800

On 1/17/14 10:37 AM, Mel Gorman wrote:

There is not an easy way to tell. To be 100%, it would require an instrumentation patch or a systemtap script to detect when a particular page is being written back and track the context. There are approximations though. Monitor nr_dirty pages over time.

I have a benchmarking wrapper for the pgbench testing program called pgbench-tools: https://github.com/gregs1104/pgbench-tools As of October, on Linux it now plots the "Dirty" value from /proc/meminfo over time. You get that on the same time axis as the transaction latency data. The report at the end includes things like the maximum amount of dirty memory observed during the test sampling. That doesn't tell you exactly what's happening to the level someone reworking the kernel logic might want, but you can easily see things like the database's checkpoint cycle reflected by watching the dirty memory total. This works really well for monitoring production servers too. I have a lot of data from a plugin for the Munin monitoring system that plots the same way. Once you have some history about what's normal, it's easy to see when systems fall behind in a way that's ruining writes, and the high water mark often correlates with bad responsiveness periods.
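
To make the sampling idea concrete, here is a minimal sketch of it in shell. This is not how pgbench-tools itself is implemented; the function names, the one second default interval, and the "timestamp value" output format are all assumptions made for illustration.

```shell
#!/bin/sh
# Print the "Dirty" value in kB from meminfo-format text.
# Reads the file named in $1, defaulting to /proc/meminfo.
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }' "${1:-/proc/meminfo}"
}

# Read one number per line on stdin and print the largest one seen.
high_water_mark() {
    awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { if (NR > 0) print max }'
}

# Take COUNT samples of Dirty, INTERVAL seconds apart, each prefixed
# with a Unix timestamp so it can be lined up with latency data later.
sample_dirty() {
    count=$1
    interval=${2:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date +%s)" "$(dirty_kb)"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    done
}
```

Something like `sample_dirty 60 1 | awk '{ print $2 }' | high_water_mark` would then report the peak dirty memory, in kB, seen over roughly a minute.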

Another recent change is that pgbench for the upcoming PostgreSQL 9.4 now allows you to specify a target transaction rate. Seeing the write latency behavior with that in place is far more interesting than anything we were able to watch with pgbench before. The pgbench write tests we've been doing for years mainly told you the throughput rate when all of the caches were always as full as the database could make them, and tuning for that is not very useful. Turns out it's far more interesting to run at 50% of what the storage is capable of, then watch what happens to latency when you adjust things like the dirty_* parameters.
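
A rate-limited run along those lines might be set up as in the sketch below. The -R (--rate) option is the new pgbench feature; the helper names, the default of 50%, and the duration, client count and database name in the usage note are illustrative assumptions. The peak throughput figure is something you would have to measure yourself with an unthrottled run first.

```shell
#!/bin/sh
# Compute a target rate as a percentage of a measured peak TPS.
# $1 = peak transactions/second, $2 = percent (defaults to 50).
# Integer arithmetic only, so fractions are truncated.
target_rate() {
    peak_tps=$1
    percent=${2:-50}
    echo $(( peak_tps * percent / 100 ))
}

# Build (but do not run) a rate-limited pgbench command line.
# $1 = target rate, $2 = run time in seconds, $3 = clients, $4 = database
# -R throttles to the target rate; -l logs per-transaction latency.
pgbench_cmd() {
    printf 'pgbench -R %s -T %s -c %s -l %s\n' "$1" "$2" "$3" "$4"
}
```

For example, `pgbench_cmd "$(target_rate 4000)" 600 16 bench` prints a command that runs at 2000 transactions/second for ten minutes. Repeating that run while changing one dirty_* setting at a time gives latency logs that are comparable with each other.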

I've been working on the problem of how we can make a benchmark test case that acts enough like real busy PostgreSQL servers that we can share it with kernel developers, and then everyone has an objective way to measure changes. These rate limited tests are working much better for that than anything I came up with before.

I am skeptical that the database will take over very much of this work and perform better than the Linux kernel does. My take is that our most useful role would be providing test cases kernel developers can add to a performance regression suite. Ugly "we never thought that would happen" situations seem to be at the root of many of the kernel performance regressions people here get nailed by.

Effective I/O scheduling is very hard, and we are unlikely to ever out-innovate the kernel hacking community by pulling more of that into the database. It's already possible to experiment with moving in that direction with tuning changes. Use a larger database shared_buffers value, tweak checkpoints to spread I/O out, and reduce things like dirty_ratio. I do some of that, but I've learned it's dangerous to wander too far that way.
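
Before experimenting with any of those kernel settings, it helps to record what the system is currently using. The sketch below just lists the vm.dirty_* tunables; the function name and output format are assumptions, and it deliberately changes nothing.

```shell
#!/bin/sh
# Print "name value" for each dirty_* tunable found in a directory.
# $1 defaults to /proc/sys/vm, where Linux exposes dirty_ratio,
# dirty_background_ratio and related settings.
show_dirty_settings() {
    dir=${1:-/proc/sys/vm}
    for f in "$dir"/dirty_*; do
        # Skip unreadable entries, and the unexpanded glob if nothing matched.
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Saving the output of `show_dirty_settings` alongside each benchmark run makes it possible to tell later which kernel settings a given latency graph was produced with.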

If instead you let Linux do even more work--give it a lot of memory to manage and room to re-order I/O--that can work out quite well. For example, I've seen a lot of people try to keep latency down by using the deadline scheduler and very low settings for the expire times. The theory is great, but it never works out in the real world for me. Here's the sort of deadline settings I deploy instead now:


    echo 500      > ${DEV}/queue/iosched/read_expire
    echo 300000   > ${DEV}/queue/iosched/write_expire
    echo 1048576  > ${DEV}/queue/iosched/writes_starved
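
The snippet above leaves ${DEV} undefined. One way to wrap it is sketched below, assuming ${DEV} is a block device's sysfs directory such as /sys/block/sda (the device name is only an example). The function name and the check that the deadline scheduler is active are additions for illustration, not part of the deployed script.

```shell
#!/bin/sh
# Apply the deadline tunables above to one block device.
# $1 = the device's sysfs directory, e.g. /sys/block/sda (assumed layout).
apply_deadline_tuning() {
    DEV=$1
    # The active scheduler is shown in brackets, e.g. "noop [deadline] cfq".
    # Refuse to write the tunables if deadline is not the active one.
    case "$(cat "${DEV}/queue/scheduler")" in
        *"[deadline]"*) ;;
        *) echo "deadline scheduler not active on ${DEV}" >&2
           return 1 ;;
    esac
    echo 500      > "${DEV}/queue/iosched/read_expire"
    echo 300000   > "${DEV}/queue/iosched/write_expire"
    echo 1048576  > "${DEV}/queue/iosched/writes_starved"
}
```

Run as root, `apply_deadline_tuning /sys/block/sda` would apply the three values to that device, or print an error and return non-zero if a different scheduler is selected.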

These numbers look insane compared to the defaults, but I assure you they're from a server that's happily chugging through 5 to 10K transactions/second around the clock. PostgreSQL forces writes out with fsync when they must go out, but this sort of tuning is basically giving up on it managing writes beyond that. We really have no idea what order they should go out in. I just let the kernel have a large pile of work queued up, and trust things like the kernel's block elevator and congestion code are smarter than the database can possibly be.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
