Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

Gregory Smith Thu, 23 Jan 2014 13:12:21 -0800

On 1/20/14 9:46 AM, Mel Gorman wrote:

They could potentially be used to evaluate any IO scheduler changes. For example -- deadline scheduler with these parameters has X transactions/sec throughput with average latency of Y milliseconds and a maximum fsync latency of Z seconds. Evaluate how well the out-of-box behaviour compares against it with and without some set of patches. At the very least it would be useful for tracking historical kernel performance over time and bisecting any regressions that got introduced. Once we have a test I think many kernel developers (me at least) can run automated bisections once a test case exists.

That's the long term goal. What we used to get out of pgbench were things like >60 second latencies when a checkpoint hit with GBs of dirty memory. That does happen in the real world, but that's not a realistic case you can tune for very well. In fact, tuning for it can easily degrade performance on more realistic workloads.

The main complexity I don't have a clear view of yet is how much unavoidable storage level latency there is in all of the common deployment types. For example, I can take a server with a 256MB battery-backed write cache and set dirty_background_bytes to be smaller than that. So checkpoint spikes go away, right? No. Eventually you will see dirty_background_bytes of data going into an already full 256MB cache. And when that happens, the latency will be based on how long it takes to write the cached 256MB out to the disks. If you have a single disk or RAID-1 pair, that random I/O could easily happen at 5MB/s or less, and that makes for a 51 second cache clearing time. This is a lot better now than it used to be because fsync hasn't flushed the whole cache in many years now. (Only RHEL5 systems still in the field suffer much from that era of code.) But you do need to look at the distribution of latency a bit because of how the cache impacts things; you can't just consider min/max values.
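To make the arithmetic behind that 51 second figure explicit, here is a minimal sketch. It assumes the drain time is simply cache size divided by the sustained random write rate; the function name and parameters are illustrative, not anything from pgbench or the kernel.

```python
def cache_drain_seconds(cache_mb: float, write_rate_mb_s: float) -> float:
    """Estimate how long a full write cache takes to drain to disk.

    Simplifying assumption: the whole cache is dirty and is written out
    at a constant random-I/O rate.
    """
    if write_rate_mb_s <= 0:
        raise ValueError("write rate must be positive")
    return cache_mb / write_rate_mb_s


# 256MB battery-backed write cache, 5MB/s of random writes
print(f"{cache_drain_seconds(256, 5):.1f} seconds")
```

A slower random write rate pushes the number up in direct proportion, which is why the "or less" above matters so much for worst-case latency.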

Take the BBWC out of the equation, and you'll see latency proportional to how long it takes to clear the disk's cache out. It's fun "upgrading" from a disk with 32MB of cache to 64MB only to watch worst case latency double. At least the kernel does the right thing now, using that cache when it can while forcing data out when fsync calls arrive. (That's another important kernel optimization we'll never be able to teach the database.)


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers