On 1/20/14 9:46 AM, Mel Gorman wrote:
They could potentially be used to evaluate any IO scheduler changes. For example -- deadline scheduler with these parameters has X transactions/sec throughput with average latency of Y milliseconds and a maximum fsync latency of Z seconds. Evaluate how well the out-of-box behaviour compares against it with and without some set of patches. At the very least it would be useful for tracking historical kernel performance over time and bisecting any regressions that got introduced. Once a test case exists, I think many kernel developers (me at least) can run automated bisections.
That's the long term goal. What we used to get out of pgbench were things like >60 second latencies when a checkpoint hit with GBs of dirty memory. That does happen in the real world, but that's not a realistic case you can tune for very well. In fact, tuning for it can easily degrade performance on more realistic workloads.
The main complexity I don't have a clear view of yet is how much unavoidable storage level latency there is in all of the common deployment types. For example, I can take a server with a 256MB battery-backed write cache and set dirty_background_bytes to be smaller than that. So checkpoint spikes go away, right? No. Eventually you will see dirty_background_bytes of data going into an already full 256MB cache. And when that happens, the latency will be based on how long it takes to write the cached 256MB out to the disks. If you have a single disk or RAID-1 pair, that random I/O could easily happen at 5MB/s or less, and that makes for a 51 second cache clearing time. This is a lot better now than it used to be because fsync hasn't flushed the whole cache in many years now. (Only RHEL5 systems still in the field suffer much from that era of code.) But you do need to look at the distribution of latency a bit because of how the cache impacts things; you can't just consider min/max values.
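To illustrate why min/max alone can mislead, here is a minimal sketch that summarizes latency samples with percentiles. The sample numbers are made up for illustration, and the helper names (`percentile`, `summarize`) are hypothetical, not part of pgbench or any existing tool.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile (pct is an integer 1..100) of latency samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(latencies_ms):
    """Return min, median, 99th percentile and max of the samples."""
    return {
        "min": min(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }

# Hypothetical data: both runs share the same min and max latency,
# but one stalls on 3% of transactions and the other on 1%.
frequent_stalls = [2.0] * 97 + [40000.0] * 3
single_stall = [2.0] * 99 + [40000.0]

for name, run in (("frequent", frequent_stalls), ("single", single_stall)):
    print(name, summarize(run))
```

Both runs report identical min and max values, so only the 99th percentile separates them.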
Take the BBWC out of the equation, and you'll see latency proportional to how long it takes to clear the disk's cache out. It's fun "upgrading" from a disk with 32MB of cache to 64MB only to watch worst case latency double. At least the kernel does the right thing now, using that cache when it can while forcing data out when fsync calls arrive. (That's another important kernel optimization we'll never be able to teach the database.)
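The arithmetic behind those numbers is just cache size divided by sustained write rate. A rough sketch follows; `drain_seconds` is a hypothetical helper, and reusing the 5MB/s random write rate for the 32MB and 64MB disk caches is an assumption made only to show the proportionality.

```python
def drain_seconds(cache_mb, write_mb_per_s):
    """Worst-case time to flush a full write cache at a sustained write rate."""
    if write_mb_per_s <= 0:
        raise ValueError("write rate must be positive")
    return cache_mb / write_mb_per_s

# 256MB battery-backed write cache, random I/O at 5MB/s (figures from above).
bbwc = drain_seconds(256, 5)

# Disk caches of 32MB and 64MB; the 5MB/s rate here is an assumed value.
small = drain_seconds(32, 5)
large = drain_seconds(64, 5)

print(f"BBWC drain time: {bbwc:.1f}s")
print(f"32MB disk cache: {small:.1f}s, 64MB disk cache: {large:.1f}s")
print(f"worst case ratio: {large / small:.1f}x")
```

At a fixed write rate the drain time scales linearly with cache size, which is why doubling the cache doubles the worst case.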
--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)