Re: [HACKERS] Linux filesystem performance and checkpoint sorting

2011-02-04 Thread Greg Smith

Josh Berkus wrote:

> So: Linux flavor?  Kernel version?  Disk system and PG directory layout?


OS configuration and PostgreSQL settings are saved into the output from 
the later runs (I added that somewhere in the middle):


http://www.2ndquadrant.us/pgbench-results/294/pg_settings.txt

That's Ubuntu 10.04, kernel 2.6.32. 

There is a test rig bug that queries the wrong PostgreSQL settings in 
the later ones, but they didn't change after #294 here.  The kernel 
configuration stuff is accurate, though, which confirms exactly which 
settings for the dirty_* parameters were in effect for each run while 
I was changing those around.
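
For anyone who wants to capture the same kernel settings alongside their own runs, here is a minimal sketch, not the code the test rig actually uses.  It assumes the tunables are exposed as one-integer files under /proc/sys/vm; the function names and the `base` parameter are made up for illustration.  It relies on the kernel behavior that writing dirty_bytes makes dirty_ratio read back as 0 and vice versa, so a nonzero dirty_bytes means the byte limit is the one in force.

```python
from pathlib import Path

# Writeback tunables discussed in this thread, as named under /proc/sys/vm.
DIRTY_KEYS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_bytes",
    "dirty_background_bytes",
)


def read_dirty_settings(base="/proc/sys/vm"):
    """Return {name: int} for each dirty_* tunable found under base."""
    settings = {}
    for key in DIRTY_KEYS:
        path = Path(base) / key
        if path.exists():  # the *_bytes files are absent on older kernels
            settings[key] = int(path.read_text().strip())
    return settings


def effective_mode(settings):
    """Report whether the foreground limit is set in bytes or as a ratio."""
    return "bytes" if settings.get("dirty_bytes", 0) > 0 else "ratio"


if __name__ == "__main__":
    current = read_dirty_settings()
    for name, value in sorted(current.items()):
        print(f"{name} = {value}")
    print("foreground limit expressed in:", effective_mode(current))
```

Saving that output next to each test result makes it possible to tell later which limit applied to which run.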


16GB of RAM, 8 Hyperthreaded cores (4 real ones) via Intel i7-870.  
Areca ARC-1210 controller, 256MB of cache.


Filesystem  Size  Used  Available  Use%  Mounted on
/dev/sda1    40G  7.5G        30G   20%  /
/dev/md1    838G   15G       824G    2%  /stripe
/dev/sdd1   149G  2.1G       147G    2%  /xlog

/stripe is a 3 disk RAID0, set up to use only the first section of each 
drive ("short-stroked").  That makes its performance a little more like 
a small SAS disk, rather than the cheapo 7200RPM SATA drives they 
actually are (Western Digital 640GB WD6400AAKS-65A7B).  /xlog is a 
single disk, 160GB WD1600AAJS-00WAA.  OS, server logs, and test results 
information all go to the root filesystem on a different drive.  My aim 
was to get similar performance to what someone with an 8-disk RAID10 
array might see, except without the redundancy.  Basic entry-level 
database server here in 2011.


bonnie++ on the main database disk:  read 301MB/s write 215MB/s, seeks 
423.4/second.  Measured around 10K small commits/second to prove the 
battery-backed write cache works fine.
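
The reasoning behind that commit-rate check can be sketched with a little arithmetic.  This assumes a single client committing serially to a 7200 RPM drive with no write cache, where each synchronous commit has to wait for the platter to come around again; group commit across many clients can beat this bound, so treat it as a rough yardstick only.  The function name is made up for illustration.

```python
def max_synced_commits_per_second(rpm):
    """Rough upper bound when each commit waits for one platter rotation."""
    return rpm / 60.0


drive_limit = max_synced_commits_per_second(7200)  # 120.0 rotations per second
measured = 10_000                                  # commit rate quoted above

print(f"rotation-bound limit: {drive_limit:.0f} commits/s")
print(f"measured rate is about {measured / drive_limit:.0f}x that limit")
```

A measured rate far above the rotation-bound figure is what suggests commits are being absorbed by the controller cache rather than waiting on the disk.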


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Linux filesystem performance and checkpoint sorting

2011-02-04 Thread Greg Smith

Mark Kirkwood wrote:
> Are you going to do some runs with ext4? I'd be very interested to see 
> how it compares (assuming that you are on a kernel version 2.6.32 or 
> later so ext4 is reasonably stable...).


Yes, before I touch this system significantly I'll do ext4 as well, and 
this is running the Ubuntu 10.04 2.6.32 kernel so ext4 should be stable 
enough.  I have some PostgreSQL work that needs to get finished first 
though.






Re: [HACKERS] Linux filesystem performance and checkpoint sorting

2011-02-04 Thread Stephen J. Butler
On Fri, Feb 4, 2011 at 12:31 PM, Greg Smith wrote:
> -Switching from ext3 to xfs gave over a 3X speedup on the smaller test set:
>  from the 600-700 TPS range to around 2200 TPS.  TPS rate on the larger data
> set actually slowed down a touch on XFS, around 10%.  Still, such a huge win
> when it's better makes it easy to excuse the occasional cases where it's a
> bit slower.

Did you see that they improved XFS scalability in 2.6.37?

http://kernelnewbies.org/Linux_2_6_37#head-dfa29df2b21f5a72fb17f041a7356deeea3d159e

Looks like there are more XFS improvements in store for 2.6.38.



Re: [HACKERS] Linux filesystem performance and checkpoint sorting

2011-02-04 Thread Mark Kirkwood

On 05/02/11 07:31, Greg Smith wrote:
> Switching to a new thread for this summary since there's some much 
> more generic info here...at this point I've finished exploring the 
> major Linux filesystem and tuning options I wanted to, as part of 
> examining changes to the checkpoint code.  You can find all the raw 
> data at http://www.2ndquadrant.us/pgbench-results/index.htm


Awesome! Very useful results.

Are you going to do some runs with ext4? I'd be very interested to see 
how it compares (assuming that you are on a kernel version 2.6.32 or 
later so ext4 is reasonably stable...).


Cheers

Mark




Re: [HACKERS] Linux filesystem performance and checkpoint sorting

2011-02-04 Thread Josh Berkus
Greg,

Thanks for doing these tests!

So: Linux flavor?  Kernel version?  Disk system and PG directory layout?


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



[HACKERS] Linux filesystem performance and checkpoint sorting

2011-02-04 Thread Greg Smith
Switching to a new thread for this summary since there's some much more 
generic info here.  At this point I've finished exploring the major 
Linux filesystem and tuning options I wanted to, as part of examining 
changes to the checkpoint code.  You can find all the raw data at 
http://www.2ndquadrant.us/pgbench-results/index.htm

Here are some highlights of what's been demonstrated there recently, 
with a summary of some of the more subtle and interesting data in the 
attached CSV file too:


-On ext3, tuning the newish kernel tunables dirty_bytes and 
dirty_background_bytes down to a lower level than was possible using the 
older dirty_*ratio ones shows a significant reduction in maximum 
latency; it drops to about 1/4 of the worst-case behavior.  
Unfortunately transactions per second takes a 10-15% hit in the 
process.  Not shown in the data there is that the VACUUM cleanup time 
between tests was really slowed down, too, running at around half the 
speed of when the system has a full-size write cache.


-Switching from ext3 to xfs gave over a 3X speedup on the smaller test 
set:  from the 600-700 TPS range to around 2200 TPS.  TPS rate on the 
larger data set actually slowed down a touch on XFS, around 10%.  Still, 
such a huge win when it's better makes it easy to excuse the occasional 
cases where it's a bit slower.  And the latency situation is just wildly 
better, which is the main thing that drove me toward using XFS more in 
the first place:  worst-case latency is anywhere from 1/6 to 1/25 of 
what was seen on ext3.  With abusively high client counts for this 
hardware, you can 
still see >10 second pauses, but you don't see >40 second ones at 
moderate client counts like ext3 experiences.


-Switching to the lowest possible dirty_*bytes parameters on XFS was 
negative in every way.  TPS was cut in half, and maximum latency 
actually went up.  Between this and the nasty VACUUM slowdown, I don't 
really see that much potential for these new tunables.  They do lower 
latency on ext3 a lot, but even there the penalty you pay for that is 
quite high.  VACUUM in particular seems to really, really benefit from 
having a giant write cache to dump its work into--possibly due to the 
way the ring buffer implementation avoids using the database's own cache 
for that work.


-Since earlier tests suggested sorting checkpoints gave little change on 
ext3, I started testing that with XFS instead.  The result is a bit 
messy.  At the lower scale, TPS went up a bit, but so did maximum 
latency.  At the higher scale, TPS dropped in some cases (typically less 
than 1%), but most latency results were better too.


At this point I would say checkpoint sorting remains a wash:  you can 
find workloads it benefits a little, and others it penalizes a little.  
I would say that it's neutral enough on average that if it makes sense 
to include for other purposes, that's unlikely to be a really bad change 
for anyone.  But I wouldn't want to see it committed by itself; there 
needs to be some additional benefit from the sorting before it's really 
worthwhile.
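
For readers who haven't followed the earlier thread, the general idea behind sorting checkpoint writes can be sketched as below.  This is only an illustration of the concept, not the patch that was tested: `DirtyBuffer`, its fields, and both functions are hypothetical names, and the real buffer pool is far more involved.  The assumption is that ordering dirty buffers by file and then block number turns scattered writes into runs against one file at a time.

```python
from typing import NamedTuple


class DirtyBuffer(NamedTuple):
    relfile: int  # hypothetical identifier for the relation's file
    block: int    # block number within that file


def checkpoint_write_order(dirty, sort_writes):
    """Return the order in which a checkpoint would issue its writes."""
    if sort_writes:
        return sorted(dirty, key=lambda b: (b.relfile, b.block))
    return list(dirty)  # unsorted: whatever order the buffer scan yields


def file_switches(order):
    """Count how often consecutive writes target a different file."""
    return sum(1 for a, b in zip(order, order[1:]) if a.relfile != b.relfile)


dirty = [DirtyBuffer(2, 7), DirtyBuffer(1, 3), DirtyBuffer(2, 1),
         DirtyBuffer(1, 9), DirtyBuffer(2, 4), DirtyBuffer(1, 1)]

unsorted_order = checkpoint_write_order(dirty, sort_writes=False)
sorted_order = checkpoint_write_order(dirty, sort_writes=True)
print(file_switches(unsorted_order), file_switches(sorted_order))  # prints "5 1"
```

Whether fewer file switches translates into better throughput depends on how much reordering the filesystem and controller cache already do, which is consistent with the mixed results above.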



"Compact fsync",,"ext3",,,"XFS + Regular Writes",,"Sorted Writes"
"scale","clients","tps","max_latency","XFS Speedup","tps","max_latency","tps","max_latency","TPS Delta","%","Latency Delta"
500,16,631,17116.31,3.49,2201,1290.73,2210,2070.74,9,0.41%,780.01
500,32,655,24311.54,3.37,2205,1379.14,2357,1971.2,152,6.89%,592.06
500,64,727,38040.39,3.11,2263,1440.48,2332,1763.29,69,3.05%,322.81
500,128,687,48195.77,3.2,2201,1743.11,2221,2742.18,20,0.91%,999.07
500,256,747,46799.48,2.92,2184,2429.74,2171,2356.14,-13,-0.60%,-73.6
1000,16,321,40826.58,1.21,389,1586.17,386,1598.54,-3,-0.77%,12.37
1000,32,345,27910.51,0.91,314,2150.94,331,2078.02,17,5.41%,-72.91
1000,64,358,45138.1,0.94,336,6681.57,320,6469.71,-16,-4.76%,-211.87
1000,128,372,47125.46,0.88,328,8707.42,330,9037.63,2,0.61%,330.21
1000,256,350,83232.14,0.91,317,11973.35,315,11248.18,-2,-0.63%,-725.17
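
The derived columns in the CSV above can be rechecked with a short script.  This is a sketch that embeds the ten data rows as copied from the attachment (the two header rows are left out) and assumes the column order shown in the second header row; `summarize` and the dictionary keys are names chosen here for illustration.

```python
import csv
import io

# Columns: scale, clients, ext3 tps, ext3 max_latency, XFS speedup,
#          XFS tps, XFS max_latency, sorted tps, sorted max_latency,
#          TPS delta, % delta, latency delta
ROWS = """\
500,16,631,17116.31,3.49,2201,1290.73,2210,2070.74,9,0.41%,780.01
500,32,655,24311.54,3.37,2205,1379.14,2357,1971.2,152,6.89%,592.06
500,64,727,38040.39,3.11,2263,1440.48,2332,1763.29,69,3.05%,322.81
500,128,687,48195.77,3.2,2201,1743.11,2221,2742.18,20,0.91%,999.07
500,256,747,46799.48,2.92,2184,2429.74,2171,2356.14,-13,-0.60%,-73.6
1000,16,321,40826.58,1.21,389,1586.17,386,1598.54,-3,-0.77%,12.37
1000,32,345,27910.51,0.91,314,2150.94,331,2078.02,17,5.41%,-72.91
1000,64,358,45138.1,0.94,336,6681.57,320,6469.71,-16,-4.76%,-211.87
1000,128,372,47125.46,0.88,328,8707.42,330,9037.63,2,0.61%,330.21
1000,256,350,83232.14,0.91,317,11973.35,315,11248.18,-2,-0.63%,-725.17
"""


def summarize(text):
    """Recompute the XFS speedup, latency ratio, and sorted-write TPS change."""
    out = []
    for r in csv.reader(io.StringIO(text)):
        ext3_tps, ext3_lat = float(r[2]), float(r[3])
        xfs_tps, xfs_lat = float(r[5]), float(r[6])
        sorted_tps = float(r[7])
        out.append({
            "scale": int(r[0]),
            "clients": int(r[1]),
            "xfs_speedup": round(xfs_tps / ext3_tps, 2),
            "latency_ratio": round(ext3_lat / xfs_lat, 1),
            "sorted_tps_pct": round(100 * (sorted_tps - xfs_tps) / xfs_tps, 2),
        })
    return out


for row in summarize(ROWS):
    print(row)
```

Here `latency_ratio` is the ext3 worst case divided by the XFS worst case, and `sorted_tps_pct` is the TPS change from sorted writes relative to regular writes on XFS.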
