>> That would be great.  What I've been using as a test case is pgbench
>> -S -c $NUM_CPU_CORES -j $NUM_CPU_CORES with scale factor 100 and
>> shared_buffers=8GB.
>> 
>> I think what you'd want to compare is the performance of unpatched
>> master, vs. the performance with this line added to s_lock.h for your
>> architecture:
>> 
>> #define TAS_SPIN(lock)  (*(lock) ? 1 : TAS(lock))
>> 
>> We've now added that line for ia64 (the line is present in two
>> different places in the file, one for GCC and the other for HP's
>> compiler).  So the question is whether we need it for any other
>> architectures.
> 
> Ok. Let me talk to IBM guys...

With help from IBM Japan Ltd., we ran some tests on a larger IBM
machine than the one Tom Lane used for his test
(http://archives.postgresql.org/message-id/8292.1314641...@sss.pgh.pa.us).
His machine was an IBM 8406-71Y, which has 8 physical cores with 4-way
SMT (32 hardware threads). Ours is an IBM Power 750 Express, which has
32 physical cores with 4-way SMT (128 hardware threads) and 256GB of RAM.

The test method was the same as the one in the message above. The
differences are the OS (RHEL 6.1), the gcc version (4.4.5), and the
shared buffer size (8GB).

We tested 3 methods to reduce spinlock contention:

1) Add the "hint" parameter to the lwarx instruction, which is
   available on POWER6 or later architectures.

2) Add a non-locked test in TAS()

3) #1 + #2
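
To illustrate the idea behind method #2, here is a minimal portable
sketch using C11 atomics. It is not the actual PowerPC patch (which
changes the assembly in s_lock.h); the demo_* names and the lock type
are hypothetical and exist only for this example. The point is that a
plain load is tried first, and the atomic operation is attempted only
when the lock looks free.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration only; the real slock_t
 * is defined per platform in s_lock.h. */
typedef atomic_int demo_slock_t;

/* Plain test-and-set: atomically store 1 and return the old value.
 * A nonzero result means the lock was already held. */
static inline int demo_tas(demo_slock_t *lock)
{
    return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}

/* Test-and-test-and-set: read the lock without an atomic
 * read-modify-write first, and only try the atomic exchange
 * when the lock appears to be free. */
static inline int demo_tas_spin(demo_slock_t *lock)
{
    return atomic_load_explicit(lock, memory_order_relaxed)
        ? 1
        : demo_tas(lock);
}

/* Busy-wait until the lock is acquired. A real implementation
 * would add a backoff or sleep instead of spinning forever. */
static inline void demo_lock(demo_slock_t *lock)
{
    while (demo_tas_spin(lock))
        ;
}

/* Release the lock. */
static inline void demo_unlock(demo_slock_t *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

While the lock is held, waiters in this sketch spin on an ordinary
load rather than repeating the atomic exchange, which is the effect
the non-locked test is meant to have.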

We saw a small performance improvement with #1, a larger one with #2,
and an even better one with #1+#2.

Stock git head:

pgbench -c 1 -j 1 -S -T 300 bench       tps = 10356.306513 (including ...
pgbench -c 2 -j 1 -S -T 300 bench       tps = 21841.10285 (including ...
pgbench -c 8 -j 4 -S -T 300 bench       tps = 63800.868529 (including ...
pgbench -c 16 -j 8 -S -T 300 bench      tps = 144872.64726 (including ...
pgbench -c 32 -j 16 -S -T 300 bench     tps = 120943.238461 (including ...
pgbench -c 64 -j 32 -S -T 300 bench     tps = 108144.933981 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 92202.782791 (including ...

With hint (method #1):

pgbench -c 1 -j 1 -S -T 300 bench       tps = 11198.1872 (including ...
pgbench -c 2 -j 1 -S -T 300 bench       tps = 21390.592014 (including ...
pgbench -c 8 -j 4 -S -T 300 bench       tps = 74423.488089 (including ...
pgbench -c 16 -j 8 -S -T 300 bench      tps = 153766.351105 (including ...
pgbench -c 32 -j 16 -S -T 300 bench     tps = 134313.758113 (including ...
pgbench -c 64 -j 32 -S -T 300 bench     tps = 129392.154047 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 105506.948058 (including ...

Non-locked test in TAS() (method #2):

pgbench -c 1 -j 1 -S -T 300 bench       tps = 10537.893154 (including ...
pgbench -c 2 -j 1 -S -T 300 bench       tps = 22019.388666 (including ...
pgbench -c 8 -j 4 -S -T 300 bench       tps = 78763.930379 (including ...
pgbench -c 16 -j 8 -S -T 300 bench      tps = 142791.99724 (including ...
pgbench -c 32 -j 16 -S -T 300 bench     tps = 222008.903675 (including ...
pgbench -c 64 -j 32 -S -T 300 bench     tps = 209912.691058 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 199437.23965 (including ...

With hint and non-locked test in TAS() (#1 + #2):

pgbench -c 1 -j 1 -S -T 300 bench       tps = 11419.881375 (including ...
pgbench -c 2 -j 1 -S -T 300 bench       tps = 21919.530209 (including ...
pgbench -c 8 -j 4 -S -T 300 bench       tps = 74788.242876 (including ...
pgbench -c 16 -j 8 -S -T 300 bench      tps = 156354.988564 (including ...
pgbench -c 32 -j 16 -S -T 300 bench     tps = 240521.495 (including ...
pgbench -c 64 -j 32 -S -T 300 bench     tps = 235709.272642 (including ...
pgbench -c 128 -j 64 -S -T 300 bench    tps = 220135.729663 (including ...
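
To put the numbers above in perspective, the following small sketch
computes the throughput ratio of each method against stock git head
at -c 32 -j 16. The tps values are copied from the results above; the
function and constant names are made up for this example.

```c
#include <stdio.h>

/* tps at "pgbench -c 32 -j 16", copied from the results above. */
static const double TPS_STOCK     = 120943.238461;
static const double TPS_HINT      = 134313.758113;  /* method #1 */
static const double TPS_NONLOCKED = 222008.903675;  /* method #2 */
static const double TPS_BOTH      = 240521.495;     /* #1 + #2 */

/* Ratio of patched throughput to baseline throughput. */
static double speedup(double baseline_tps, double patched_tps)
{
    return patched_tps / baseline_tps;
}

/* Print the speedup of each method relative to stock git head. */
static void print_speedups(void)
{
    printf("#1 (hint):            %.2fx\n", speedup(TPS_STOCK, TPS_HINT));
    printf("#2 (non-locked test): %.2fx\n", speedup(TPS_STOCK, TPS_NONLOCKED));
    printf("#1 + #2:              %.2fx\n", speedup(TPS_STOCK, TPS_BOTH));
}
```

At 32 clients the combined patch comes out at a little under twice
the stock throughput, and most of that gain is from the non-locked
test.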

Since each core's usage is around 50% in the benchmark, there is room
for further performance improvement by eliminating other sources of
contention, tuning compiler options, etc.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
