On 10/05/2016 10:03 AM, Amit Kapila wrote:
On Wed, Oct 5, 2016 at 12:05 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
Hi,

After collecting a lot more results from multiple kernel versions, I can
confirm that I see a significant improvement with 128 and 192 clients,
roughly by 30%:

                           64        128        192
    ------------------------------------------------
     master             62482      43181      50985
     granular-locking   61701      59611      47483
     no-content-lock    62650      59819      47895
     group-update       63702      64758      62596

But I only see this with Dilip's workload, and only with pre-4.3.0 kernels
(the results above are from kernel 3.19).


That appears positive.


I got access to a large machine with 72/144 cores (thanks to Oleg and Alexander from Postgres Professional), and I'm running the tests on that machine too.

Results from Dilip's workload (with scale 300, unlogged tables) look like this:

                        32      64    128     192    224     256    288
  master            104943  128579  72167  100967  66631   97088  63767
  granular-locking  103415  141689  83780  120480  71847  115201  67240
  group-update      105343  144322  92229  130149  81247  126629  76638
  no-content-lock   103153  140568  80101  119185  70004  115386  66199

So there's some 20-30% improvement for >= 128 clients with group-update (the other two patches gain roughly 4-19% there).
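
For anyone who wants to re-check the percentages, here is a minimal sketch that recomputes the per-patch change relative to master. The tps numbers are copied from the table above; the dictionary layout and the helper name `improvement_pct` are made up for illustration and are not part of any benchmark tooling.

```python
# tps figures copied from the table above
# (Dilip's workload, scale 300, unlogged tables, 72/144-core machine).
clients = [32, 64, 128, 192, 224, 256, 288]
tps = {
    "master":           [104943, 128579, 72167, 100967, 66631,  97088, 63767],
    "granular-locking": [103415, 141689, 83780, 120480, 71847, 115201, 67240],
    "group-update":     [105343, 144322, 92229, 130149, 81247, 126629, 76638],
    "no-content-lock":  [103153, 140568, 80101, 119185, 70004, 115386, 66199],
}


def improvement_pct(patch):
    """Percent change in tps versus master, keyed by client count.

    Hypothetical helper, for illustration only.
    """
    return {
        c: round(100.0 * (p - m) / m, 1)
        for c, p, m in zip(clients, tps[patch], tps["master"])
    }


for patch in tps:
    if patch != "master":
        print(patch, improvement_pct(patch))
```

Only the numbers already reported are used, so this says nothing about run-to-run variance.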

But what I find much more intriguing is the zig-zag behavior. I mean, 64 clients give ~130k tps, 128 clients only give ~70k but 192 clients jump up to >100k tps again, etc.

FWIW I don't see any such behavior on pgbench, and all those tests were done on the same cluster.

With kernel 4.5.5, results for the same benchmark look like this:

                           64        128        192
    ------------------------------------------------
     master             35693      39822      42151
     granular-locking   35370      39409      41353
     no-content-lock    36201      39848      42407
     group-update       35697      39893      42667

That seems like a fairly bad regression in the kernel, although I have not
identified the feature/commit causing it (and it's also possible the issue
lies somewhere else, of course).

With regular pgbench, I see no improvement on any kernel version. For
example on 3.19 the results look like this:

                           64        128        192
    ------------------------------------------------
     master             54661      61014      59484
     granular-locking   55904      62481      60711
     no-content-lock    56182      62442      61234
     group-update       55019      61587      60485


Are the above results with synchronous_commit=off?


No, but I can do that.

I haven't done much more testing (e.g. with -N to eliminate
collisions on branches) yet, let's see if it changes anything.


Yeah, let us see how it behaves with -N. Also, I think we could try
a higher scale factor?


Yes, I plan to do that. In total, I plan to test combinations of:

(a) Dilip's workload and pgbench (regular and -N)
(b) logged and unlogged tables
(c) scale 300 and scale 3000 (both fit into RAM)
(d) synchronous_commit=on/off
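
To give an idea of how many runs that is, the combinations above can be enumerated with a short script. This is just a sketch: the labels (e.g. "dilip" for Dilip's workload) and variable names are placeholders chosen for illustration, not names of actual scripts or settings files.

```python
from itertools import product

# The four dimensions (a)-(d) listed above; labels are illustrative only.
workloads = ["dilip", "pgbench", "pgbench -N"]   # (a)
table_kinds = ["logged", "unlogged"]             # (b)
scales = [300, 3000]                             # (c)
synchronous_commit = ["on", "off"]               # (d)

# Full cross product of all dimensions.
matrix = list(product(workloads, table_kinds, scales, synchronous_commit))

print(len(matrix), "combinations")
for workload, tables, scale, sync in matrix:
    print(f"{workload:<11} {tables:<9} scale={scale:<5} synchronous_commit={sync}")
```

Each combination would still have to be run for every patch and every client count, so the total number of benchmark runs is a multiple of this.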

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
