On Thu, Oct 20, 2016 at 9:03 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

> In the results you've posted on 10/12, you've mentioned a regression with 32
> clients, where you got 52k tps on master but only 48k tps with the patch (so
> ~10% difference). I have no idea what scale was used for those tests,

That test was with scale factor 300 on a 4-socket POWER machine. I think
I need to repeat that test with multiple readings to confirm whether it
was a real regression or just run-to-run variation. I will do that soon
and post the results.
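
(For the repeat, I will just take several readings at the same client
count and compare them; something along these lines, where the run
length, database name and file names are only placeholders:

    # several shorter runs at 32 clients instead of one long reading
    for i in 1 2 3 4 5; do
        pgbench -M prepared -c 32 -j 32 -T 600 postgres > run_32_$i.log
        grep "excluding connections" run_32_$i.log
    done
)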

> and I
> see no such regression in the current results (but you only report results
> for some of the client counts).

This test is on the X86 8 socket machine. At scale factor 1000 I have
given readings for all client counts (32, 64, 96, 192), but at scale
factor 300 I posted only 192 clients, because on this machine (the X86
8 socket machine) I did not see much load on CLogControlLock at scale
factor 300.
>
> Also, which of the proposed patches have you been testing?
I tested with the GroupLock patch.

> Can you collect and share a more complete set of data, perhaps based on the
> scripts I use to do tests on the large machine with 36/72 cores, available
> at https://bitbucket.org/tvondra/hp05-results ?

I think from my last run I did not share the data for the X86 8 socket
machine at scale factor 300 with 32, 64 and 96 clients. I already have
those data, so I am sharing them here. (Please let me know if you want
to see some other client count; for that I would need to run another
test.)
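
(For reference, the wait event numbers below are just counts of
periodically sampled pg_stat_activity rows; roughly like the sketch
below, though the sampling interval and file names are only
illustrative. The stray "1 Tuples only is on." rows in the listings are
psql's tuples-only message that ended up in the sample files.

    # sample wait events while the pgbench run is going on
    # (stop the loop when the run finishes)
    while true; do
        psql -t -c "SELECT wait_event_type, wait_event FROM pg_stat_activity" postgres
        sleep 0.5
    done > samples_32.txt

    # after the run, aggregate the samples
    sort samples_32.txt | uniq -c | sort -rn > 300_32_ul.txt
)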

Head:
scaling factor: 300
query mode: prepared
number of clients: 32
number of threads: 32
duration: 1800 s
number of transactions actually processed: 77233356
latency average: 0.746 ms
tps = 42907.363243 (including connections establishing)
tps = 42907.546190 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_32_ul.txt
 111757                  |
   3666
   1289  LWLockNamed     | ProcArrayLock
   1142  Lock            | transactionid
    318  LWLockNamed     | CLogControlLock
    299  Lock            | extend
    109  LWLockNamed     | XidGenLock
     70  LWLockTranche   | buffer_content
     35  Lock            | tuple
     29  LWLockTranche   | lock_manager
     14  LWLockTranche   | wal_insert
      1 Tuples only is on.
      1  LWLockNamed     | CheckpointerCommLock

Group Lock Patch:

scaling factor: 300
query mode: prepared
number of clients: 32
number of threads: 32
duration: 1800 s
number of transactions actually processed: 77544028
latency average: 0.743 ms
tps = 43079.783906 (including connections establishing)
tps = 43079.960331 (excluding connections establishing)
 112209                  |
   3718
   1402  LWLockNamed     | ProcArrayLock
   1070  Lock            | transactionid
    245  LWLockNamed     | CLogControlLock
    188  Lock            | extend
     80  LWLockNamed     | XidGenLock
     76  LWLockTranche   | buffer_content
     39  LWLockTranche   | lock_manager
     31  Lock            | tuple
      7  LWLockTranche   | wal_insert
      1 Tuples only is on.
      1  LWLockTranche   | buffer_mapping

Head:
number of clients: 64
number of threads: 64
duration: 1800 s
number of transactions actually processed: 76211698
latency average: 1.512 ms
tps = 42339.731054 (including connections establishing)
tps = 42339.930464 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_64_ul.txt
 215734                  |
   5106  Lock            | transactionid
   3754  LWLockNamed     | ProcArrayLock
   3669
   3267  LWLockNamed     | CLogControlLock
    661  Lock            | extend
    339  LWLockNamed     | XidGenLock
    310  Lock            | tuple
    289  LWLockTranche   | buffer_content
    205  LWLockTranche   | lock_manager
     50  LWLockTranche   | wal_insert
      2  LWLockTranche   | buffer_mapping
      1 Tuples only is on.
      1  LWLockTranche   | proc

GroupLock patch:
scaling factor: 300
query mode: prepared
number of clients: 64
number of threads: 64
duration: 1800 s
number of transactions actually processed: 76629309
latency average: 1.503 ms
tps = 42571.704635 (including connections establishing)
tps = 42571.905157 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_64_ul.txt
 217840                  |
   5197  Lock            | transactionid
   3744  LWLockNamed     | ProcArrayLock
   3663
    966  Lock            | extend
    849  LWLockNamed     | CLogControlLock
    372  Lock            | tuple
    305  LWLockNamed     | XidGenLock
    199  LWLockTranche   | buffer_content
    184  LWLockTranche   | lock_manager
     35  LWLockTranche   | wal_insert
      1 Tuples only is on.
      1  LWLockTranche   | proc
      1  LWLockTranche   | buffer_mapping

Head:
scaling factor: 300
query mode: prepared
number of clients: 96
number of threads: 96
duration: 1800 s
number of transactions actually processed: 77663593
latency average: 2.225 ms
tps = 43145.624864 (including connections establishing)
tps = 43145.838167 (excluding connections establishing)

 302317                  |
  18836  Lock            | transactionid
  12912  LWLockNamed     | CLogControlLock
   4120  LWLockNamed     | ProcArrayLock
   3662
   1700  Lock            | tuple
   1305  Lock            | extend
   1030  LWLockTranche   | buffer_content
    828  LWLockTranche   | lock_manager
    730  LWLockNamed     | XidGenLock
    107  LWLockTranche   | wal_insert
      4  LWLockTranche   | buffer_mapping
      1 Tuples only is on.
      1  LWLockTranche   | proc
      1  BufferPin       | BufferPin

Group Lock Patch:
scaling factor: 300
query mode: prepared
number of clients: 96
number of threads: 96
duration: 1800 s
number of transactions actually processed: 61608756
latency average: 2.805 ms
tps = 44385.885080 (including connections establishing)
tps = 44386.297364 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_96_ul.txt
 237842                  |
  14379  Lock            | transactionid
   3335  LWLockNamed     | ProcArrayLock
   2850
   1374  LWLockNamed     | CLogControlLock
   1200  Lock            | tuple
    992  Lock            | extend
    717  LWLockNamed     | XidGenLock
    625  LWLockTranche   | lock_manager
    259  LWLockTranche   | buffer_content
    105  LWLockTranche   | wal_insert
      4  LWLockTranche   | buffer_mapping
      2  LWLockTranche   | proc

Head:
scaling factor: 300
query mode: prepared
number of clients: 192
number of threads: 192
duration: 1800 s
number of transactions actually processed: 65930726
latency average: 5.242 ms
tps = 36621.827041 (including connections establishing)
tps = 36622.064081 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_192_ul.txt
 437848                  |
 118966  Lock            | transactionid
  88869  LWLockNamed     | CLogControlLock
  18558  Lock            | tuple
   6183  LWLockTranche   | buffer_content
   5664  LWLockTranche   | lock_manager
   3995  LWLockNamed     | ProcArrayLock
   3646
   1748  Lock            | extend
   1635  LWLockNamed     | XidGenLock
    401  LWLockTranche   | wal_insert
     33  BufferPin       | BufferPin
      5  LWLockTranche   | proc
      3  LWLockTranche   | buffer_mapping

GroupLock Patch:
scaling factor: 300
query mode: prepared
number of clients: 192
number of threads: 192
duration: 1800 s
number of transactions actually processed: 82616270
latency average: 4.183 ms
tps = 45894.737813 (including connections establishing)
tps = 45894.995634 (excluding connections establishing)
120372  Lock            | transactionid
  16346  Lock            | tuple
   7489  LWLockTranche   | lock_manager
   4514  LWLockNamed     | ProcArrayLock
   3632
   3310  LWLockNamed     | CLogControlLock
   2287  LWLockNamed     | XidGenLock
   2271  Lock            | extend
    709  LWLockTranche   | buffer_content
    490  LWLockTranche   | wal_insert
     30  BufferPin       | BufferPin
     10  LWLockTranche   | proc
      6  LWLockTranche   | buffer_mapping

Summary: On the X86 8 socket machine at scale factor 300, I did not
observe significant waits on CLogControlLock up to 96 clients. However,
at 192 clients we can see significant waits on CLogControlLock, though
still not as bad as what we see on POWER.
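
(One way to put a rough number on "significant" is the share of sampled
wait events that fall on CLogControlLock, computed against the
aggregated files shown above; note the blank "no wait event" bucket is
included in the total:

    awk '{tot += $1} /CLogControlLock/ {clog += $1}
         END {printf "CLogControlLock: %d of %d samples (%.1f%%)\n",
              clog, tot, 100*clog/tot}' 300_192_ul.txt
)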


>
> I've taken some time to build a simple web-based reports from the results
> collected so far (also included in the git repository), and pushed them
> here:
>
>     http://tvondra.bitbucket.org
>
> For each of the completed runs, there's a report comparing tps for different
> client counts with master and the three patches (average tps, median and
> stddev), and it's possible to download a more thorough text report with wait
> event stats, comparison of individual runs etc.

I saw your report; I think presenting it this way gives a very clear idea.
>
> If you want to cooperate on this, I'm available - i.e. I can help you get
> the tooling running, customize it etc.

That would be really helpful; then next time I can also present my
reports in the same format.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

