Howard Chu wrote:
Luke Kenneth Casson Leighton wrote:
http://symas.com/mdb/inmem/scaling.html

can i make the suggestion that, whilst i am aware that it is generally
not recommended for production environments to run more processes than
there are cores, you try running 128, 256 and even 512 processes all
hitting that 64-core system, and monitor its I/O usage (iostat) and
loadavg whilst doing so?

Sure, I can conduct that test and collect system stats using atop. Will let ya
know. By the way, we're using threads here, not processes. But the overall
loading behavior should be the same either way.

the hypothesis to test is that the performance, which should scale
reasonably linearly downwards as a ratio of the number of processes to
the number of cores, instead drops like a lead balloon.

Threads  Run Time: Wall / User / Sys           CPU %  DB Size   Proc Size  Vol CS  Invol CS  Write  Read
1        10:01.39  00:19:39.18  00:00:21.00      199  12647888  12650460       21      1513  45605    275331
2        10:01.38  00:29:35.21  00:00:24.33      299  12647888  12650472       48      2661  42726    528514
4        10:01.37  00:49:32.93  00:00:25.42      498  12647888  12650496       84      4106  40961   1068050
8        10:01.36  01:29:32.68  00:00:23.25      897  12647888  12650756      157      7738  38812   2058741
16       10:01.36  02:49:22.44  00:00:28.51     1694  12647888  12650852      345     16941  33357   3857045
32       10:01.36  05:28:35.39  00:01:02.69     3288  12647888  12651308      923    258250  23922   6091558
64       10:01.38  10:35:44.42  00:01:51.69     6361  12648060  12652132     1766    145585  16571   8724687
128      10:01.38  10:36:43.09  00:01:45.52     6368  12649296  12654928     3276   2906109   8594   9846720
256      10:01.48  10:36:53.05  00:01:36.83     6369  12649304  12658056     5365   3557137   4178  10453540
512      10:02.11  10:36:09.58  00:03:00.83     6369  12649320  12664304     8303   3511456   1947  10728221

Looks to me like the system was reasonably well behaved.

This is reusing a DB that had already had multiple iterations of this
benchmark run on it, so the size is larger than for a fresh DB, and it
would have significant internal fragmentation - i.e., a lot of sequential
data will be in non-adjacent pages.

The only really obvious impact is that the number of involuntary context
switches jumps up at 128 threads, which is what you'd expect since there
are fewer cores than threads. The writer gets progressively starved, and
read rates increase slightly.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
