Howard Chu wrote:
> Luke Kenneth Casson Leighton wrote:
>>> http://symas.com/mdb/inmem/scaling.html
>> can i make the suggestion that, whilst i am aware that it is generally
>> not recommended for production environments to run more processes than
>> there are cores, you try running 128, 256 and even 512 processes all
>> hitting that 64-core system, and monitor its I/O usage (iostats) and
>> loadavg whilst doing so?
> Sure, I can conduct that test and collect system stats using atop. Will let ya
> know. By the way, we're using threads here, not processes. But the overall
> loading behavior should be the same either way.
>> the hypothesis to test is that the performance, which should scale
>> reasonably linearly downwards as a ratio of the number of processes to
>> the number of cores, instead drops like a lead balloon.
Threads  Run Time                            CPU %   DB Size  Process Size  Context Switches  Write      Read
             Wall         User          Sys                                     Vol    Invol
1        10:01.39  00:19:39.18  00:00:21.00    199  12647888      12650460       21     1513  45605    275331
2        10:01.38  00:29:35.21  00:00:24.33    299  12647888      12650472       48     2661  42726    528514
4        10:01.37  00:49:32.93  00:00:25.42    498  12647888      12650496       84     4106  40961   1068050
8        10:01.36  01:29:32.68  00:00:23.25    897  12647888      12650756      157     7738  38812   2058741
16       10:01.36  02:49:22.44  00:00:28.51   1694  12647888      12650852      345    16941  33357   3857045
32       10:01.36  05:28:35.39  00:01:02.69   3288  12647888      12651308      923   258250  23922   6091558
64       10:01.38  10:35:44.42  00:01:51.69   6361  12648060      12652132     1766   145585  16571   8724687
128      10:01.38  10:36:43.09  00:01:45.52   6368  12649296      12654928     3276  2906109   8594   9846720
256      10:01.48  10:36:53.05  00:01:36.83   6369  12649304      12658056     5365  3557137   4178  10453540
512      10:02.11  10:36:09.58  00:03:00.83   6369  12649320      12664304     8303  3511456   1947  10728221
Looks to me like the system was reasonably well behaved.
This is reusing a DB that had already had multiple iterations of this
benchmark run on it, so the size is larger than for a fresh DB, and it
would have significant internal fragmentation - i.e., a lot of sequential
data will be in non-adjacent pages.
The only really obvious impact is that the number of involuntary context
switches jumps up at 128 threads, which is what you'd expect since there
are fewer cores than threads. The writer gets progressively starved, and
read rates increase slightly.
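
For anyone who wants to check those observations against the numbers, here is a rough Python sketch. It is only an illustration, not part of the benchmark harness: the tuples are copied by hand from the table above, the function name summarize and the dictionary keys are made up for this example, and it assumes the Write and Read columns are rates that can be compared directly between rows (the table does not state their units).

```python
# Columns copied by hand from the table above:
# (threads, involuntary context switches, write rate, read rate)
ROWS = [
    (1, 1513, 45605, 275331),
    (2, 2661, 42726, 528514),
    (4, 4106, 40961, 1068050),
    (8, 7738, 38812, 2058741),
    (16, 16941, 33357, 3857045),
    (32, 258250, 23922, 6091558),
    (64, 145585, 16571, 8724687),
    (128, 2906109, 8594, 9846720),
    (256, 3557137, 4178, 10453540),
    (512, 3511456, 1947, 10728221),
]

CORES = 64  # the thread describes a 64-core system


def summarize(rows, cores=CORES):
    """Express each row relative to two baselines (hypothetical helper).

    Reads are compared with the row where threads == cores; writes are
    compared with the 1-thread row.
    """
    reads_at_cores = {threads: read for threads, _, _, read in rows}[cores]
    writes_at_one = rows[0][2]
    summary = []
    for threads, invol, write, read in rows:
        summary.append({
            "threads": threads,
            "threads_per_core": threads / cores,
            "reads_vs_full_cores": read / reads_at_cores,
            "writes_vs_one_thread": write / writes_at_one,
            "invol_ctx_switches": invol,
        })
    return summary


if __name__ == "__main__":
    for row in summarize(ROWS):
        print("{threads:>4} threads  {threads_per_core:5.2f}/core  "
              "reads x{reads_vs_full_cores:.2f}  "
              "writes x{writes_vs_one_thread:.3f}  "
              "invol {invol_ctx_switches}".format(**row))
```

If total read throughput collapsed once threads outnumbered cores, reads_vs_full_cores would fall below 1.0 for the 128, 256 and 512 rows. With these numbers it stays above 1.0, while writes_vs_one_thread keeps shrinking.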
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/