[
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ariel Weisberg updated CASSANDRA-7217:
--------------------------------------
Attachment: 2000-threads.svg
500-threads.svg
Attempt at flame graphs. {{perf record}} was kind of unhappy with 2k thread and
demolished a core and throughput went down a nice chunk.
Also have more counters. Interestingly there are 3x as many system calls with
2k threads, but I wasn't counting individual system calls so I don't know what
that time was spent doing.
{code}
Results:
op rate : 51730 [WRITE:51730]
partition rate : 51730 [WRITE:51730]
row rate : 51730 [WRITE:51730]
latency mean : 9.6 [WRITE:9.6]
latency median : 7.3 [WRITE:7.3]
latency 95th percentile : 19.7 [WRITE:19.7]
latency 99th percentile : 39.2 [WRITE:39.2]
latency 99.9th percentile : 79.0 [WRITE:79.0]
latency max : 1119.0 [WRITE:1119.0]
Total partitions : 19000000 [WRITE:19000000]
Total errors : 0 [WRITE:0]
total gc count : 0
total gc mb : 0
total gc time (s) : 0
avg gc time(ms) : NaN
stdev gc time(ms) : 0
Total operation time : 00:06:07
END
Performance counter stats for './cassandra-stress write n=19000000 -rate
threads=500 -mode native cql3 -node 192.168.1.9':
2,515,431,304,667 cycles # 2.441 GHz
[16.05%]
2,187,585,695,677 instructions # 0.87 insns per cycle
# 0.79 stalled cycles per
insn [21.35%]
41,594,524,794 cache-references # 40.359 M/sec
[26.62%]
10,229,632,567 cache-misses # 24.594 % of all cache
refs [31.87%]
1,731,248,021,308 idle-cycles-frontend # 68.83% frontend cycles
idle [37.10%]
1,171,017,181,584 idle-cycles-backend # 46.55% backend cycles
idle [42.33%]
1030610.942568 cpu-clock (msec)
1030611.990849 task-clock (msec) # 2.721 CPUs utilized
91,892 page-faults # 0.089 K/sec
32,039,387 cs # 0.031 M/sec
1,584,549 migrations # 0.002 M/sec
91,537 minor-faults # 0.089 K/sec
271,307,683,357 branch-instructions # 263.249 M/sec
[47.56%]
2,777,454,782 branch-misses # 1.02% of all branches
[47.48%]
52,738,483,134 L1-dcache-load-misses # 51.172 M/sec
[47.47%]
18,283,685,960 L1-dcache-store-misses # 17.741 M/sec
[47.47%]
15,205,838,312 L1-dcache-prefetch-misses # 14.754 M/sec
[42.21%]
39,417,616,920 L1-icache-load-misses # 38.247 M/sec
[42.28%]
2,630,018,507 dTLB-load-misses # 2.552 M/sec
[42.33%]
480,675,134 dTLB-store-misses # 0.466 M/sec
[42.38%]
567,242,554 iTLB-load-misses # 0.550 M/sec
[42.43%]
2,779,803,804 branch-load-misses # 2.697 M/sec
[42.48%]
23,639,238,796 LLC-loads # 22.937 M/sec
[42.51%]
4,117,222,926 LLC-stores # 3.995 M/sec
[42.51%]
11,311,004,882 LLC-prefetches # 10.975 M/sec
[10.60%]
46,798,741 sched:sched_wakeup # 0.045 M/sec
1,284 sched:sched_wakeup_new # 0.001 K/sec
32,039,387 sched:sched_switch # 0.031 M/sec
1,585,242 sched:sched_migrate_task # 0.002 M/sec
643 sched:sched_process_exit # 0.001 K/sec
0 sched:sched_wait_task # 0.000 K/sec
4 sched:sched_process_wait # 0.000 K/sec
642 sched:sched_process_fork # 0.001 K/sec
5 sched:sched_process_exec # 0.000 K/sec
9,609,960,131,198 sched:sched_stat_wait # 9324.518 M/sec
388,048,782,246,881 sched:sched_stat_sleep # 376522.674 M/sec
40,039,320 sched:sched_stat_iowait # 0.039 M/sec
8,678,041,742 sched:sched_stat_blocked # 8.420 M/sec
1,032,643,169,845 sched:sched_stat_runtime # 1001.971 M/sec
272,903 sched:sched_wake_idle_without_ipi # 0.265 K/sec
2,269 workqueue:workqueue_queue_work # 0.002 K/sec
2,269 workqueue:workqueue_activate_work # 0.002 K/sec
2,185,692 irq:irq_handler_entry # 0.002 M/sec
2,185,692 irq:irq_handler_exit # 0.002 M/sec
2,809,612 irq:softirq_entry # 0.003 M/sec
2,809,612 irq:softirq_exit # 0.003 M/sec
2,834,592 irq:softirq_raise # 0.003 M/sec
642 task:task_newtask # 0.001 K/sec
5 task:task_rename # 0.000 K/sec
84,774 exceptions:page_fault_user # 0.082 K/sec
7,118 exceptions:page_fault_kernel # 0.007 K/sec
84,530,358 raw_syscalls:sys_enter # 0.082 M/sec
84,530,372 raw_syscalls:sys_exit # 0.082 M/sec
378.751202925 seconds time elapsed
Results:
op rate : 12919 [WRITE:12919]
partition rate : 12919 [WRITE:12919]
row rate : 12919 [WRITE:12919]
latency mean : 154.9 [WRITE:154.9]
latency median : 83.2 [WRITE:83.2]
latency 95th percentile : 366.2 [WRITE:366.2]
latency 99th percentile : 579.3 [WRITE:579.3]
latency 99.9th percentile : 872.4 [WRITE:872.4]
latency max : 2396.9 [WRITE:2396.9]
Total partitions : 19000000 [WRITE:19000000]
Total errors : 0 [WRITE:0]
total gc count : 0
total gc mb : 0
total gc time (s) : 0
avg gc time(ms) : NaN
stdev gc time(ms) : 0
Total operation time : 00:24:30
END
Performance counter stats for './cassandra-stress write n=19000000 -rate
threads=2000 -mode native cql3 -node 192.168.1.9':
5,181,129,425,325 cycles # 2.169 GHz
[16.10%]
3,612,561,477,198 instructions # 0.70 insns per cycle
# 1.06 stalled cycles per
insn [21.43%]
125,303,440,058 cache-references # 52.445 M/sec
[26.76%]
31,162,640,970 cache-misses # 24.870 % of all cache
refs [32.03%]
3,846,338,739,881 idle-cycles-frontend # 74.24% frontend cycles
idle [37.28%]
2,862,782,268,966 idle-cycles-backend # 55.25% backend cycles
idle [42.49%]
2387975.425875 cpu-clock (msec)
2389248.103454 task-clock (msec) # 1.610 CPUs utilized
159,955 page-faults # 0.067 K/sec
79,702,888 cs # 0.033 M/sec
44,298,673 migrations # 0.019 M/sec
159,869 minor-faults # 0.067 K/sec
641,996,673,767 branch-instructions # 268.702 M/sec
[47.76%]
6,698,884,506 branch-misses # 1.04% of all branches
[47.72%]
140,604,450,350 L1-dcache-load-misses # 58.849 M/sec
[47.69%]
37,276,664,029 L1-dcache-store-misses # 15.602 M/sec
[47.72%]
30,237,145,992 L1-dcache-prefetch-misses # 12.656 M/sec
[42.39%]
69,945,237,261 L1-icache-load-misses # 29.275 M/sec
[42.47%]
5,669,464,361 dTLB-load-misses # 2.373 M/sec
[42.51%]
1,086,499,138 dTLB-store-misses # 0.455 M/sec
[42.55%]
1,042,804,417 iTLB-load-misses # 0.436 M/sec
[42.52%]
6,685,073,730 branch-load-misses # 2.798 M/sec
[42.53%]
87,031,289,561 LLC-loads # 36.426 M/sec
[42.55%]
11,988,446,940 LLC-stores # 5.018 M/sec
[42.46%]
29,826,259,945 LLC-prefetches # 12.484 M/sec
[10.58%]
154,542,169 sched:sched_wakeup # 0.065 M/sec
4,284 sched:sched_wakeup_new # 0.002 K/sec
79,702,888 sched:sched_switch # 0.033 M/sec
44,301,003 sched:sched_migrate_task # 0.019 M/sec
2,143 sched:sched_process_exit # 0.001 K/sec
0 sched:sched_wait_task # 0.000 K/sec
4 sched:sched_process_wait # 0.000 K/sec
2,142 sched:sched_process_fork # 0.001 K/sec
5 sched:sched_process_exec # 0.000 K/sec
448,770,334,535 sched:sched_stat_wait # 187.829 M/sec
5,998,250,681,675,102 sched:sched_stat_sleep # 2510518.130 M/sec
154,752,649 sched:sched_stat_iowait # 0.065 M/sec
49,347,196,951 sched:sched_stat_blocked # 20.654 M/sec
3,155,194,145,918 sched:sched_stat_runtime # 1320.580 M/sec
71,084,628 sched:sched_wake_idle_without_ipi # 0.030 M/sec
6,089 workqueue:workqueue_queue_work # 0.003 K/sec
6,089 workqueue:workqueue_activate_work # 0.003 K/sec
981,847 irq:irq_handler_entry # 0.411 K/sec
981,847 irq:irq_handler_exit # 0.411 K/sec
2,690,966 irq:softirq_entry # 0.001 M/sec
2,690,966 irq:softirq_exit # 0.001 M/sec
2,701,243 irq:softirq_raise # 0.001 M/sec
2,142 task:task_newtask # 0.001 K/sec
5 task:task_rename # 0.000 K/sec
152,187 exceptions:page_fault_user # 0.064 K/sec
7,768 exceptions:page_fault_kernel # 0.003 K/sec
232,031,226 raw_syscalls:sys_enter # 0.097 M/sec
232,031,239 raw_syscalls:sys_exit # 0.097 M/sec
1483.891153703 seconds time elapsed
{code}
> Native transport performance (with cassandra-stress) drops precipitously past
> around 1000 threads
> -------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
> Issue Type: Bug
> Reporter: Benedict
> Assignee: Ariel Weisberg
> Labels: performance, stress, triaged
> Fix For: 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to
> it.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)