[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-24 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025012#comment-15025012
 ] 

Joshua McKenzie commented on CASSANDRA-7217:


[~tjake] to review.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.0.1, 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java, 
> stub_server.diff
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-24 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025016#comment-15025016
 ] 

T Jake Luciani commented on CASSANDRA-7217:
---

LGTM +1



[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007258#comment-15007258
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

I was able to narrow this down to a configuration issue with the driver 
combined with less than perfect behavior if you don't run with this 
configuration. If I increase the maximum number of pending requests per 
connection from 128 to 256 then the performance at 1250 threads goes back to 
normal.

For stress, we can do something smarter when setting this tunable so that it 
reflects the number of available threads. Generally, if a thread is submitting 
requests, we want it to have a request pending against the server by default; 
otherwise all you are really benchmarking is the driver's ability to deal with 
pending requests.
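
One way stress could derive this tunable from the thread count is sketched 
below. This is only an illustration: the class and method names are 
hypothetical, the floor of 128 is the per-connection value mentioned above, and 
the connection count of 8 used in the example is an assumption, not a value 
taken from this ticket.

```java
// Hypothetical helper: sizes the per-connection pending-request limit so that
// every stress thread can have one request outstanding against the server.
final class PendingLimit {
    // Per-connection limit quoted in the comment above.
    static final int DEFAULT_MAX_PENDING = 128;

    static int maxPendingPerConnection(int threads, int connectionsPerHost) {
        if (threads <= 0 || connectionsPerHost <= 0) {
            throw new IllegalArgumentException("threads and connectionsPerHost must be positive");
        }
        // Ceiling division: spread the threads evenly across the connections.
        int perConnection = (threads + connectionsPerHost - 1) / connectionsPerHost;
        // Never go below the existing default.
        return Math.max(DEFAULT_MAX_PENDING, perConnection);
    }

    public static void main(String[] args) {
        // 8 connections per host is an assumed value for illustration only.
        System.out.println(maxPendingPerConnection(1250, 8));
        System.out.println(maxPendingPerConnection(500, 8));
    }
}
```

The computed value would then be handed to the driver's pooling options; that 
call is left out here because the exact method depends on the driver version.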

Then there is a separate driver issue: the degradation in performance when the 
number of pending requests is not high enough. I wouldn't expect that kind of 
drop-off. Whether the request is pending at the client or languishing in a TCP 
buffer in the server shouldn't really matter. I haven't looked, but my guess is 
that when the driver reaches the limit, the thread submitting a request goes to 
sleep and is then woken up again. This means that every request has to flow 
through some extra scheduling points to account for this.

A better way is to always flatten the serialized request to a shared buffer; 
when the connection is ready to accept more work, the network thread can wake 
up and write multiple requests to the server at once.
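
A minimal sketch of that idea, using only the JDK: submitting threads hand off 
already-serialized bytes without blocking, and a single network thread drains 
whatever has accumulated into one contiguous buffer when the connection is 
writable. The class and method names are hypothetical, and this is not how the 
driver is actually implemented.

```java
import java.io.ByteArrayOutputStream;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical coalescing writer: many producers, one draining network thread.
final class CoalescingWriter {
    private final Queue<byte[]> pending = new ConcurrentLinkedQueue<>();

    // Called by submitting threads. Never blocks or sleeps; it only enqueues.
    void submit(byte[] serializedRequest) {
        pending.add(serializedRequest);
    }

    // Called by the network thread when the connection can accept more work.
    // Concatenates up to maxRequests queued requests into a single buffer so
    // they can be written to the socket in one call.
    byte[] drain(int maxRequests) {
        ByteArrayOutputStream batch = new ByteArrayOutputStream();
        for (int taken = 0; taken < maxRequests; taken++) {
            byte[] next = pending.poll();
            if (next == null) {
                break; // queue is empty
            }
            batch.write(next, 0, next.length);
        }
        return batch.toByteArray();
    }

    public static void main(String[] args) {
        CoalescingWriter writer = new CoalescingWriter();
        writer.submit(new byte[] {1, 2});
        writer.submit(new byte[] {3});
        System.out.println(writer.drain(128).length + " bytes in one write");
    }
}
```

The point of the design is that the submitting thread is never parked and 
rescheduled per request; only the network thread wakes up, and it amortizes the 
wake-up over every request that is waiting.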



[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007275#comment-15007275
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

Created https://datastax-oss.atlassian.net/browse/JAVA-992 for the suspected 
Java client driver issue.



[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-12 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003017#comment-15003017
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

Performance counters

2000 threads
{code}
Results:
op rate   : 19419 [WRITE:19419]
partition rate: 19419 [WRITE:19419]
row rate  : 19419 [WRITE:19419]
latency mean  : 103.0 [WRITE:103.0]
latency median: 91.3 [WRITE:91.3]
latency 95th percentile   : 179.4 [WRITE:179.4]
latency 99th percentile   : 252.3 [WRITE:252.3]
latency 99.9th percentile : 428.5 [WRITE:428.5]
latency max   : 57651.8 [WRITE:57651.8]
Total partitions  : 1900 [WRITE:1900]
Total errors  : 0 [WRITE:0]
total gc count: 0
total gc mb   : 0
total gc time (s) : 0
avg gc time(ms)   : NaN
stdev gc time(ms) : 0
Total operation time  : 00:16:18
END

 Performance counter stats for './cassandra-stress write n=1900 -rate threads=2000 -mode native cql3 -node 192.168.1.9':

 3,320,451,421,007  cycles                    #    2.192 GHz                      [15.41%]
 2,563,758,232,484  instructions              #    0.77  insns per cycle
                                              #    0.94  stalled cycles per insn  [20.47%]
    69,188,067,241  cache-references          #   45.664 M/sec                    [25.56%]
    13,456,198,724  cache-misses              #   19.449 % of all cache refs      [30.60%]
   131,776,347,830  bus-cycles                #   86.973 M/sec                    [35.65%]
 2,415,412,133,089  idle-cycles-frontend      #   72.74% frontend cycles idle     [40.69%]
 1,750,197,198,741  idle-cycles-backend       #   52.71% backend  cycles idle     [45.75%]
    1514363.238593  cpu-clock (msec)
    1515146.390785  task-clock (msec)         #    1.530 CPUs utilized
           154,815  page-faults               #    0.102 K/sec
        87,357,050  cs                        #    0.058 M/sec
        37,030,093  migrations                #    0.024 M/sec
           154,691  minor-faults              #    0.102 K/sec
                 0  major-faults              #    0.000 K/sec
                 0  alignment-faults          #    0.000 K/sec
                 0  emulation-faults          #    0.000 K/sec
   358,579,878,595  branch-instructions       #  236.664 M/sec                    [45.74%]
     5,088,330,722  branch-misses             #    1.42% of all branches          [45.80%]
    70,350,080,393  L1-dcache-load-misses     #   46.431 M/sec                    [45.92%]
    24,626,765,787  L1-dcache-store-misses    #   16.254 M/sec                    [40.88%]
    19,812,757,638  L1-dcache-prefetch-misses #   13.076 M/sec                    [40.97%]
    59,285,911,291  L1-icache-load-misses     #   39.129 M/sec                    [40.92%]
     4,437,071,985  dTLB-load-misses          #    2.928 M/sec                    [40.90%]
       821,151,709  dTLB-store-misses         #    0.542 M/sec                    [40.80%]
     1,188,402,914  iTLB-load-misses          #    0.784 M/sec                    [40.66%]
     5,274,857,779  branch-load-misses        #    3.481 M/sec                    [40.58%]
    39,293,189,238  LLC-loads                 #   25.934 M/sec                    [40.47%]
    10,625,403,856  LLC-stores                #    7.013 M/sec                    [40.45%]
    16,978,686,645  LLC-prefetches            #   11.206 M/sec                    [10.08%]

     990.019887601 seconds time elapsed
{code}
500 threads
{code}
Results:
op rate   : 63678 [WRITE:63678]
partition rate: 63678 [WRITE:63678]
row rate  : 63678 [WRITE:63678]
latency mean  : 7.8 [WRITE:7.8]
latency median: 5.6 [WRITE:5.6]
latency 95th percentile   : 16.8 [WRITE:16.8]
latency 99th percentile   : 36.5 [WRITE:36.5]
latency 99.9th percentile : 77.5 [WRITE:77.5]
latency max   : 358.8 [WRITE:358.8]
Total partitions  : 1900 [WRITE:1900]
Total errors  : 0 [WRITE:0]
total gc count: 0
total gc mb   : 0
total gc time (s) : 0
avg gc time(ms)   : NaN
stdev gc time(ms) : 0
Total operation time  : 00:04:58
END

 Performance counter stats for './cassandra-stress write n=1900 -rate threads=500 -mode native cql3 -node 192.168.1.9':

 2,055,138,822,781  cycles                    #    2.519 GHz                      [15.25%]
 1,923,953,212,761  instructions              #    0.94  insns per cycle

[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-12 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003199#comment-15003199
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

My takeaway from the counters is that, with 2000 threads, working through 19 
million writes took more instructions, almost double the number of cache 
references, more than double the number of context switches, and double the 
number of dcache misses. So there was a big drop in efficiency that could 
explain how this occurs even without contention or starvation.
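
For reference, a few derived figures for the 2000-thread run can be computed 
directly from the numbers posted earlier. This is a rough sketch: the class and 
method names are hypothetical, the write count of 19 million is the rounded 
figure quoted above, and the latency is assumed to be reported in milliseconds.

```java
// Hypothetical helper that derives ratios from the posted 2000-thread counters.
final class StressCounters {
    // Generic ratio, used for instructions per cycle, miss rates, and per-op costs.
    static double ratio(long part, long whole) {
        return (double) part / whole;
    }

    // Little's law: mean requests in flight = throughput * mean latency.
    // Latency is assumed to be in milliseconds.
    static double inFlight(double opsPerSecond, double meanLatencyMs) {
        return opsPerSecond * meanLatencyMs / 1000.0;
    }

    public static void main(String[] args) {
        long cycles = 3_320_451_421_007L;
        long instructions = 2_563_758_232_484L;
        long cacheReferences = 69_188_067_241L;
        long cacheMisses = 13_456_198_724L;
        long contextSwitches = 87_357_050L;
        long writes = 19_000_000L; // rounded figure from the comment, not exact

        System.out.printf("instructions per cycle: %.2f%n", ratio(instructions, cycles));
        System.out.printf("cache miss ratio: %.3f%n", ratio(cacheMisses, cacheReferences));
        System.out.printf("context switches per write: %.1f%n", ratio(contextSwitches, writes));
        System.out.printf("requests in flight: %.0f%n", inFlight(19419, 103.0));
    }
}
```

Under those assumptions this works out to roughly 0.77 instructions per cycle 
and 19.4% cache misses, in line with the perf output, about 4.6 context 
switches per write, and about 2000 requests in flight, i.e. one outstanding 
request per stress thread.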

Whether there is a way to have 2000 threads do this work more efficiently is a 
good question. There are a lot more performance counters, as well as profiling, 
that might give insight into what having more threads changed. I'll look 
into it tomorrow.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.1
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-12 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002303#comment-15002303
 ] 

Ariel Weisberg commented on CASSANDRA-7217:
---

I was able to reproduce this running the server on my OS X laptop and the 
client on my quad-core i5 Sandy Bridge Linux desktop.

With 500 threads I was getting 80k op/sec and with 2000 I was getting 30k 
op/sec.

I took flight recordings, but they are too big to look at and not that 
interesting. There is more contention detected with a 1 millisecond threshold 
at 500 threads than at 2000 threads, presumably because with 500 threads so much 
more work is getting done.

CPU utilization at the client is pretty high at 500 threads, above 300%. 18k 
interrupts/second and 140k context switches/second.

With 2000 threads, utilization is lower, closer to 250%, with closer to 10k 
interrupts/second but 250-300k context switches/second.
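
Dividing the context-switch rate by the throughput makes the comparison 
concrete. The sketch below uses only the approximate figures quoted in this 
comment; the class and method names are hypothetical.

```java
// Hypothetical helper: context switches paid per completed operation.
final class ContextSwitchCost {
    static double switchesPerOp(double switchesPerSecond, double opsPerSecond) {
        return switchesPerSecond / opsPerSecond;
    }

    public static void main(String[] args) {
        // 500 threads: ~140k context switches/second at ~80k op/sec.
        System.out.printf("500 threads: %.2f per op%n", switchesPerOp(140_000, 80_000));
        // 2000 threads: 250-300k context switches/second at ~30k op/sec.
        System.out.printf("2000 threads: %.2f to %.2f per op%n",
                switchesPerOp(250_000, 30_000), switchesPerOp(300_000, 30_000));
    }
}
```

That is about 1.75 context switches per operation at 500 threads against 
roughly 8.3 to 10 at 2000 threads.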

My hypothesis is that having so many client threads is a problem for the Netty 
threads because there are more client threads than event threads by a large 
margin. With only one server there would really only be one event thread in 
use, since there is a single connection.

In cstar on bdplab I see a sharp drop between 1000 and 1250 threads. I would 
have expected a graceful slope as the overhead of context switching threads 
increases, so there is still more to be explained.



[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-11-06 Thread Shawn Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200604#comment-14200604
 ] 

Shawn Kumar commented on CASSANDRA-7217:


I'll be continuing testing on a more CPU-performant instance, but thought I 
would briefly try cstar_perf on bdplab. 
[Here|http://cstar.datastax.com/graph?stats=dd73c4a6-65d9-11e4-9413-bc764e04482c&metric=op_rate&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=279.07&ymin=0&ymax=120665.6]
 are the results. I increased the threads from 500 to 1500 in 250-thread 
increments from the first operation to the last, and there seems to be a 
noticeable drop.

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Shawn Kumar
>  Labels: performance, triaged
> Fix For: 2.1.2
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-05-15 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996595#comment-13996595
 ] 

Jason Brown commented on CASSANDRA-7217:


Do you think this is a problem on the stress side, or on the server side? Do 
you see a problem with thrift? Lastly, should I assume this arose due to 
testing your changes on #4718?

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>  Labels: performance
> Fix For: 2.1.0
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2014-05-14 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997409#comment-13997409
 ] 

Benedict commented on CASSANDRA-7217:
-

I haven't investigated much at all, so I don't know the answer to any of these 
questions yet (except that it did indeed come about off the back of 
CASSANDRA-4718). The only thing I can say for sure is that it is unrelated to 
MaxRPC (i.e. nothing to do with native transport threads blocking on adding to 
the work queue).

