[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263096#comment-14263096
 ] 

Ariel Weisberg commented on CASSANDRA-8457:
-------------------------------------------

I can't get performance counters for cache behavior on EC2 as far as I can 
tell, and I don't have a good answer for why I get the performance numbers I am 
seeing.

I ran the measurements with CL.QUORUM, CL.ONE, and CL.ALL against both trunk 
and my branch, with and without rpc_max_threads increased to 1024.
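For reference, the rpc_max_threads bump is a one-line cassandra.yaml change; 
the runs above used:

```yaml
# cassandra.yaml -- value used in the runs above
rpc_max_threads: 1024
```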

This was prompted by measurements on a 15 node cluster where CL.ONE was 10x 
faster than CL.ALL. I measured the full matrix on a 9 node cluster and CL.ONE 
was 5x faster than CL.ALL, which with RF=5 (CL.ALL blocks on all five replica 
acknowledgements) is the expected performance delta. I definitely see 
underutilization: with CL.ONE the nodes run right at 1600% CPU, and with 
CL.ALL they don't make it up that high, although trunk does better in that 
respect.

The underutilization is worse with the modified code that uses SEPExecutor. I 
may have to run with 15 nodes again to see if the jump from 9 to 15 nodes is 
what causes CL.ALL to perform worse, or if the difference is that I was using a 
placement group and Ubuntu 14.04 in the 9 node cluster.

The change to use SEPExecutor for writes was anywhere from slightly to a lot 
slower in the QUORUM and ALL cases at 9 nodes. I think that is a dead end, but 
I do wonder if that is because SEPExecutor might not have the same 
cache-friendly behavior that running dedicated threads does. Dedicated threads 
require signaling and context switching, but thread scheduling policies could 
result in the thread servicing each socket always running in the same spot.
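To make the cache-affinity hypothesis concrete, here is a hypothetical sketch 
(not Cassandra code; ThreadAffinityDemo and its methods are made up for 
illustration) contrasting the two models: with dedicated threads every message 
for a connection is handled by the same thread, while a shared pool in the 
style of SEPExecutor lets a connection's work migrate across pool threads.

```java
import java.util.*;
import java.util.concurrent.*;

public class ThreadAffinityDemo {
    // Dedicated-thread model: one long-lived thread drains each connection,
    // so a connection's state is always touched by the same thread.
    static Map<Integer, Set<String>> runDedicated(int connections, int messages)
            throws InterruptedException {
        Map<Integer, Set<String>> handlers = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (int c = 0; c < connections; c++) {
            final int conn = c;
            Thread t = new Thread(() -> {
                for (int m = 0; m < messages; m++)
                    handlers.computeIfAbsent(conn, k -> ConcurrentHashMap.newKeySet())
                            .add(Thread.currentThread().getName());
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) t.join();
        return handlers;
    }

    // Shared-pool model (SEPExecutor-like): each message goes to whichever
    // pool thread is free, so a connection's state migrates between threads.
    static Map<Integer, Set<String>> runShared(int connections, int messages)
            throws Exception {
        Map<Integer, Set<String>> handlers = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(connections);
        List<Future<?>> pending = new ArrayList<>();
        for (int c = 0; c < connections; c++) {
            final int conn = c;
            for (int m = 0; m < messages; m++)
                pending.add(pool.submit(() ->
                    handlers.computeIfAbsent(conn, k -> ConcurrentHashMap.newKeySet())
                            .add(Thread.currentThread().getName())));
        }
        for (Future<?> f : pending) f.get();
        pool.shutdown();
        return handlers;
    }

    public static void main(String[] args) throws Exception {
        Map<Integer, Set<String>> dedicated = runDedicated(4, 1000);
        Map<Integer, Set<String>> shared = runShared(4, 1000);
        for (int c = 0; c < 4; c++)
            System.out.println("conn " + c + ": dedicated=" + dedicated.get(c).size()
                    + " thread(s), pool=" + shared.get(c).size() + " thread(s)");
    }
}
```

The dedicated model always reports one thread per connection; the pool model 
usually reports more than one, which is the behavior that could defeat cache 
warmth.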

I am going to try again with netty. I should at least be able to match the 
performance of trunk with a non-blocking approach, so I think it is still 
worth digging.
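The shape of the non-blocking approach is the same whether it ends up being 
netty or raw JDK NIO: one selector thread multiplexes many peer sockets 
instead of parking a thread per socket. A minimal hypothetical echo sketch 
using the JDK Selector API (not Cassandra or netty code; NioEchoSketch is an 
illustration only):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class NioEchoSketch {
    // One selector loop services accept + read events for all sockets;
    // here it exits after echoing a single message.
    static void serveOnce(ServerSocketChannel server) throws IOException {
        Selector selector = Selector.open();
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        boolean echoed = false;
        while (!echoed) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel ch = ((ServerSocketChannel) key.channel()).accept();
                    ch.configureBlocking(false);
                    ch.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(64));
                } else if (key.isReadable()) {
                    SocketChannel ch = (SocketChannel) key.channel();
                    ByteBuffer buf = (ByteBuffer) key.attachment();
                    if (ch.read(buf) > 0) {
                        buf.flip();
                        while (buf.hasRemaining())
                            ch.write(buf); // echo back what arrived
                        echoed = true;
                    }
                }
            }
        }
        selector.close();
    }

    // Bind an ephemeral loopback port, send "ping", return the echo.
    static String echoRoundTrip() throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", 0));
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
        Thread t = new Thread(() -> {
            try { serveOnce(server); }
            catch (IOException e) { throw new RuntimeException(e); }
        });
        t.start();
        ByteBuffer reply = ByteBuffer.allocate(4);
        try (SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
            client.write(ByteBuffer.wrap("ping".getBytes()));
            while (reply.hasRemaining())
                client.read(reply);
        }
        t.join();
        server.close();
        return new String(reply.array());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoRoundTrip()); // prints "ping"
    }
}
```

Netty wraps this same selector machinery in its event loop groups, so if 
matching trunk is possible with a loop like this, it should be possible there 
too.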

> nio MessagingService
> --------------------
>
>                 Key: CASSANDRA-8457
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
