[ 
https://issues.apache.org/jira/browse/CASSANDRA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962407#comment-13962407
 ] 

Jason Brown commented on CASSANDRA-6995:
----------------------------------------

No, I did not test this with CQL3 or the native client; I only tested with 
thrift and other clients which know how to route correctly. I would argue, 
though, why not give the optimization to clients who know what they are doing 
(or happen to get lucky via round-robin)? 

As to the concurrent_reads value in the yaml, that's an interesting 
argument/angle on the issue. Not sure how thrilled I am at introducing an 
explicit semaphore into the read path (you kind of have an implicit semaphore 
by the size of the read stage executor pool), but I can see the argument for 
respecting the concurrent_reads yaml value. I think I can add something 
relatively quickly and test it out.
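
To make the semaphore idea concrete, here is a rough sketch of what it could look like. This is not the attached patch and not Cassandra code: the class BoundedInlineReads, its execute method, and the plain ExecutorService standing in for the read stage are all hypothetical names for illustration, and it uses Java 8's CompletableFuture only for brevity. The assumption is that a permit count taken from concurrent_reads caps how many reads run on request threads, and anything over the cap falls back to the read stage as before.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class BoundedInlineReads {
    // Permits cap the number of reads running on request threads at once.
    private final Semaphore inlinePermits;
    // Stand-in for the read stage executor pool.
    private final ExecutorService readStage;

    BoundedInlineReads(int concurrentReads, ExecutorService readStage) {
        this.inlinePermits = new Semaphore(concurrentReads);
        this.readStage = readStage;
    }

    <T> Future<T> execute(Callable<T> read) {
        // Non-blocking attempt: never park the request thread waiting for a permit.
        if (inlinePermits.tryAcquire()) {
            try {
                // Run synchronously on the calling (request) thread.
                return CompletableFuture.completedFuture(read.call());
            } catch (Exception e) {
                CompletableFuture<T> failed = new CompletableFuture<>();
                failed.completeExceptionally(e);
                return failed;
            } finally {
                inlinePermits.release();
            }
        }
        // No permit available: dispatch to the read stage as before.
        return readStage.submit(read);
    }
}
```

Using tryAcquire rather than acquire means a saturated node degrades to the existing queued behavior instead of blocking request threads. Whether the permits should also account for reads already running in the read stage is left open here.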

bq. In the nearish future it may be possible to speculatively execute on the 
assumption the data is in memory....

Not sure what this means or how you would know something is in memory (without 
something like mincore), but choosing to read on the request thread shouldn't 
depend on that knowledge. That's not the kind of tradeoff we're trying to win 
with this ticket.


> Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to 
> read stage
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6995
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.0.7
>
>         Attachments: 6995-v1.diff, syncread-stress.txt
>
>
> When performing a read local to a coordinator node, AbstractReadExecutor will 
> create a new SP.LocalReadRunnable and drop it into the read stage for 
> asynchronous execution. If you are using a client that intelligently routes  
> read requests to a node holding the data for a given request, and are using 
> CL.ONE/LOCAL_ONE, enqueuing the SP.LocalReadRunnable and waiting for the 
> context switches (and possible NUMA misses) adds unnecessary latency. We can 
> reduce that latency and improve throughput by avoiding the queueing and 
> thread context switching by simply executing the SP.LocalReadRunnable 
> synchronously in the request thread. Testing on a three node cluster (each 
> with 32 cpus, 132 GB ram) yields ~10% improvement in throughput and ~20% 
> speedup on avg/95/99 percentiles (99.9% was about 5-10% improvement).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
