[ 
https://issues.apache.org/jira/browse/CASSANDRA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968892#comment-13968892
 ] 

Benedict commented on CASSANDRA-6995:
-------------------------------------

bq. On a separate note, shouldn't we throttle on the number of reads from 
disk instead of concurrent_writers and reads? It's unfair on the threads 
when one request pulls a lot more data than others.

Agreed, [[email protected]]. I would quite like to see the write and read 
stages dropped entirely, and a single stage introduced purely for processing 
requests (both reads and writes) coming from other nodes. Direct clients can 
just execute reads/writes directly in their own thread. We can then have a 
synchronisation primitive on each disk that a reader must acquire before 
rebuffering data. Ideally this would dovetail with the work in CASSANDRA-5863 
to provide an in-process page cache that could be used not just for compressed 
files. This way all direct clients could perform in-memory queries without any 
synchronisation / blocking / context switching.
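
As a rough illustration of the per-disk primitive, here is a minimal sketch, 
assuming a plain fair java.util.concurrent.Semaphore is good enough. The names 
DiskRebufferGate and rebuffer are hypothetical and do not exist in Cassandra; 
the point is only that a reader takes a permit when it actually has to go to 
disk, while reads served from memory never touch the gate.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch: one gate per physical disk, bounding concurrent rebuffers.
// This is not Cassandra code; the class and method names are made up.
final class DiskRebufferGate {
    private final Semaphore permits;

    DiskRebufferGate(int maxConcurrentRebuffers) {
        // Fair mode, so one request pulling a lot of data cannot starve the others.
        this.permits = new Semaphore(maxConcurrentRebuffers, true);
    }

    // Runs the disk read only while holding a permit.
    // Reads that hit an in-memory cache would skip this method entirely.
    <T> T rebuffer(Supplier<T> diskRead) {
        permits.acquireUninterruptibly();
        try {
            return diskRead.get();
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}

final class RebufferGateDemo {
    public static void main(String[] args) {
        DiskRebufferGate gate = new DiskRebufferGate(2);
        // Stand-in for a real disk read: just allocate a 4 KiB buffer.
        byte[] page = gate.rebuffer(() -> new byte[4096]);
        System.out.println("read " + page.length + " bytes, permits free: "
                           + gate.availablePermits());
    }
}
```

Throttling here is per rebuffer rather than per request, so a request that 
reads many pages queues for the disk many times instead of holding a stage 
thread for its whole duration.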

> Execute local ONE/LOCAL_ONE reads on request thread instead of dispatching to 
> read stage
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6995
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.0.7
>
>         Attachments: 6995-v1.diff, syncread-stress.txt
>
>
> When performing a read local to a coordinator node, AbstractReadExecutor will 
> create a new SP.LocalReadRunnable and drop it into the read stage for 
> asynchronous execution. If you are using a client that intelligently routes 
> read requests to a node holding the data for a given request, and are using 
> CL.ONE/LOCAL_ONE, enqueuing the SP.LocalReadRunnable and waiting for the 
> context switches (and possible NUMA misses) adds unnecessary latency. We can 
> reduce that latency and improve throughput by avoiding the queueing and 
> thread context switching and simply executing the SP.LocalReadRunnable 
> synchronously in the request thread. Testing on a three node cluster (each 
> with 32 CPUs, 132 GB RAM) yields ~10% improvement in throughput and ~20% 
> speedup on avg/95/99 percentiles (the 99.9th percentile improved by about 
> 5-10%).
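
The idea in the description above can be sketched roughly as follows. This is 
not the attached patch and does not use Cassandra's classes: 
LocalReadDispatcher, execute and the singleLocalReplica flag are hypothetical 
stand-ins, and a plain ExecutorService plays the role of the read stage. It 
only shows the choice between running the read on the calling thread and 
handing it to a stage.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical stand-in for the coordinator's local read path.
final class LocalReadDispatcher {
    private final ExecutorService readStage;

    LocalReadDispatcher(ExecutorService readStage) {
        this.readStage = readStage;
    }

    // singleLocalReplica is assumed to mean: CL.ONE/LOCAL_ONE and this node owns the data.
    <T> CompletableFuture<T> execute(Supplier<T> localRead, boolean singleLocalReplica) {
        if (singleLocalReplica) {
            // Run on the request thread: no queueing, no context switch.
            CompletableFuture<T> result = new CompletableFuture<>();
            try {
                result.complete(localRead.get());
            } catch (Throwable t) {
                result.completeExceptionally(t);
            }
            return result;
        }
        // Otherwise dispatch to the read stage as before.
        return CompletableFuture.supplyAsync(localRead, readStage);
    }
}

final class LocalReadDemo {
    public static void main(String[] args) {
        ExecutorService stage = Executors.newFixedThreadPool(2);
        try {
            LocalReadDispatcher dispatcher = new LocalReadDispatcher(stage);
            Supplier<String> read = () -> Thread.currentThread().getName();
            System.out.println("sync read ran on:   " + dispatcher.execute(read, true).join());
            System.out.println("staged read ran on: " + dispatcher.execute(read, false).join());
        } finally {
            stage.shutdown();
        }
    }
}
```

The synchronous branch gives up the bounded concurrency the stage provided, 
which is the trade-off the comment above is concerned with.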



--
This message was sent by Atlassian JIRA
(v6.2#6252)
