[ 
https://issues.apache.org/jira/browse/CASSANDRA-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745238#comment-16745238
 ] 

Marcus Olsson commented on CASSANDRA-14983:
-------------------------------------------

I think I have found out why background traffic is required to reproduce this. 
In SEPExecutor#maybeExecuteImmediately() we try to take a work permit (and no 
task permit) but we 
[check|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L166]
 whether task permits are available before taking the work permit. As the 
fast path does not add the task to the queue (via #addTask(), which is also 
where the task permits are increased), there might not be any task permits 
available. I changed the if statement to:
{code}
if (workPermits == 0 || (takeTaskPermit && taskPermits == 0))
 return false;
{code}
This made the fast path trigger more often. Actually when I added some metrics 
for this it seemed like the fast path was basically never used for the read 
path in my setup. With the if statement change I saw it used ~80% of the time 
with a thread count of 128 in stress. Unfortunately I did not see any 
performance difference when running tests on this with QUORUM.
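
To make the difference between the two conditions concrete, here is a minimal sketch of the check in isolation. It is a simplified model and not the actual SEPExecutor code: the class name PermitCheckSketch and both method names are hypothetical, and the "original" condition is written from the description above (a work permit is refused whenever no task permits are available).

```java
// Simplified model of the permit check discussed above; NOT the actual SEPExecutor code.
// The class and method names are hypothetical and exist only for illustration.
class PermitCheckSketch
{
    // Check as described for the existing code: a work permit is refused whenever
    // no task permits are available, regardless of whether the caller wants one
    // (takeTaskPermit is deliberately ignored here).
    static boolean canTakeWorkPermitOriginal(int workPermits, int taskPermits, boolean takeTaskPermit)
    {
        if (workPermits == 0 || taskPermits == 0)
            return false;
        return true;
    }

    // Changed check: task permits only matter if the caller wants to take one.
    static boolean canTakeWorkPermitProposed(int workPermits, int taskPermits, boolean takeTaskPermit)
    {
        if (workPermits == 0 || (takeTaskPermit && taskPermits == 0))
            return false;
        return true;
    }

    public static void main(String[] args)
    {
        // Fast path: the task was never queued, so taskPermits may be 0, and the
        // caller asks for a work permit only (takeTaskPermit = false).
        int workPermits = 4;
        int taskPermits = 0;
        System.out.println("original: " + canTakeWorkPermitOriginal(workPermits, taskPermits, false));
        System.out.println("proposed: " + canTakeWorkPermitProposed(workPermits, taskPermits, false));
    }
}
```

With free work permits and no queued tasks, the first variant refuses the fast path and the second allows it, which would match the fast path being rarely taken unless background traffic keeps the task permit count above zero.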

But it did seem to have a large effect in a scenario with a single node and rf 
= 1 for 3.0 (graph_local_read.html). While running stress with 32 threads 
(pre-SEP-1, post-SEP-1) I could see a *~7%* throughput improvement locally.
When running stress with 128 threads (pre-SEP-128-1, post-SEP-128-1) the 
performance dropped slightly. The median latency seems lower but the higher 
percentiles are taking a hit.
For the final two tests (pre-SEP-128-cr128-1, post-SEP-128-cr128-1) I decided 
to increase the concurrent_read threads from 32 to 128. The latency results are 
similar to the previous run but the throughput seems to have increased.
Note: I ran _echo 3 > /proc/sys/vm/drop_caches_ before these tests to clear the 
page cache, etc. which is why there is a large buildup in the beginning.

I also ran a similar quick test for trunk (graph_local_read_trunk.html) where 
there seems to be a throughput improvement when using a low thread count. But the 
overall performance seems to have decreased with a default CCM setup (unless my 
environment was behaving oddly).

I think this could warrant its own JIRA ticket to investigate more.

> Local reads potentially blocking remote reads
> ---------------------------------------------
>
>                 Key: CASSANDRA-14983
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14983
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Marcus Olsson
>            Priority: Minor
>         Attachments: local_read_trace.log
>
>
> Since CASSANDRA-4718 there is a fast path allowing local requests to continue 
> to [work in the same 
> thread|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L157]
>  rather than being sent over to the read stage.
> Based on the comment
> {code:java}
> // We delay the local (potentially blocking) read till the end to avoid stalling remote requests.
> {code}
> it seems like this should be performed last in the chain to avoid blocking 
> remote requests but that does not seem to be the case when the local request 
> is a data request. The digest request(s) are sent after the data requests are 
> sent (and now the transient replica requests as well). When the fast path is 
> used for local data/transient data requests this will block the next type of 
> request from being sent away until the local read is finished and add 
> additional latency to the request.
> In addition to this it seems like local requests are *always* data requests 
> (might not be a problem), but the log message can say either ["digest" or 
> "data"|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L156]
>  as the type of request.
> I have tried to run performance measurements to see the impact of this in 3.0 
> (by moving local requests to the end of ARE#executeAsync()) but I haven't 
> seen any big difference yet. I'll continue to run some more tests to see if I 
> can find a use case affected by this.
> Attaching a trace (3.0) where this happens. Reproduction:
>  # Create a three node CCM cluster
>  # Provision data with stress (rf=3)
>  # In parallel:
>  ## Start stress read run
>  ## Run multiple manual read queries in cqlsh with tracing on and 
> local_quorum (as this does not always happen)



