[ https://issues.apache.org/jira/browse/CASSANDRA-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17987030#comment-17987030 ]
Dmitry Konstantinov commented on CASSANDRA-20743:
-------------------------------------------------

I see what you mean: yes, there's no difference whether it's local replica A or remote replica B that's slow; the issue will still occur. Regarding the dynamic snitch: yes, it's actually the reason why only 2% of requests are slow for a given replica in my example, rather than, say, 60%. This is because the dynamic snitch uses the 50th percentile (median) latency to score replicas, so higher-percentile latency issues aren't visible to it.

> Inflation for speculative retry 99% threshold if one replica is slow
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-20743
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20743
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> I have executed a set of LOCAL_QUORUM read tests with a 3-node Cassandra
> cluster (4.1.4), where a slow disk IO read is emulated on one of the nodes
> using a configured delay added to SSTable disk-level reads with a configured
> probability. The purpose of these tests is to ensure that Cassandra does not
> degrade significantly, latency-wise, when a single replica is unhealthy.
> During these tests I observed an interesting behaviour: drift/inflation of
> the speculative retry threshold value.
>
> We have a coordinator node, which is a replica as well. Let's assume we have
> an injected read delay of 100ms with 2% probability within this node, and the
> 2 other nodes are healthy. A usual read is served by the local node plus one
> of the remote nodes.
>
> Because of the introduced delay, 2% of requests cross the speculative retry
> threshold and trigger a speculative retry to the second remote replica.
> The speculative retry threshold is calculated as the +coordinator latency+
> 99th percentile by default. In these 2% of cases the coordinator latency is
> actually equal to the time waited until the speculative retry fires plus the
> time to execute the request to a remote replica, so we contribute this value
> back into our coordinator latency metric and thereby create a degradation
> feedback loop: while the 2% delay for local disk reads is in place, the
> speculative retry threshold keeps growing in steps equal to the time to
> execute the request to a remote replica, degrading more and more.
>
> A possible workaround is the MIN(99p,Xms) speculative retry option introduced
> in CASSANDRA-14293, but it is environment-specific and may depend on the
> workload, so it may not be easy to define the right value for X.
>
> I have found the same issue reported for ScyllaDB -
> [https://github.com/scylladb/scylladb/pull/8783] ; to address it, they
> started to use replica read response times instead of the full coordinator
> read time when evaluating the speculative retry threshold.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
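The feedback loop from the quoted description can be sketched with a small, deterministic simulation. This is illustrative only: the latency numbers, window size, and the exactly-2%-slow pattern are assumptions, not measurements from the ticket, and the nearest-rank percentile stands in for Cassandra's actual latency histogram.

```python
# Sketch of the speculative-retry feedback loop described in CASSANDRA-20743.
# All numbers below are illustrative assumptions.

def percentile(values, p):
    """Nearest-rank percentile of a list of latencies."""
    s = sorted(values)
    return s[min(len(s) - 1, int(p / 100.0 * len(s)))]

FAST_MS = 5      # assumed latency of a healthy read
DELAY_MS = 100   # injected slow-read delay on the local replica
REMOTE_MS = 5    # assumed cost of the speculative read to a healthy remote

# Seed the coordinator latency histogram with healthy reads only.
threshold = percentile([FAST_MS] * 1000, 99)

for step in range(5):
    window = []
    for i in range(1000):
        if i % 50 == 0:  # exactly 2% of reads hit the injected delay
            # The coordinator waits until the threshold fires, then pays the
            # remote read on top -- unless the delayed local read wins first.
            window.append(min(DELAY_MS, threshold + REMOTE_MS))
        else:
            window.append(FAST_MS)
    # The threshold is recomputed from *coordinator* latency, which already
    # includes the previous threshold: this is the feedback loop.
    threshold = percentile(window, 99)
    print(f"step {step}: p99 threshold = {threshold} ms")
```

Because the p99 always lands inside the 2% slow tail, each window bumps the threshold by REMOTE_MS (10, 15, 20, 25, 30 ms across the five steps here), and it would keep climbing until it saturates near DELAY_MS. Scoring from replica read response times instead, as in the ScyllaDB fix referenced above, breaks the loop: the slow replica's reads are measured directly and the threshold no longer contains itself.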