[ https://issues.apache.org/jira/browse/CASSANDRA-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17987020#comment-17987020 ]
Jeremiah Jordan commented on CASSANDRA-20743:
---------------------------------------------

I'm not sure that using only the remote replica times changes much. You would see the same degradation if one of the remote replicas was very slow 2% of the time. I think this is just one of the problems with trying to use a percentile to drive retries.

> Inflation for speculative retry 99% threshold if one replica is slow
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-20743
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20743
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> I have executed a set of LOCAL_QUORUM read tests against a 3-node Cassandra
> cluster (4.1.4) in which slow disk I/O is emulated on one of the nodes by
> adding a configured delay, with a configured probability, to SSTable
> disk-level reads. The purpose of these tests is to ensure that Cassandra
> latency does not degrade much if a single replica is unhealthy.
> During these tests I observed an interesting behaviour: drift/inflation of
> the speculative retry threshold value.
> We have a coordinator node, which is a replica as well. Let's assume a read
> delay of 100ms is injected with 2% probability on this node and the 2 other
> nodes are healthy. A usual read is executed on the local node plus one of
> the remote nodes.
> Because of the introduced delay, 2% of requests cross the speculative retry
> threshold and trigger a speculative retry to the second remote replica.
> The speculative retry threshold value is calculated as the 99th percentile
> of +coordinator latency+ by default. In these 2% of cases the coordinator
> latency is actually equal to the time spent waiting for the speculative
> retry plus the time to execute the request on a remote replica. This value
> is contributed back to the coordinator latency metric, which creates a
> degradation feedback loop: while the 2% delay for the local disk reads is
> in place, the speculative retry threshold grows in steps equal to the time
> to execute the request on a remote replica, degrading more and more.
> A possible workaround is to use the MIN(99p,Xms) speculative retry option
> introduced in CASSANDRA-14293, but it is environment-specific and may
> depend on the workload, so it can be hard to define the right value for X.
> I have found the same issue reported for ScyllaDB -
> [https://github.com/scylladb/scylladb/pull/8783]. To address it, they
> started to use replica read response times instead of the full coordinator
> read time when evaluating the speculative retry threshold.
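
The feedback loop described in the issue can be illustrated with a small, deterministic toy model. This is only a sketch under stated assumptions, not Cassandra code: the function names, the fixed 2 ms remote read time, the 1 ms healthy local read time, the "every 50th request is slow" pattern (standing in for the 2% probability), and the per-epoch recalculation of the threshold are all made up for illustration. Only the 100 ms delay, the 2% rate, the p99 threshold, and the MIN(99p,Xms) cap come from the description.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct (1..100), using integer math."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct * n / 100)
    return ordered[rank - 1]


def simulate(epochs, initial_threshold_ms=2, remote_ms=2, local_fast_ms=1,
             local_slow_ms=100, slow_every=50, requests_per_epoch=1000,
             cap_ms=None):
    """Toy model (hypothetical, not Cassandra internals).

    Returns the speculative retry threshold after each epoch, where the
    threshold is the p99 of the coordinator latencies seen in that epoch.
    """
    threshold = initial_threshold_ms
    history = [threshold]
    for _ in range(epochs):
        latencies = []
        for i in range(requests_per_epoch):
            # Assumption: every 50th local read is slow, i.e. exactly 2%.
            local = local_slow_ms if i % slow_every == 0 else local_fast_ms
            # Quorum of 2: the local read plus the first remote replica.
            normal_path = max(local, remote_ms)
            # If no quorum by `threshold`, a second remote replica is asked.
            speculative_path = threshold + remote_ms
            # The coordinator latency is whichever path completes first.
            latencies.append(min(normal_path, speculative_path))
        # The threshold is derived from coordinator latency, which already
        # includes the wait for the previous threshold: the feedback loop.
        threshold = percentile(latencies, 99)
        if cap_ms is not None:
            # Analogue of the MIN(99p,Xms) option mentioned in the issue.
            threshold = min(threshold, cap_ms)
        history.append(threshold)
    return history


if __name__ == "__main__":
    print(simulate(3))              # threshold after each of 3 epochs
    print(simulate(60)[-1])         # where the uncapped threshold ends up
    print(simulate(10, cap_ms=10))  # same run with a 10 ms cap
```

In this model `simulate(3)` returns `[2, 4, 6, 8]`: the threshold grows by one remote read time per epoch, and without a cap it keeps growing until it reaches the injected 100 ms delay. With `cap_ms=10` it stops at 10 ms, which mirrors the workaround, although the right cap value still has to be chosen per environment.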