[ https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462450#comment-13462450 ]

Peter Schuller commented on CASSANDRA-4705:
-------------------------------------------

99% based on what time period? If the period is too short, you won't get the 
full impact, since you'll pollute the track record. If it's too long, consider 
the traffic increase resulting from a prolonged hiccup. Will you be able to 
hide typical GC pauses? Then you'd better have the window be higher than 250 
ms. What about full GCs? How do you determine what the p99 is, given that a 
node is shared by multiple replica sets? If a single node goes into full GC, 
how do you keep latency unaffected while still capping the number of backup 
requests at a reasonable number? If you don't cap it, the optimization is more 
dangerous than useful, since it just means you'll fall over in various 
hard-to-predict emergent situations if you counted on the reduced read load 
when provisioning your cluster. What's an appropriate cap? How do you scale it 
with RF and consistency level? How do you explain this to the person who has 
to figure out how much capacity a cluster needs?
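
To make the cap question concrete, here is a rough sketch (all names 
hypothetical, not actual Cassandra code) of the kind of capped backup-request 
policy these questions apply to; every question above is a knob in something 
like this:

import java.util.concurrent.Semaphore;

/**
 * Hypothetical sketch of a capped backup-request policy. The open
 * questions above map directly onto its parameters: how wide a window
 * feeds the p99 estimate, and how large maxInFlight can safely be.
 */
public class BackupRequestPolicy
{
    private final Semaphore inFlight;      // hard cap on concurrent backup requests
    private final LatencyTracker tracker;  // supplies the p99 estimate (window size TBD)

    public BackupRequestPolicy(int maxInFlight, LatencyTracker tracker)
    {
        this.inFlight = new Semaphore(maxInFlight);
        this.tracker = tracker;
    }

    // Returns true if a backup request may be issued for this read.
    public boolean tryBeginBackup(long elapsedMicros)
    {
        // Only speculate once the primary has exceeded the tracked p99.
        // During a full GC on a shared node, many reads cross this line
        // at once - which is exactly when the cap has to hold.
        if (elapsedMicros < tracker.p99Micros())
            return false;
        return inFlight.tryAcquire();
    }

    // Must be called when the backup request completes or is abandoned.
    public void endBackup()
    {
        inFlight.release();
    }

    public interface LatencyTracker
    {
        long p99Micros();
    }
}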

In our case, we pretty much run all our clusters with RR turned fully up - not 
necessarily for RR purposes, but for the purpose of more deterministic 
behavior. You don't want things falling over when a replica goes down. If you 
don't have the iops/CPU to handle all replicas processing all requests for a 
replica set, you're at risk of falling over (i.e., you don't scale, because 
failures are common in large clusters) - unless you over-provision, but then 
you might as well do full data reads to begin with.
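
To put rough numbers on it: with RF=3 and RR at 1.0, every read already hits 
all three replicas, so losing a node doesn't change the read load on the 
survivors at all. With RR at 0 and CL_ONE, each replica sees roughly a third 
of the reads for its range (assuming the snitch spreads them evenly); lose one 
node and the other two jump from 1/3 to 1/2 each - a 50% step increase in read 
load, at the worst possible moment.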

I am not arguing against the idea of backup requests, but I *strongly* 
recommend simply going for the trivial and obvious route of full data reads 
*first* and getting the obvious pay-off with no increase in complexity (I would 
even argue it's a *decrease* in complexity in terms of the behavior of the 
system as a whole, especially from the perspective of a human understanding 
emergent cluster behavior) - and then slowly develop something like this, with 
very careful thought to all the edge cases and implications of it.

I'm in favor of long-term *predictable* performance. Full data reads are a 
very, very easy way to achieve that, and in many cases give vastly better 
latency (the bandwidth saturation case pretty much being the major exception; 
CPU savings aren't really relevant with Cassandra's model if you expect to 
survive nodes being down). It's also very easy for a human looking at graphs 
of system behavior during some event to understand what is going on, predict 
what will happen, or explain what did happen.

I really think the drawbacks of full data reads are being massively 
overestimated, and the implications of not doing them massively 
underestimated.

                
> Speculative execution for CL_ONE
> --------------------------------
>
>                 Key: CASSANDRA-4705
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4705
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>
> When read_repair is not 1.0, we send the request to one node for some of the 
> requests. When a node goes down or when a node is too busy the client has to 
> wait for the timeout before it can retry. 
> It would be nice to watch for latency and execute an additional request to a 
> different node, if the response is not received within average/99% of the 
> response times recorded in the past.
> CASSANDRA-2540 might be able to solve the variance when read_repair is set to 
> 1.0.
> 1) Maybe we need to use metrics-core to record various percentiles.
> 2) Modify ReadCallback.get to execute additional request speculatively.
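
For what the ticket sketches above, a minimal illustration (assuming the 
metrics-core Histogram/Snapshot API; SpeculativeRead, Replica and the dispatch 
plumbing are made-up names, not the actual ReadCallback code) could look like:

import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Histogram;

// Minimal illustration of the proposal: record read latencies in a
// metrics-core histogram, wait for the first replica only up to the
// observed p99, then fire one extra request at a different replica.
public class SpeculativeRead
{
    private final Histogram latencyMicros =
            Metrics.newHistogram(SpeculativeRead.class, "read-latency-micros");

    // Hypothetical stand-in for a node we can send a read to.
    public interface Replica
    {
        Future<byte[]> send();
    }

    public byte[] read(Replica primary, Replica backup) throws Exception
    {
        long p99 = (long) latencyMicros.getSnapshot().get99thPercentile();
        long start = System.nanoTime();
        Future<byte[]> first = primary.send();
        byte[] result;
        try
        {
            // Wait no longer than the recorded p99 before speculating.
            result = first.get(Math.max(p99, 1), TimeUnit.MICROSECONDS);
        }
        catch (TimeoutException slow)
        {
            // p99 exceeded: send one backup request. A real implementation
            // would race the two futures and take whichever answers first.
            result = backup.send().get();
        }
        latencyMicros.update((System.nanoTime() - start) / 1000);
        return result;
    }
}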
