[ https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032696#comment-13032696 ]
Jonathan Ellis commented on CASSANDRA-2034:
-------------------------------------------
bq. there is the potential to take us back to the Bad Old Days when HH could
cause cascading failure
To elaborate, the scenario here is that we did a write that succeeded on some
nodes but not others, so we need to write a local hint to replay to the
down-or-slow nodes later. But those nodes being down-or-slow means load has
increased on the rest of the cluster, and writing the extra hint will increase
it further, possibly enough that other nodes will see this coordinator as
down-or-slow too, and so on.
So I think what we want to do, with this option on, is to attempt the hint
write, but if we can't do it in a reasonable time, throw back a
TimedOutException, which is already our signal that "your cluster may be
overloaded, you need to back off."
Specifically, we could add a separate executor here, with a blocking, capped
queue. When we go to do a hint-after-failure, we enqueue the write, but if it is
rejected because the queue is full, we throw the TOE. Otherwise, we wait for the
write and then return success to the client.
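
The executor idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Cassandra code: BoundedHintWriter, writeHint, and the pool and queue sizes are hypothetical names and values, and java.util.concurrent.TimeoutException stands in for the Thrift TimedOutException.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: names are illustrative and do not exist in Cassandra.
final class BoundedHintWriter {
    private final ThreadPoolExecutor executor;

    BoundedHintWriter(int threads, int queueCapacity) {
        // A capped queue plus AbortPolicy makes submit() throw
        // RejectedExecutionException once all threads are busy and the queue is full.
        this.executor = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Enqueue the hint write and wait for it to finish. A full queue or a slow
    // write is reported as TimeoutException (standing in for the TOE).
    void writeHint(Runnable hintWrite, long waitMillis) throws TimeoutException {
        Future<?> pending;
        try {
            pending = executor.submit(hintWrite);
        } catch (RejectedExecutionException e) {
            throw new TimeoutException("hint queue full");
        }
        try {
            // Throws TimeoutException itself if the write does not finish in time.
            pending.get(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new TimeoutException("interrupted while waiting for hint write");
        } catch (ExecutionException e) {
            throw new RuntimeException("hint write failed", e.getCause());
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

A caller on the write path would invoke writeHint only for targets that missed their acks, and translate the TimeoutException into whatever overload signal the client already understands.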
The tricky part is that the queue needs to be large enough to handle load
spikes but small enough that wait-for-success-post-enqueue is negligible
compared to RpcTimeout. If we had different timeouts for writes than reads
(which we don't -- CASSANDRA-959) then it might be nice to use, say, 80% of the
timeout for the normal write, and reserve 20% for the hint phase.
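
As a rough illustration of the 80/20 split, the arithmetic could look like the sketch below. WriteTimeoutBudget is a hypothetical helper, and the 10000 ms figure is only an example value, not a claim about any configured RpcTimeout.

```java
// Hypothetical helper: splits one timeout into a write phase and a hint phase.
final class WriteTimeoutBudget {
    final long writeMillis;
    final long hintMillis;

    WriteTimeoutBudget(long totalTimeoutMillis, int writePercent) {
        if (totalTimeoutMillis < 0 || writePercent < 0 || writePercent > 100) {
            throw new IllegalArgumentException("timeout must be >= 0 and percent in [0, 100]");
        }
        // Integer arithmetic; the hint phase gets whatever the write phase leaves over,
        // so the two parts always add up to the total.
        this.writeMillis = totalTimeoutMillis * writePercent / 100;
        this.hintMillis = totalTimeoutMillis - writeMillis;
    }
}
```

With a total of 10000 ms and writePercent = 80, the normal write gets 8000 ms and the hint phase gets the remaining 2000 ms.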
> Make Read Repair unnecessary when Hinted Handoff is enabled
> -----------------------------------------------------------
>
> Key: CASSANDRA-2034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Fix For: 1.0
>
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling
> HH means RR/AES will have less work to do, but you can't disable RR entirely
> in most situations since HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the
> client normally after ConsistencyLevel is achieved, but after RpcTimeout we
> check the responseHandler write acks and write local hints for any missing
> targets.
> This would make disabling RR when HH is enabled a much more reasonable
> option, which has a huge impact on read throughput.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira