[
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540173#comment-13540173
]
Jonathan Ellis commented on CASSANDRA-4705:
-------------------------------------------
1. Can you pull the AbstractReadExecutor refactor into a separate commit? That
is, just the introduction of ARE and DRE, then SRE would be added with the rest
of the changes here in the "main" commit.
2. This looks problematic to me, since if the first callback is high-latency we
won't send out extra requests for low-latency callbacks promptly.
{code}
for (AbstractReadExecutor exec : readCallbacks)
    exec.speculate();
{code}
Would just sorting by expected latency be enough to fix this?
3. We should split latency tracking into, at least, single-row reads vs.
index/sequential scans. Can we go a step further and track by PreparedStatement?
(Thrift single-row/scan ops would still have to be lumped into one bucket each.)
This can be pushed into a separate ticket.
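To illustrate the sorting idea in point 2, here is a minimal sketch. It assumes that speculate() blocks until its executor's latency threshold passes, so a slow first entry delays the rest. Exec, expectedLatencyMicros and sortByExpectedLatency are hypothetical names for illustration, not part of the patch or of Cassandra.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SpeculationOrderSketch {
    // Hypothetical stand-in for AbstractReadExecutor; not the real Cassandra class.
    static final class Exec {
        final String name;
        // Assumed to be supplied by whatever latency tracking is in place.
        final long expectedLatencyMicros;

        Exec(String name, long expectedLatencyMicros) {
            this.name = name;
            this.expectedLatencyMicros = expectedLatencyMicros;
        }
    }

    // Returns a copy ordered so the executor expected to answer soonest comes first.
    // The input list is left untouched.
    static List<Exec> sortByExpectedLatency(List<Exec> execs) {
        List<Exec> sorted = new ArrayList<>(execs);
        sorted.sort(Comparator.comparingLong((Exec e) -> e.expectedLatencyMicros));
        return sorted;
    }

    public static void main(String[] args) {
        List<Exec> execs = new ArrayList<>();
        execs.add(new Exec("slow", 9000));
        execs.add(new Exec("fast", 300));
        execs.add(new Exec("medium", 2000));

        // Iterate in ascending expected latency, which is where speculate() would be called.
        for (Exec e : sortByExpectedLatency(execs))
            System.out.println(e.name + " " + e.expectedLatencyMicros);
    }
}
```

With this ordering, the wait for a low-latency executor is never queued behind the wait for a high-latency one. Whether that is enough depends on how the thresholds are measured.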
Nits:
- ReadCallback.get only appears to be called with command.timeout, so pulling
that out into a parameter looks like premature generalization
- Making the extra call to isSignaled in RC.get is probably a pessimization
since it is also synchronized
- missing @Override annotations for ARE subclasses
> Speculative execution for Reads
> -------------------------------
>
> Key: CASSANDRA-4705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4705
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Vijay
> Assignee: Vijay
> Fix For: 2.0
>
> Attachments: 0001-CASSANDRA-4705.patch, 0001-CASSANDRA-4705-v2.patch,
> 0001-CASSANDRA-4705-v3.patch
>
>
> When read_repair is not 1.0, we send the request to one node for some of the
> requests. When a node goes down or when a node is too busy the client has to
> wait for the timeout before it can retry.
> It would be nice to watch for latency and execute an additional request to a
> different node, if the response is not received within average/99% of the
> response times recorded in the past.
> CASSANDRA-2540 might be able to solve the variance when read_repair is set to
> 1.0
> 1) Maybe we need to use metrics-core to record various percentiles
> 2) Modify ReadCallback.get to execute additional request speculatively.
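As a rough illustration of the threshold idea in the description above, the sketch below computes a nearest-rank percentile over recorded response times and checks whether a pending request has outlived it. It is a sketch under stated assumptions, not the patch's implementation: the class and method names are hypothetical, and a plain array stands in for whatever metrics-core would record.

```java
import java.util.Arrays;

class SpeculationThresholdSketch {
    // Nearest-rank percentile over recorded latencies (hypothetical helper).
    // percent must be in 1..100; integer arithmetic avoids floating-point rounding.
    static long percentile(long[] samplesMicros, int percent) {
        if (samplesMicros.length == 0)
            throw new IllegalArgumentException("no samples recorded");
        if (percent < 1 || percent > 100)
            throw new IllegalArgumentException("percent must be in 1..100");
        long[] sorted = samplesMicros.clone();
        Arrays.sort(sorted);
        // rank = ceil(percent * n / 100), always at least 1 for the inputs allowed above
        int rank = (int) (((long) percent * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }

    // True once the request has been pending longer than the threshold,
    // which is the point where an extra request to another node would be sent.
    static boolean shouldSpeculate(long elapsedMicros, long thresholdMicros) {
        return elapsedMicros > thresholdMicros;
    }

    public static void main(String[] args) {
        long[] recorded = {500, 10, 40, 20, 30};
        long p99 = percentile(recorded, 99);
        System.out.println("threshold=" + p99
                + " speculate=" + shouldSpeculate(750, p99));
    }
}
```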