[
https://issues.apache.org/jira/browse/CASSANDRA-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988141#comment-16988141
]
Yifan Cai commented on CASSANDRA-15442:
---------------------------------------
Read repair needs to be blocking to guarantee monotonic reads, per
CASSANDRA-2494.
According to the discussion in CASSANDRA-14635, making the repair write
asynchronous is not a use case under consideration.
----
h4. Proposed Fix
To respect the timeout that the client expects, the blocking operation and the
internode messages in each step should use only the *remaining* timeout.
The repair write is still part of the read, so it should share the same
timeout.
Step 2 in the timeline already adjusts the internode request timeout to the
remaining time.
The proposed fix is for step 3 to use the remaining timeout as well, instead of
a separate {{WriteRPCTimeout}}.
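A minimal sketch of the "remaining timeout" idea, for illustration only. The
class {{RequestDeadline}} and its methods are hypothetical names, not existing
Cassandra code. It assumes the request start time is captured once with
{{System.nanoTime()}} and that every blocking step asks for the time left
before it waits.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: one deadline shared by all blocking steps of a read. */
final class RequestDeadline {
    private final long deadlineNanos;

    /** startNanos is when the request began, as returned by System.nanoTime(). */
    RequestDeadline(long startNanos, long timeout, TimeUnit unit) {
        this.deadlineNanos = startNanos + unit.toNanos(timeout);
    }

    /** Time left at nowNanos, clamped to zero once the deadline has passed. */
    long remainingNanos(long nowNanos) {
        // Subtracting first keeps the comparison correct if nanoTime wraps around.
        return Math.max(0L, deadlineNanos - nowNanos);
    }

    /** Same as remainingNanos, converted to whole milliseconds. */
    long remainingMillis(long nowNanos) {
        return TimeUnit.NANOSECONDS.toMillis(remainingNanos(nowNanos));
    }
}
```

With a 5 s read timeout, a step that starts 3 s into the request would wait at
most {{remainingMillis(...)}} = 2000 ms, and a step that starts after the
deadline would get 0 and should fail immediately. Under this sketch the repair
write in step 3 would draw from the same budget rather than a fresh write
timeout.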
h4. The Impact
- Read timeouts (due to blocking read repair) may occur more frequently with
the existing {{ReadRPCTimeout}} value. The read timeout may need to be
configured higher to allow the blocking read repair to complete. In effect, the
higher value only reflects the time a read already takes today. (The time
spent on the repair write is simply not counted against the read as of now.)
- Increasing the read timeout also lets genuinely slow read queries (those
without read repair) run longer, which negatively impacts throughput.
> Read repair implicitly increases read timeout value
> ---------------------------------------------------
>
> Key: CASSANDRA-15442
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15442
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Core
> Reporter: Yifan Cai
> Assignee: Yifan Cai
> Priority: Normal
>
> When read repair occurs during a read, internally, it starts several
> _blocking_ operations in sequence. See
> {{org.apache.cassandra.service.StorageProxy#fetchRows}}.
> The timeline of the blocking operations
> # Regular read, wait for full data/digest read response to complete.
> {{reads[*].awaitResponses();}}
> # Read repair read, wait for full data read response to complete.
> {{reads[*].awaitReadRepair();}}
> # Read repair write, wait for write response to complete.
> {{concatAndBlockOnRepair(results, repairs);}}
> Steps 1 and 2 each wait for up to the read timeout, say 5 s.
> Step 3 waits for up to the write timeout, say 2 s.
> In the worst case, the actual time taken for a read could accumulate to ~12
> s (5 + 5 + 2), if each individual step finishes just within its own timeout.
> From the client's perspective, a request is not expected to take much longer
> than the timeout configured on the database.
> This scenario is especially bad for clients that set their client-side
> timeout close to the configured one. Such clients conclude that the
> operation timed out and abort, but it is in fact still running on the server.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)