[ 
https://issues.apache.org/jira/browse/CASSANDRA-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989238#comment-16989238
 ] 

Yifan Cai commented on CASSANDRA-15442:
---------------------------------------

[~bdeggleston], thanks! 
The patch is ready. 

||Code||PR||Unit test||JVM dtest||
|[Code|https://github.com/yifan-c/cassandra/tree/CASSANDRA-15442-read-repair-timeout-fix]|[PR|https://github.com/apache/cassandra/pull/391]|[Unit test|https://app.circleci.com/jobs/github/yifan-c/cassandra/113]|[JVM dtest|https://app.circleci.com/jobs/github/yifan-c/cassandra/112]|

Briefly, the patch does the following:
# In {{BlockingReadRepair}}, the repair process now waits based on the read timeout value.
# Adds {{awaitRepairsUntil}}, which accepts a future point in time as the timeout deadline.
# Adds a timeout test to the dtest.
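
To illustrate the idea of waiting until a future point in time rather than for a fresh relative timeout, here is a minimal sketch. It is not the code from the patch: the class {{DeadlineWait}} and the method {{awaitUntil}} are hypothetical names, and a plain {{CountDownLatch}} stands in for whatever the repair actually blocks on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not the patch's implementation.
final class DeadlineWait
{
    private DeadlineWait() {}

    // deadlineNanos is an absolute point on the System.nanoTime() clock.
    // Returns true if the latch reached zero before the deadline, false otherwise.
    static boolean awaitUntil(CountDownLatch latch, long deadlineNanos)
    {
        // Only the time still left before the deadline is waited for,
        // so earlier blocking steps reduce the budget of later ones.
        long remainingNanos = deadlineNanos - System.nanoTime();
        try
        {
            // A non-positive timeout makes await return immediately.
            return latch.await(remainingNanos, TimeUnit.NANOSECONDS);
        }
        catch (InterruptedException e)
        {
            // Preserve the interrupt status and report the wait as unsuccessful.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would compute the deadline once, when the read starts, and pass the same value to every blocking step, so the total wait cannot exceed the read timeout.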


> Read repair implicitly increases read timeout value
> ---------------------------------------------------
>
>                 Key: CASSANDRA-15442
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15442
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>
> When read repair occurs during a read, internally, it starts several 
> _blocking_ operations in sequence. See 
> {{org.apache.cassandra.service.StorageProxy#fetchRows}}. 
>  The timeline of the blocking operations:
>  # Regular read, wait for full data/digest read response to complete. 
> {{reads[*].awaitResponses();}}
>  # Read repair read, wait for full data read response to complete. 
> {{reads[*].awaitReadRepair();}}
>  # Read repair write, wait for write response to complete. 
> {{concatAndBlockOnRepair(results, repairs);}}
> Steps 1 and 2 share the same timeout and together wait for at most the read 
> timeout, say 5 s.
> Step 3 waits for up to the write timeout, say 2 s.
> In the worst case, the total time taken by a read can accumulate to ~7 s, 
> even though no individual step exceeds its own timeout value.
> Clients do not expect a request to take longer than the timeout value 
> configured in the database. 
> This scenario is especially bad for clients that have set up client-side 
> timeout monitoring close to the configured value. Such clients conclude that 
> the operation timed out and abort, but it is in fact still running on the server.
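
The arithmetic in the description can be written out as a small sketch. The class and method names below are hypothetical and exist only to illustrate the numbers; they are not part of Cassandra.

```java
// Illustrative arithmetic only; the class and method names are hypothetical.
final class ReadLatencyBound
{
    private ReadLatencyBound() {}

    // Steps 1 and 2 share the read timeout; step 3 then waits for up to
    // the write timeout on top of it.
    static long worstCaseWithSeparateTimeoutsMillis(long readTimeoutMillis, long writeTimeoutMillis)
    {
        return readTimeoutMillis + writeTimeoutMillis;
    }

    // Time left for step 3 if all three steps share one read-timeout deadline.
    static long repairWriteBudgetMillis(long readTimeoutMillis, long elapsedMillis)
    {
        return Math.max(0L, readTimeoutMillis - elapsedMillis);
    }
}
```

With the example values from the description (5 s read timeout, 2 s write timeout), the first method returns 7000 ms. Under a shared deadline, a read that has already spent 4500 ms in steps 1 and 2 would leave only 500 ms for the repair write.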


