[ 
https://issues.apache.org/jira/browse/CASSANDRA-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992084#comment-16992084
 ] 

Yifan Cai commented on CASSANDRA-15442:
---------------------------------------

Thank you for the review, [~bdeggleston].

 
{quote} I'd expect repair writes to timeout on the write timeout, or whatever 
is remaining on the read timeout. Whichever is less.
{quote}
I considered that.

Since the mutation is part of the read, it should take whatever the remaining 
timeout is. If the mutation timeout were adjusted to {{Min(WriteTimeout, 
RemainingTimeout)}}, a client could see a request error out before reaching 
the read timeout, which is unexpected. Using the remaining timeout does mean 
the server could allow some slow mutations to stay longer than the globally 
configured write timeout, which may impact throughput. But from the client's 
perspective, if the read is permitted this amount of time, the server should 
allow it.
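
To make the trade-off concrete, here is a minimal sketch of the two policies being discussed. The class and method names are hypothetical and are not taken from the Cassandra code base or the PR; the values are in milliseconds and reuse the 5 s read / 2 s write timeouts from the issue description.

```java
/** Hypothetical illustration only; not Cassandra code. All values are in milliseconds. */
final class RepairWriteTimeout {
    private RepairWriteTimeout() {}

    /** Time left before the read deadline, never negative. */
    static long remaining(long readTimeout, long elapsedSinceReadStart) {
        return Math.max(0L, readTimeout - elapsedSinceReadStart);
    }

    /** Policy argued for in this comment: the repair write gets whatever is left of the read timeout. */
    static long remainingPolicy(long readTimeout, long elapsedSinceReadStart) {
        return remaining(readTimeout, elapsedSinceReadStart);
    }

    /** Policy suggested in review: the lesser of the write timeout and the remaining read timeout. */
    static long minPolicy(long readTimeout, long writeTimeout, long elapsedSinceReadStart) {
        return Math.min(writeTimeout, remaining(readTimeout, elapsedSinceReadStart));
    }
}
```

With a 5000 ms read timeout, a 2000 ms write timeout, and 1000 ms already spent on the read steps, the remaining-timeout policy lets the repair write run for up to 4000 ms (longer than the configured write timeout), while the min policy caps it at 2000 ms, so the client could see an error about 3000 ms into a read that is nominally allowed 5000 ms.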

I am all ears if there is a case for the server stopping a read request early. 

Updated the PR to address the comments. Since I did not change the write 
timeout to {{Min(WriteTimeout, RemainingTimeout)}}, I did not add the "setting 
the write timeout to a value 2-3x above the read timeout."

> Read repair implicitly increases read timeout value
> ---------------------------------------------------
>
>                 Key: CASSANDRA-15442
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15442
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>
> When read repair occurs during a read, internally, it starts several 
> _blocking_ operations in sequence. See 
> {{org.apache.cassandra.service.StorageProxy#fetchRows}}. 
>  The timeline of the blocking operations:
>  # Regular read, wait for full data/digest read response to complete. 
> {{reads[*].awaitResponses();}}
>  # Read repair read, wait for full data read response to complete. 
> {{reads[*].awaitReadRepair();}}
>  # Read repair write, wait for write response to complete. 
> {{concatAndBlockOnRepair(results, repairs);}}
> Steps 1 and 2 share the same timeout and wait for the duration of the read 
> timeout, say 5 s.
> Step 3 waits for the duration of the write timeout, say 2 s.
> In the worst case, the actual time taken for a read could accumulate to ~7 s, 
> even though no individual step exceeds its timeout value.
> A client would not expect a request to take longer than the timeout value 
> configured in the database. 
> Such a scenario is especially bad for clients that have set up client-side 
> timeout monitoring close to the configured value. The clients think the 
> operations timed out and abort, but they are in fact still running on the 
> server.
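
The worst case in the quoted description can be worked through with a small sketch. This is an illustration under stated assumptions, not the {{StorageProxy}} logic: the class and method names are hypothetical, the two read steps are lumped into one duration because they share a timeout, and -1 is used here only as a marker for "timed out".

```java
/** Hypothetical illustration only; not Cassandra code. All values are in milliseconds. */
final class ReadRepairTimeline {
    private ReadRepairTimeline() {}

    /**
     * Steps 1 and 2 share the read timeout; step 3 gets its own write timeout.
     * Returns the total time the request takes, or -1 if any step times out.
     */
    static long totalWithSeparateTimeouts(long readSteps, long repairWrite,
                                          long readTimeout, long writeTimeout) {
        if (readSteps > readTimeout || repairWrite > writeTimeout) {
            return -1;
        }
        return readSteps + repairWrite;
    }

    /**
     * All three steps share a single read deadline.
     * Returns the total time the request takes, or -1 if the deadline is exceeded.
     */
    static long totalWithSharedDeadline(long readSteps, long repairWrite, long readTimeout) {
        long total = readSteps + repairWrite;
        return total > readTimeout ? -1 : total;
    }
}
```

With separate timeouts, read steps taking 5000 ms followed by a 2000 ms repair write succeed after 7000 ms, which is the ~7 s case above. With a shared 5000 ms deadline, the same request times out instead, and a request whose steps take 3000 ms and 1500 ms completes in 4500 ms.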



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

