[jira] [Comment Edited] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

Stefan Miklosovic (Jira) Fri, 24 Feb 2023 03:56:12 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693173#comment-17693173
 ]


Stefan Miklosovic edited comment on CASSANDRA-14715 at 2/24/23 11:55 AM:
-------------------------------------------------------------------------

[~jaid] is this happening in 4.0 too? I am trying to figure that out. The logic 
around this stuff was rewritten there.

[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/repair/BlockingReadRepair.java#L82-L106]

Maybe [~benedict]  would know this as he was the last one touched that code in 
CASSANDRA-14740.


was (Author: smiklosovic):
[~jaid] is this happening in 4.0 too? I am trying to figure that out. The logic 
around this stuff was rewritten there.

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/repair/BlockingReadRepair.java#L82-L106

> Read repairs can result in bogus timeout errors to the client
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-14715
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14715
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Local Write-Read Paths
>            Reporter: Cameron Zemek
>            Priority: Low
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> In RepairMergeListener:close() it does the following:
>  
> {code:java}
> try
> {
>     FBUtilities.waitOnFutures(repairResults, 
> DatabaseDescriptor.getWriteRpcTimeout());
> }
> catch (TimeoutException ex)
> {
>     // We got all responses, but timed out while repairing
>     int blockFor = consistency.blockFor(keyspace);
>     if (Tracing.isTracing())
>         Tracing.trace("Timed out while read-repairing after receiving all {} 
> data and digest responses", blockFor);
>     else
>         logger.debug("Timeout while read-repairing after receiving all {} 
> data and digest responses", blockFor);
>     throw new ReadTimeoutException(consistency, blockFor-1, blockFor, true);
> }
> {code}
> This propagates up and gets sent to the client and we have customers get 
> confused cause they see timeouts for CL ALL requiring ALL replicas even 
> though they have read_repair_chance = 0 and using a LOCAL_* CL.
> At minimum I suggest instead of using the consistency level of DataResolver 
> (which is always ALL with read repairs) for the timeout it instead use 
> repairResults.size(). That is blockFor = repairResults.size() . But saying it 
> received _blockFor - 1_ is bogus still. Fixing that would require more 
> changes. I was thinking maybe like so:
>  
> {code:java}
> public static void waitOnFutures(List<AsyncOneResponse> results, long ms, 
> MutableInt counter) throws TimeoutException
> {
>     for (AsyncOneResponse result : results)
>     {
>         result.get(ms, TimeUnit.MILLISECONDS);
>         counter.increment();
>     }
> }
> {code}
>  
>  
>  
> Likewise in SinglePartitionReadLifecycle:maybeAwaitFullDataRead() it says 
> _blockFor - 1_ for how many were received, which is also bogus.
>  
> Steps used to reproduce was modify RepairMergeListener:close() to always 
> throw timeout exception.  With schema:
> {noformat}
> CREATE KEYSPACE weather WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}  AND durable_writes = true;
> CREATE TABLE weather.city (
> cityid int PRIMARY KEY,
> name text
> ) WITH bloom_filter_fp_chance = 0.01
> AND dclocal_read_repair_chance = 0.0
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> {noformat}
> Then using the following steps:
>  # ccm node1 cqlsh
>  # INSERT INTO weather.city(cityid, name) VALUES (1, 'Canberra');
>  # exit;
>  # ccm node1 flush
>  # ccm node1 stop
>  # rm -rf 
> ~/.ccm/test_repair/node1/data0/weather/city-ff2fade0b18d11e8b1cd097acbab1e3d/mc-1-big-*
>  # remove the sstable with the insert
>  # ccm node1 start
>  # ccm node1 cqlsh
>  # CONSISTENCY LOCAL_QUORUM;
>  # select * from weather.city where cityid = 1;
> You get result of:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 5 
> responses." info={'received_responses': 5, 'required_responses': 6, 
> 'consistency': 'ALL'}{noformat}
> But was expecting:
> {noformat}
> ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting 
> for replica nodes' responses] message="Operation timed out - received only 1 
> responses." info={'received_responses': 1, 'required_responses': 2, 
> 'consistency': 'LOCAL_QUORUM'}{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (CASSANDRA-14715) Read repairs can result in bogus timeout errors to the client

Reply via email to