[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059773#comment-15059773
 ] 

Sylvain Lebresne commented on CASSANDRA-10726:
----------------------------------------------

The comment in the code is not terribly informative, but the original reason 
for this is CASSANDRA-2494. I'll let you read up for precise context, but the 
summary is that we wait for read repair to ensure "monotonic quorum reads", 
i.e. that if you do 2 successive quorum reads, the 2nd one is guaranteed not 
to see something older than the 1st one, even if a failed quorum write left 
the most up-to-date value on only a minority of replicas.
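
To make that concrete, here is a toy model of the scenario, not Cassandra 
code. The names (Replica, quorum_read, two_successive_reads) are made up for 
illustration, and it assumes 3 replicas, last-write-wins timestamps, and that 
without blocking the repair write has not landed before the 2nd read.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    timestamp: int = 0
    value: str = ""

    def write(self, timestamp, value):
        # Last-write-wins: only a newer timestamp replaces the stored value.
        if timestamp > self.timestamp:
            self.timestamp, self.value = timestamp, value


def quorum_read(replicas, contacted, blocking_repair):
    """Return the newest value among the contacted replicas.

    With blocking_repair=True the newest value is written back to every
    contacted replica before the read returns (the behaviour discussed here).
    With blocking_repair=False this toy model assumes the repair has not
    been applied by the time the next read happens.
    """
    newest = max((replicas[i] for i in contacted), key=lambda r: r.timestamp)
    ts, val = newest.timestamp, newest.value
    if blocking_repair:
        for i in contacted:
            replicas[i].write(ts, val)
    return val


def two_successive_reads(blocking_repair):
    replicas = [Replica(1, "old"), Replica(1, "old"), Replica(1, "old")]
    # Failed quorum write: the new value reached a minority (1 of 3) only.
    replicas[0].write(2, "new")
    first = quorum_read(replicas, [0, 1], blocking_repair)   # sees replica 0
    second = quorum_read(replicas, [1, 2], blocking_repair)  # misses replica 0
    return first, second
```

In this model, two_successive_reads(True) returns "new" twice, while 
two_successive_reads(False) returns "new" and then "old", which is exactly 
the non-monotonic read the blocking wait is there to prevent.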

How important that guarantee is in practice is debatable. I'm sure some 
would be totally fine without it, but we've been providing it (silently) 
pretty much forever at this point, so some users are likely relying on it 
(even without realizing it). More generally, I think we should lean towards 
providing more guarantees rather than fewer when we can, as that yields a 
less surprising system. So, without pretending this is the best guarantee 
since sliced bread, I'm not terribly enthusiastic about the idea of dropping 
it.

Which doesn't mean I'm ignoring the problem you're raising. If a node properly 
responds to reads but not to writes, then it can indeed be a problem for the 
reads it is participating in, and that's not great. I'm just not sure dropping 
our monotonic quorum read guarantee is the correct way to mitigate that 
problem. Part of me feels that a node that is consistently dropping writes 
shouldn't be serving reads as if everything was fine (a half-broken node is 
often worse than a fully broken one), but I'm not saying I have a good solution 
off the top of my head to ensure that without too many downsides.
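
As a rough sketch of the trade-off (again a toy model with a hypothetical 
function name, not the actual coordinator logic): a read that hit a digest 
mismatch either waits for every repair write to be acknowledged, or it 
doesn't.

```python
def read_outcome(repair_write_acked, wait_for_repair):
    """Outcome of a read that hit a digest mismatch, in this toy model.

    repair_write_acked: one bool per stale replica, True if that replica
        acknowledged the repair write before the timeout.
    wait_for_repair: True models blocking read repair.
    """
    if wait_for_repair and not all(repair_write_acked):
        # A single replica dropping writes is enough to fail the read.
        return "read timeout"
    return "read ok"
```

With blocking repair, read_outcome([True, False], True) gives "read timeout" 
even though every replica answered the read itself. Without the wait, the 
same call gives "read ok", at the cost of the monotonic guarantee above.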

I'd certainly welcome a broader range of opinions/ideas ([~jbellis], 
[~iamaleksey] in particular?).

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
