[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059773#comment-15059773 ]
Sylvain Lebresne commented on CASSANDRA-10726:
----------------------------------------------
The comment in the code is not terribly informative, but the original reason
for this is CASSANDRA-2494. I'll let you read up for precise context, but the
summary is that we wait for read repair to ensure "monotonic quorum reads",
i.e. that if you do 2 successive quorum reads, you're guaranteed the 2nd one
won't see something older than the 1st one, even if a failed quorum write
left the most up-to-date value on only a minority of replicas.
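
To make the scenario concrete, here is a toy Python sketch of it. It is not
Cassandra code: the Replica class, quorum_read and scenario are hypothetical
names, and "non-blocking" is modelled as a repair that simply has not landed
by the time of the second read. With 3 replicas, a failed quorum write leaves
"v2" on replica A only; the first read contacts {A, B}, the second {B, C}.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    value: str = "v1"
    timestamp: int = 1

    def write(self, value, timestamp):
        # Last-write-wins: only accept a strictly newer timestamp.
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp


def quorum_read(contacted, blocking_repair):
    """Return the newest value among the contacted replicas.

    With blocking_repair=True, stale contacted replicas are updated before
    the read returns. With False, the repair is skipped, standing in for an
    asynchronous repair that has not been applied yet.
    """
    newest = max(contacted, key=lambda r: r.timestamp)
    value, ts = newest.value, newest.timestamp
    if blocking_repair:
        for replica in contacted:
            replica.write(value, ts)
    return value


def scenario(blocking_repair):
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    # Failed quorum write: "v2" reaches only replica A (a minority).
    a.write("v2", 2)
    first = quorum_read([a, b], blocking_repair)   # quorum {A, B}
    second = quorum_read([b, c], blocking_repair)  # quorum {B, C}
    return first, second


print(scenario(blocking_repair=True))   # ('v2', 'v2')
print(scenario(blocking_repair=False))  # ('v2', 'v1')
```

In the blocking case B has been repaired before the first read returns, so
any later quorum overlaps a replica holding "v2". Without that wait, the
second read can go back in time to "v1".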
How important that guarantee is in practice is probably debatable, I'm sure
some would be totally fine without it, but we've been providing it (silently)
pretty much forever at this point so some users are likely relying on it (even
if without realizing it). More generally, I think we should always try to
lean towards providing more guarantees rather than fewer when we can, as it
yields a less surprising system. So, without pretending this is the best
guarantee since sliced bread, I'm not terribly enthusiastic about the idea of
dropping it.
Which doesn't mean I'm ignoring the problem you're raising. If a node properly
responds to reads but not to writes, then it can indeed be a problem for the
reads it is participating in, and that's not great. I'm just not sure dropping
our monotonic quorum read guarantee is the correct way to mitigate that
problem. Part of me feels like a node that is consistently dropping writes
shouldn't be serving reads as if everything was fine (a half-broken node is
often worse than a fully broken one), but I'm not saying I have a good solution
off the top of my head to ensure that without too many downsides.
I'd certainly welcome a broader range of opinions/ideas ([~jbellis],
[~iamaleksey] in particular?).
> Read repair inserts should not be blocking
> ------------------------------------------
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
> Issue Type: Improvement
> Components: Coordination
> Reporter: Richard Low
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert
> to update out of date replicas is blocking. This means, if it fails, the read
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or
> the mutation stage is backed up for some other reason), all reads to a
> replica set could fail. Further, replicas dropping writes get more out of
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not
> be blocking or we should return success for the read even if the write times
> out.