Ariel Weisberg created CASSANDRA-13327:
------------------------------------------
Summary: Pending endpoints size check for CAS doesn't play nicely
with writes-on-replacement
Key: CASSANDRA-13327
URL: https://issues.apache.org/jira/browse/CASSANDRA-13327
Project: Cassandra
Issue Type: Bug
Components: Coordination
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg
Consider this ring:
127.0.0.1 MR UP JOINING -7301836195843364181
127.0.0.2 MR UP NORMAL -7263405479023135948
127.0.0.3 MR UP NORMAL -7205759403792793599
127.0.0.4 MR DOWN NORMAL -7148113328562451251
where 127.0.0.1 was bootstrapping for cluster expansion. Note that, due to the
failure of 127.0.0.4, 127.0.0.1 was stuck trying to stream from it and making
no progress.
Then the down node was replaced so we had:
127.0.0.1 MR UP JOINING -7301836195843364181
127.0.0.2 MR UP NORMAL -7263405479023135948
127.0.0.3 MR UP NORMAL -7205759403792793599
127.0.0.5 MR UP JOINING -7148113328562451251
The ring output is confusing: the first JOINING is a genuine bootstrap, the
second is a replacement. We now had CAS unavailables (but no non-CAS
unavailables). I think it’s because the pending endpoints check treats
127.0.0.5 as gaining a range when it’s just replacing 127.0.0.4.
The workaround is to kill the stuck JOINING node, but Cassandra shouldn’t
unnecessarily fail these requests.
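
To make the suspected failure mode concrete, here is a minimal sketch of a guard that rejects CAS whenever more than one endpoint is pending for a range. It is an illustration only: the class, method, and exception names are hypothetical and are not taken from the Cassandra code base, and the pending sets used with it are assumptions about what one affected range might look like.

```java
import java.util.List;

// Hypothetical, simplified model of a "pending endpoints" guard for CAS.
// None of these names come from the Cassandra code base.
final class PendingEndpointsCheckSketch {

    /** Stand-in for an unavailable error; an assumption made for this sketch. */
    static final class UnavailableSketchException extends RuntimeException {
        UnavailableSketchException(String message) {
            super(message);
        }
    }

    /**
     * Rejects the CAS operation when more than one endpoint is pending for the
     * range. The guard only counts pending endpoints; it does not look at why
     * each one is pending (bootstrap versus replacement).
     */
    static void checkPending(List<String> pendingEndpoints) {
        if (pendingEndpoints.size() > 1) {
            throw new UnavailableSketchException(
                "more than one (" + pendingEndpoints.size() + ") pending endpoint");
        }
    }

    /** Returns true if the guard lets the CAS operation proceed. */
    static boolean casAllowed(List<String> pendingEndpoints) {
        try {
            checkPending(pendingEndpoints);
            return true;
        } catch (UnavailableSketchException e) {
            return false;
        }
    }
}
```

Under these assumptions, a range whose pending set is [127.0.0.1, 127.0.0.5] is rejected, while the same range with only [127.0.0.5] pending is allowed. That matches the workaround above: killing the stuck JOINING node leaves a single pending endpoint.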
It also appears that the number of required participants is bumped by 1 during
a host replacement, so if the replacing host fails you will get unavailables
and timeouts.
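
The arithmetic behind that bump can be sketched as follows, assuming the required count is a simple majority of natural plus pending endpoints. That rule, the replication factor of 3, and the endpoint sets used with it are assumptions for illustration; the names are hypothetical and not Cassandra's.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a majority rule over natural + pending endpoints.
// The rule itself is an assumption; it is not copied from Cassandra.
final class RequiredParticipantsSketch {

    /** Majority of all participants, where pending endpoints count as participants. */
    static int requiredParticipants(int naturalCount, int pendingCount) {
        int participants = naturalCount + pendingCount;
        return participants / 2 + 1;
    }

    /** Returns true if enough participants are live to meet the required count. */
    static boolean quorumReachable(List<String> natural, List<String> pending, Set<String> live) {
        int required = requiredParticipants(natural.size(), pending.size());
        int liveParticipants = 0;
        for (String endpoint : natural) {
            if (live.contains(endpoint)) {
                liveParticipants++;
            }
        }
        for (String endpoint : pending) {
            if (live.contains(endpoint)) {
                liveParticipants++;
            }
        }
        return liveParticipants >= required;
    }
}
```

With natural endpoints 127.0.0.2, 127.0.0.3, and the down 127.0.0.4, and no pending endpoint, the required count is 3 / 2 + 1 = 2, so the two live replicas suffice. Adding 127.0.0.5 as pending raises it to 4 / 2 + 1 = 3, so the request succeeds only while 127.0.0.5 is up; if the replacing host fails, two live participants no longer meet the requirement.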