dcapwell commented on code in PR #3656:
URL: https://github.com/apache/cassandra/pull/3656#discussion_r1828425002
##########
src/java/org/apache/cassandra/service/accord/AccordService.java:
##########
@@ -1225,15 +1281,36 @@ public void tryMarkRemoved(Topology topology, Id target)
if (node.commandStores().count() == 0) return; // when starting up stores can be empty, so ignore
Ranges ranges = topology.rangesForNode(target);
if (ranges.isEmpty()) return;
- tryMarkRemoved(ranges, 0).begin(node().agent());
+ long startNanos = Clock.Global.nanoTime();
+ exclusiveSyncPointWithRetries(ranges, 0)
+ .flatMap(sp -> shardDurabilityWithRetries(sp, 0))
Review Comment:
> 7) when nodes leave the cluster we did not start durability sync (this
> isn’t working, but that's a different issue… durability sync requires ALL, which
> isn’t possible)
This is failing 100% of the time, but that's a bug in durability
scheduling: it requires `ALL`, which isn't possible because we *removed* a node.
It also means downed nodes block this logic. I spoke with Benedict and we're
deferring the durability fix for now; this change just enables the attempts so
that this logic works once that's fixed.
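
To illustrate why an `ALL` requirement can never be met once a node has been removed or is down, here is a minimal standalone sketch. Everything in it (`DurabilityRequirementSketch`, `Requirement`, `canSucceed`, the node names) is hypothetical and is not Accord's actual API; `QUORUM` appears only for contrast, not as the proposed fix.

```java
import java.util.Set;

public class DurabilityRequirementSketch {
    /** Hypothetical stand-in for an acknowledgement requirement; not an Accord type. */
    enum Requirement { ALL, QUORUM }

    /**
     * Returns true if the reachable subset of a shard's replicas is large
     * enough to satisfy the given requirement.
     */
    static boolean canSucceed(Set<String> replicas, Set<String> reachable, Requirement requirement) {
        // Count only replicas of this shard that can still respond.
        long acks = replicas.stream().filter(reachable::contains).count();
        switch (requirement) {
            case ALL:
                return acks == replicas.size();
            case QUORUM:
                return acks >= replicas.size() / 2 + 1;
            default:
                throw new AssertionError(requirement);
        }
    }

    public static void main(String[] args) {
        // Assumption: the shard is still defined over three replicas,
        // but n3 has been removed (or is down) and cannot respond.
        Set<String> replicas = Set.of("n1", "n2", "n3");
        Set<String> reachable = Set.of("n1", "n2");

        System.out.println("ALL:    " + canSucceed(replicas, reachable, Requirement.ALL));
        System.out.println("QUORUM: " + canSucceed(replicas, reachable, Requirement.QUORUM));
    }
}
```

With one replica unreachable, the `ALL` check stays false no matter how many times it is retried, which matches the 100% failure rate described above.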
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]