dcapwell commented on code in PR #3416:
URL: https://github.com/apache/cassandra/pull/3416#discussion_r1777753226


##########
src/java/org/apache/cassandra/service/accord/AccordService.java:
##########
@@ -437,12 +557,14 @@ public IVerbHandler<? extends Request> verbHandler()
             if (cause instanceof Timeout)
             {
                 TxnId txnId = ((Timeout) cause).txnId();
+                ((AccordAgent) node.agent()).onFailedBarrier(txnId, keysOrRanges, cause);

Review Comment:
   We kept having issues with barriers, and the root cause was the following:
   
   1) Cassandra does not do durability scheduling (last I heard it's still 
broken).
   2) Sync points don't use the same epoch discovery logic as normal txns; they 
rely on `ExclusiveSyncPoint`s, so without those running we end up including 
every single epoch when a barrier runs. When host replacements happen, older 
epochs are no longer safe to touch (shards can't reach quorum).
   
   This has been improved in this patch, but these hooks are here so that tests 
can provide a useful error message.
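
A minimal sketch of how a test-side hook like this could surface a useful error message. Every name here (`BarrierFailureHook`, `RecordingBarrierHook`, `BarrierHookSketch`) and the `String` parameter types are hypothetical stand-ins chosen for illustration; this is not the actual `AccordAgent` API from the patch, and the simulated timeout only assumes the failure path described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the agent hook; NOT the real AccordAgent interface.
interface BarrierFailureHook
{
    void onFailedBarrier(String txnId, String keysOrRanges, Throwable cause);
}

// Test-side implementation that records each failed barrier with its context,
// so a test can report which txn and which keys/ranges were involved.
final class RecordingBarrierHook implements BarrierFailureHook
{
    private final List<String> failures = new ArrayList<>();

    @Override
    public synchronized void onFailedBarrier(String txnId, String keysOrRanges, Throwable cause)
    {
        failures.add("barrier " + txnId + " over " + keysOrRanges + " failed: " + cause);
    }

    // Returns a copy so callers cannot mutate the recorded state.
    synchronized List<String> failures()
    {
        return new ArrayList<>(failures);
    }
}

final class BarrierHookSketch
{
    // Simulates a barrier that times out and reports the failure through the hook
    // (assumed shape of the call site; the real one lives in AccordService).
    static void runBarrier(BarrierFailureHook hook, String txnId, String keysOrRanges)
    {
        try
        {
            throw new TimeoutException("shard could not reach quorum");
        }
        catch (TimeoutException cause)
        {
            hook.onFailedBarrier(txnId, keysOrRanges, cause);
        }
    }

    public static void main(String[] args)
    {
        RecordingBarrierHook hook = new RecordingBarrierHook();
        runBarrier(hook, "txn-1", "[0, 100)");
        hook.failures().forEach(System.out::println);
    }
}
```

A test could then assert on `hook.failures()` and include the recorded strings in its failure message, rather than reporting only a bare timeout.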



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

