dcapwell commented on code in PR #3416:
URL: https://github.com/apache/cassandra/pull/3416#discussion_r1777753226
##########
src/java/org/apache/cassandra/service/accord/AccordService.java:
##########
@@ -437,12 +557,14 @@ public IVerbHandler<? extends Request> verbHandler()
if (cause instanceof Timeout)
{
TxnId txnId = ((Timeout) cause).txnId();
+ ((AccordAgent) node.agent()).onFailedBarrier(txnId, keysOrRanges, cause);
Review Comment:
We kept having issues with barriers, and the root cause was the following:
1) Cassandra does not do durability scheduling (last I heard it's still
broken).
2) Sync points don't use the same epoch discovery logic as normal txns; they
rely on `ExclusiveSyncPoint`s, so without those running, a barrier ends up
including every single epoch. When host replacements happen, older epochs are
no longer safe to touch (their shards can't reach quorum).

This has been improved in this patch, but these hooks are here so tests can
provide a useful error msg.
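
To illustrate the kind of test hook meant here, below is a rough sketch of a recorder that captures failed barriers so a test can print something useful. Everything in it is hypothetical: `FailedBarrierRecorder`, its `describeFailures` method, and the `String` stand-ins for `TxnId` and the keys/ranges type are assumptions for illustration, not the actual `AccordAgent` API from this patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical test-side recorder; NOT the real AccordAgent implementation.
class FailedBarrierRecorder
{
    // One recorded failure: which txn timed out, over which keys/ranges, and why.
    private static final class Failure
    {
        final String txnId;        // stand-in for TxnId (assumption)
        final String keysOrRanges; // stand-in for the keys/ranges type (assumption)
        final Throwable cause;

        Failure(String txnId, String keysOrRanges, Throwable cause)
        {
            this.txnId = txnId;
            this.keysOrRanges = keysOrRanges;
            this.cause = cause;
        }

        String describe()
        {
            return "Barrier " + txnId + " over " + keysOrRanges + " failed: "
                   + cause.getClass().getSimpleName() + ": " + cause.getMessage();
        }
    }

    // Thread-safe list, since the hook may be called from several threads.
    private final List<Failure> failures = new CopyOnWriteArrayList<>();

    // Mirrors the shape of the hook call in the diff: record, don't throw.
    public void onFailedBarrier(String txnId, String keysOrRanges, Throwable cause)
    {
        failures.add(new Failure(txnId, keysOrRanges, cause));
    }

    // Returns one human-readable line per recorded failure, in arrival order.
    public List<String> describeFailures()
    {
        List<String> out = new ArrayList<>();
        for (Failure f : failures)
            out.add(f.describe());
        return out;
    }
}
```

A test could then include `describeFailures()` in its assertion message when a barrier times out, rather than surfacing only the bare timeout.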