Re: [PR] CASSANDRA-19769: CEP-15: (Accord) sequence EpochReady.coordinating to allow syncComplete to be learned from newer epochs [cassandra-accord]

via GitHub Fri, 27 Sep 2024 11:33:37 -0700


dcapwell commented on code in PR #103:
URL: https://github.com/apache/cassandra-accord/pull/103#discussion_r1778967860



##########
accord-core/src/main/java/accord/topology/TopologyManager.java:
##########
@@ -98,36 +97,33 @@ static class EpochState
 
             this.addedRanges = 
global.ranges.subtract(prevRanges).mergeTouching();
             this.removedRanges = 
prevRanges.mergeTouching().subtract(global.ranges);
-            this.prevSyncComplete = addedRanges.union(MERGE_ADJACENT, 
prevSyncComplete.subtract(removedRanges));
-            this.curSyncComplete = this.syncComplete = addedRanges;
+            this.syncComplete = addedRanges;

Review Comment:
   `syncComplete` is based off `accord.local.Node#onTopologyUpdateInternal`  
and `accord.api.ConfigurationService.EpochReady#coordination`
   
   ```
   ready.coordination.addCallback(() ->  {
               synchronized (AbstractConfigurationService.this)
               {
                   localSyncComplete(epochs.getOrCreate(ready.epoch).topology, 
startSync);
               }
           });
   ```
   
   This logic was changed (we talked about this a long time ago) so that each 
epoch is now sequential and we don't allow epoch=N to happen-before epoch=N-1 
(currently N can be syncComplete before N-1, simple case is that N-1 is a node 
joining and must stream, and N is a table comment update).
   
   If you look at `accord.local.Node#onTopologyUpdateInternal` coordination is 
now sequenced so `N` happens-after `N-1`.
   
   In Apache Cassandra this is notifies `localSyncComplete` which marks the 
epoch as `epochState.setSyncStatus(SyncStatus.NOTIFYING);` and notifies the 
*nodes in the topology* about the epoch.  Those nodes will do the same which 
cause `accord.topology.TopologyManager#onEpochSyncComplete` to get called, 
which updates `syncComplete` (and `prevSyncComplete` and `curSyncComplete`)...
   
   Now we have 2 cases that exist in Apache Cassandra: Expand Cluster, and Host 
Replacement.  In both of these cases a *new node* starts to join and will know 
about past epochs (as it's needed for processing), but won't be present in 
those past epochs.  This leads us to a situation where those past epochs have 
`syncComplete = []` and we include every single epoch for coordination.
   
   By sequencing `EpochReady.coordination`, once this new node *sees* epoch=N 
be sync complete by its peers, it *knows* epoch=N-1 is also syncComplete.
   
   Without this change, we would require every node in the cluster to notify 
the joining node about every epoch they know and their sync status... if the 
sync is also not complete then the node would be stuck with a partial update as 
it wouldn't learn about new ones!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] CASSANDRA-19769: CEP-15: (Accord) sequence EpochReady.coordinating to allow syncComplete to be learned from newer epochs [cassandra-accord]

Reply via email to