dcapwell commented on code in PR #103:
URL: https://github.com/apache/cassandra-accord/pull/103#discussion_r1778967860


##########
accord-core/src/main/java/accord/topology/TopologyManager.java:
##########
@@ -98,36 +97,33 @@ static class EpochState
 
             this.addedRanges = 
global.ranges.subtract(prevRanges).mergeTouching();
             this.removedRanges = 
prevRanges.mergeTouching().subtract(global.ranges);
-            this.prevSyncComplete = addedRanges.union(MERGE_ADJACENT, 
prevSyncComplete.subtract(removedRanges));
-            this.curSyncComplete = this.syncComplete = addedRanges;
+            this.syncComplete = addedRanges;

Review Comment:
   `syncComplete` is based off `accord.local.Node#onTopologyUpdateInternal`  
and `accord.api.ConfigurationService.EpochReady#coordination`
   
   ```
   ready.coordination.addCallback(() ->  {
               synchronized (AbstractConfigurationService.this)
               {
                   localSyncComplete(epochs.getOrCreate(ready.epoch).topology, 
startSync);
               }
           });
   ```
   
   This logic was changed (we talked about this a long time ago) so that each 
epoch is now sequential and we don't allow epoch=N to happen-before epoch=N-1 
(currently N can be syncComplete before N-1, simple case is that N-1 is a node 
joining and must stream, and N is a table comment update).
   
   If you look at `accord.local.Node#onTopologyUpdateInternal` coordination is 
now sequenced so `N` happens-after `N-1`.
   
   In Apache Cassandra this is notifies `localSyncComplete` which marks the 
epoch as `epochState.setSyncStatus(SyncStatus.NOTIFYING);` and notifies the 
*nodes in the topology* about the epoch.  Those nodes will do the same which 
cause `accord.topology.TopologyManager#onEpochSyncComplete` to get called, 
which updates `syncComplete` (and `prevSyncComplete` and `curSyncComplete`)...
   
   Now we have to turn to 2 cases that exist in Apache Cassandra: Expand 
Cluster, and Host Replacement.  In both of these cases a *new node* starts to 
join and will know about past epochs (as it's needed for processing), but won't 
be present in those past epochs.  This leads us to a situation where those past 
epochs have `syncComplete = []` and we include every single epoch for 
coordination.
   
   By sequencing `EpochReady.coordination`, once this new node *sees* epoch=N 
be sync complete by its peers, it *knows* epoch=N-1 is also syncComplete.
   
   Without this change, we would require every node in the cluster to notify 
the joining node about every epoch they know and their sync status... if the 
sync is also not complete then the node would be stuck with a partial update as 
it wouldn't learn about new ones!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to