maedhroz opened a new pull request, #4565:
URL: https://github.com/apache/cassandra/pull/4565

   WIP: Checkpointing progress for full repair support for tracked keyspaces
   This comes with lots of comments and a handful of integration tests, but
   the tests aren't passing and it could use more unit test coverage.
   There's a refactor (introducing SyncTasks) that's only partially
   complete.
   
   The rest of this commit message is a description of how full repair
   is intended to work for tracked keyspaces.
   
   Tracked keyspaces cannot accept new data without first registering it in
   the log. Any unreconciled data that isn't present in the log will break
   read monotonicity, since mutation tracking uses a single data read and
   can only read-reconcile mutation IDs that are present in the log. For
   more information about how bulk transfers work on tracked keyspaces, see
   TrackedImportTransfer.
   
   Full repair sync tasks also deliver data to replicas, and require
   integration with the log just like imports do. For more details on a
   read anomaly that could happen without integration with the bulk
   transfer machinery, see
   TrackedKeyspaceRepairSupportTest#testFullRepairPartiallyCompleteAnomaly.
   
   The general design of this integration is to give repair SyncTasks the
   same two-phase commit as import transfers, where we stream SSTables to a
   pending/ directory, then once sufficient streams complete successfully,
   we "activate" those streams and move them out of the pending directory
   and into the live set.
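
The pending/activate flow described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class name, the method names, and the idea of a fixed "required streams" count are hypothetical and are not taken from the Cassandra code; real transfers move SSTable files on disk rather than strings in lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a two-phase "pending, then activate" flow.
// None of these names come from the Cassandra codebase.
class TwoPhaseTransferSketch {
    private final int requiredStreams;                      // assumed threshold for "sufficient streams"
    private final List<String> pending = new ArrayList<>(); // stands in for the pending/ directory
    private final List<String> live = new ArrayList<>();    // stands in for the live set

    TwoPhaseTransferSketch(int requiredStreams) {
        this.requiredStreams = requiredStreams;
    }

    // Phase 1: a stream finished successfully; its SSTable lands in pending only.
    void streamCompleted(String sstable) {
        pending.add(sstable);
    }

    // Phase 2: move everything from pending to live, but only once enough
    // streams have completed. Returns false and changes nothing otherwise.
    boolean activate() {
        if (pending.size() < requiredStreams) {
            return false;
        }
        live.addAll(pending);
        pending.clear();
        return true;
    }

    List<String> live() {
        return List.copyOf(live);
    }

    List<String> pending() {
        return List.copyOf(pending);
    }

    public static void main(String[] args) {
        TwoPhaseTransferSketch transfer = new TwoPhaseTransferSketch(2);
        transfer.streamCompleted("sstable-1");
        System.out.println("activated early: " + transfer.activate());
        transfer.streamCompleted("sstable-2");
        System.out.println("activated after both streams: " + transfer.activate());
        System.out.println("live set: " + transfer.live());
    }
}
```

The point of the model is that nothing becomes readable from the live set until activation succeeds, so a partially delivered repair never exposes data.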
   
   The first step is to ensure that each SyncTask is aligned to a single
   Mutation Tracking shard, by splitting SyncTasks along the shard
   boundaries. Each SyncTask will then stream data within a single shard,
   which permits us to assign a single transfer ID to each SyncTask.
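
As a rough sketch of what splitting along shard boundaries means, the example below cuts one range into pieces that each fall inside a single shard. It makes simplifying assumptions that are not from the source: ranges are non-wrapping half-open intervals [start, end) over long values, shard boundaries arrive as a sorted list, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting one range along shard boundaries.
// Ranges are modelled as long[]{start, end}, half-open and non-wrapping,
// which is a simplification rather than Cassandra's actual Range type.
class ShardSplitSketch {

    // Returns the pieces of [start, end) obtained by cutting at every
    // boundary that lies strictly inside the range. Boundaries must be sorted.
    static List<long[]> split(long start, long end, List<Long> sortedBoundaries) {
        List<long[]> parts = new ArrayList<>();
        long cursor = start;
        for (long boundary : sortedBoundaries) {
            if (boundary <= cursor) {
                continue; // boundary is at or before the current position
            }
            if (boundary >= end) {
                break;    // remaining boundaries are outside the range
            }
            parts.add(new long[] { cursor, boundary });
            cursor = boundary;
        }
        parts.add(new long[] { cursor, end }); // final piece up to the range end
        return parts;
    }

    public static void main(String[] args) {
        for (long[] part : split(0, 100, List.of(25L, 50L, 200L))) {
            System.out.println("[" + part[0] + ", " + part[1] + ")");
        }
    }
}
```

Each resulting piece would correspond to one shard-aligned SyncTask, which is what makes a one-to-one mapping between a task and a transfer ID possible.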
   
   Each participant in a repair may receive different SyncTasks (or none at
   all, if it is already in sync). This means that TransferActivation
   needs to be made more flexible and support a single TransferActivation
   with multiple plan IDs, or with no plan IDs at all. This increase in
   flexibility has not yet been implemented.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

