maedhroz opened a new pull request, #4565: URL: https://github.com/apache/cassandra/pull/4565
WIP: Checkpointing progress for full repair support for tracked keyspaces

This comes with lots of comments and a handful of integration tests, but the tests aren't passing and it could use more unit test coverage. There's a refactor (introducing SyncTasks) that's only partially complete. The rest of this commit message is a description of how full repair is intended to work for tracked keyspaces.

Tracked keyspaces cannot accept new data without first registering it in the log. Any unreconciled data that isn't present in the log will break read monotonicity, since mutation tracking uses a single data read and can only reconcile mutation IDs that are present in the log. For more information about how bulk transfers work on tracked keyspaces, see TrackedImportTransfer.

Full repair sync tasks also deliver data to replicas, and require integration with the log just like imports do. For more details on a read anomaly that could happen without integration with the bulk transfer machinery, see TrackedKeyspaceRepairSupportTest#testFullRepairPartiallyCompleteAnomaly.

The general design of this integration is to give repair SyncTasks the same two-phase commit as import transfers: we stream SSTables to a pending/ directory, then, once sufficient streams complete successfully, we "activate" those streams and move them out of the pending directory and into the live set.

The first step is to ensure that each SyncTask is aligned to a single Mutation Tracking shard, by splitting SyncTasks along the shard boundaries. Each SyncTask will then stream data within a single shard, which permits us to assign a single transfer ID to each SyncTask.

Each participant in a repair may receive different SyncTasks (or none at all, if it is already in sync). This means that TransferActivation needs to be made more flexible, supporting a single TransferActivation with multiple plan IDs, or no plan IDs at all. This increase in flexibility has not yet been implemented.
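As a rough illustration of the shard-alignment step described above, here is a minimal sketch of splitting one sync range along shard boundaries. It is not code from this PR: `ShardAlignedSplitter`, `TokenRange`, and `ShardSyncTask` are hypothetical names, tokens are simplified to `long` values, ranges are assumed to be non-wrapping and half-open `(left, right]`, and the shard index merely stands in for whatever would scope a transfer ID.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types exist in Cassandra under these names. */
final class ShardAlignedSplitter {

    /** Simplified, non-wrapping token range (left, right]. */
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            if (left >= right)
                throw new IllegalArgumentException("empty or wrapping range");
            this.left = left;
            this.right = right;
        }
    }

    /** A piece of a sync task that lies entirely within one shard. */
    static final class ShardSyncTask {
        final int shardIndex;   // assumed stand-in for the shard's identity
        final TokenRange range;

        ShardSyncTask(int shardIndex, TokenRange range) {
            this.shardIndex = shardIndex;
            this.range = range;
        }
    }

    /**
     * Intersects the task's range with every shard range and keeps the
     * non-empty intersections, so each result covers exactly one shard.
     */
    static List<ShardSyncTask> split(TokenRange task, List<TokenRange> shards) {
        List<ShardSyncTask> aligned = new ArrayList<>();
        for (int i = 0; i < shards.size(); i++) {
            TokenRange shard = shards.get(i);
            long left = Math.max(task.left, shard.left);
            long right = Math.min(task.right, shard.right);
            if (left < right)
                aligned.add(new ShardSyncTask(i, new TokenRange(left, right)));
        }
        return aligned;
    }
}
```

Under these assumptions, a task covering `(50, 250]` over shards `(0, 100]`, `(100, 200]`, and `(200, 300]` would yield three shard-aligned pieces, each of which could then be given its own transfer ID. A task that overlaps no shard yields an empty list, which loosely mirrors the "no plan IDs at all" case for a participant that is already in sync.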

