Aleksey Yeschenko created CASSANDRA-21407:
---------------------------------------------
Summary: CEP-45: Filter minority writes from sstables on
recovering replicas after partial shard sealing
Key: CASSANDRA-21407
URL: https://issues.apache.org/jira/browse/CASSANDRA-21407
Project: Apache Cassandra
Issue Type: New Feature
Reporter: Aleksey Yeschenko
When we partially seal a shard, a replica that was down during the
sealing will, once it comes back online, need to filter out of its
sstables any minority writes that belong to the sealed shard.
Fully reconciled (and marked as repaired) sstables should remain as they are.
Partially reconciled (unrepaired) sstables, if they contain any minority
mutations, need to be dropped and rebuilt from the mutation journal. The
replacing sstable will need to include all the fully reconciled mutations,
plus any unreconciled mutations that are safe to keep (i.e., everything
except minority mutations, which fall outside of the sealed mutation ID
sets).
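The keep/rebuild rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: SealedShardFilter, MutationRef, decide() and survivors() are hypothetical names, not existing Cassandra classes, and a mutation ID is modelled as a (log ID, offset) pair purely for the sake of the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the Cassandra codebase. */
final class SealedShardFilter {

    /** What to do with one sstable on the recovering replica. */
    enum Action { KEEP, REBUILD }

    /** Illustrative mutation reference: a log ID, an offset in that log, and a reconciled flag. */
    static final class MutationRef {
        final long logId;
        final long offset;
        final boolean reconciled;

        MutationRef(long logId, long offset, boolean reconciled) {
            this.logId = logId;
            this.offset = offset;
            this.reconciled = reconciled;
        }
    }

    /** A mutation is a minority write if its log was sealed and its ID is not in the sealed set. */
    static boolean isMinority(MutationRef m, Map<Long, Set<Long>> sealedIdsByLog) {
        Set<Long> sealed = sealedIdsByLog.get(m.logId);
        return sealed != null && !sealed.contains(m.offset);
    }

    /** Repaired sstables stay as they are; unrepaired ones are rebuilt only if they hold a minority write. */
    static Action decide(boolean repaired, List<MutationRef> contents,
                         Map<Long, Set<Long>> sealedIdsByLog) {
        if (repaired) {
            return Action.KEEP;
        }
        for (MutationRef m : contents) {
            if (!m.reconciled && isMinority(m, sealedIdsByLog)) {
                return Action.REBUILD;
            }
        }
        return Action.KEEP;
    }

    /** Mutations the replacing sstable should contain: reconciled ones plus safe unreconciled ones. */
    static List<MutationRef> survivors(List<MutationRef> contents,
                                       Map<Long, Set<Long>> sealedIdsByLog) {
        List<MutationRef> keep = new ArrayList<>();
        for (MutationRef m : contents) {
            if (m.reconciled || !isMinority(m, sealedIdsByLog)) {
                keep.add(m);
            }
        }
        return keep;
    }
}
```

In this sketch, unreconciled mutations from logs that are not part of any sealed shard are treated as safe to keep, which is my reading of the description rather than something the ticket spells out.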
CASSANDRA-21406 handles the changes to mutation journal segment dropping
that are a prerequisite for this JIRA ticket. This ticket is focused on
the implementation of sstable filtering logic.
When a node comes back up (but before it starts serving reads again), it
will catch up on TCM changes and also query its peers for the up-to-date
mutation tracking state for the relevant shards, including partially
sealed shards that were sealed while this replica was unavailable.
Once the replica knows all the partially sealed shards, including the
final, sealed set of mutation IDs for each, its unrepaired sstables
may need to be filtered/overwritten.
We will check the StatsMetadata of each unrepaired sstable for the presence
of log IDs involved in any partially sealed shards. Then we will check
whether any mutation IDs in those logs are absent from the sealed mutation
ID set. These are the minority writes, and an sstable containing even one
of them will need to be recreated from the journal. All affected sstables
will be rebuilt, one by one, using the IDs from StatsMetadata minus the
minority writes.
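A minimal sketch of that scan, working only on ID sets. It assumes the mutation IDs recorded in StatsMetadata can be viewed as a map from log ID to a set of offsets; that shape, and the names MinorityWriteScan, minorityIds() and idsToReplay(), are assumptions made for illustration and not the actual metadata API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of the per-sstable scan; not an existing Cassandra class. */
final class MinorityWriteScan {

    /** IDs recorded for the sstable, per log, that are missing from that log's sealed set. */
    static Map<Long, Set<Long>> minorityIds(Map<Long, Set<Long>> sstableIdsByLog,
                                            Map<Long, Set<Long>> sealedIdsByLog) {
        Map<Long, Set<Long>> minority = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> e : sstableIdsByLog.entrySet()) {
            Set<Long> sealed = sealedIdsByLog.get(e.getKey());
            if (sealed == null) {
                continue; // this log is not part of any partially sealed shard
            }
            Set<Long> extra = new TreeSet<>(e.getValue());
            extra.removeAll(sealed);
            if (!extra.isEmpty()) {
                minority.put(e.getKey(), extra);
            }
        }
        return minority;
    }

    /** IDs to replay from the journal when rebuilding: the sstable's IDs minus the minority writes. */
    static Map<Long, Set<Long>> idsToReplay(Map<Long, Set<Long>> sstableIdsByLog,
                                            Map<Long, Set<Long>> minority) {
        Map<Long, Set<Long>> keep = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> e : sstableIdsByLog.entrySet()) {
            Set<Long> ids = new TreeSet<>(e.getValue());
            Set<Long> drop = minority.get(e.getKey());
            if (drop != null) {
                ids.removeAll(drop);
            }
            if (!ids.isEmpty()) {
                keep.put(e.getKey(), ids);
            }
        }
        return keep;
    }
}
```

An sstable whose minorityIds() result is empty would be left alone; a non-empty result would mark it for a rebuild from idsToReplay().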
P.S. I think there is no need to similarly filter out the mutation segments,
as that will just happen naturally using the logic from CASSANDRA-21406 as
a consequence of sstables being filtered.