[ https://issues.apache.org/jira/browse/CASSANDRA-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paulo Motta updated CASSANDRA-11684:
------------------------------------
Labels: (was: gsoc2021 mentor)
> Cleanup key ranges during compaction
> ------------------------------------
>
> Key: CASSANDRA-11684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11684
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction
> Reporter: Stefan Podkowinski
> Priority: Normal
>
> Currently, cleanup is considered an optional, manual operation that users are
> told to run to free disk space after a node has been affected by topology
> changes. However, unmanaged key ranges could also end up on a node in other
> ways, e.g. through SSTable files manually added by an admin.
> I'm also not sure that unmanaged data is really that harmless, or that cleanup
> should really be optional if you don't need to reclaim the disk space. When it
> comes to repairs, users are expected to purge a node after downtime in case
> it was not fully covered by a repair within gc_grace afterwards, in order to
> avoid re-introducing deleted data. But the same could happen with unmanaged
> data, e.g. after topology changes activate unmanaged ranges again or after
> restoring backups.
> I'd therefore suggest that compactions skip rewriting key ranges that no
> longer belong to a node and are older than gc_grace.
> Maybe we could also introduce another CLEANUP_COMPACTION operation to find
> candidates based on SSTable.first/last in case we don't have pending regular
> or tombstone compactions.
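
The two suggestions above can be illustrated with a minimal Java sketch. It is
not Cassandra code: CleanupSketch, TokenRange, shouldDrop and
isCleanupCandidate are hypothetical names invented for illustration. It also
assumes that tokens are plain long values, that owned ranges are non-wrapping,
half-open (left, right] intervals already merged so that adjacent ranges are
combined, and that "older than gc_grace" means the newest write in a partition
is older than gc_grace.

```java
import java.util.List;

/** Hypothetical sketch; none of these types are Cassandra's real classes. */
final class CleanupSketch {

    /** Simplified, non-wrapping token range (left, right]. */
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        /** True if the token falls inside this range. */
        boolean contains(long token) {
            return token > left && token <= right;
        }

        /** True if the closed interval [first, last] lies entirely inside this range. */
        boolean covers(long first, long last) {
            return first > left && last <= right;
        }
    }

    /** True if any owned range contains the token. */
    static boolean isOwned(long token, List<TokenRange> owned) {
        for (TokenRange range : owned) {
            if (range.contains(token)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Decides whether a compaction could skip rewriting a partition:
     * the node no longer owns its token AND its newest write is older
     * than gc_grace (assumed interpretation of "older than gc_grace").
     */
    static boolean shouldDrop(long token, long newestWriteSeconds, long nowSeconds,
                              long gcGraceSeconds, List<TokenRange> owned) {
        boolean olderThanGcGrace = nowSeconds - newestWriteSeconds > gcGraceSeconds;
        return !isOwned(token, owned) && olderThanGcGrace;
    }

    /**
     * Conservative candidate check based only on an SSTable's first and
     * last token: if [first, last] sits fully inside one owned range, the
     * SSTable cannot hold unowned data; otherwise it might.
     */
    static boolean isCleanupCandidate(long firstToken, long lastToken, List<TokenRange> owned) {
        for (TokenRange range : owned) {
            if (range.covers(firstToken, lastToken)) {
                return false;
            }
        }
        return true;
    }
}
```

The candidate check only looks at the first and last token, so it can flag an
SSTable that in fact contains no unowned partitions. Such a false positive
costs one unnecessary rewrite but never drops owned data.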
--
This message was sent by Atlassian Jira
(v8.3.4#803005)