[
https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081833#comment-15081833
]
Anubhav Kale commented on CASSANDRA-10960:
------------------------------------------
Here is a scenario:
Time t1: KS/CF/ contains s1.db s2.db; KS/CF/backups/ contains s1.db s2.db
Time t2: KS/CF/ contains s1.db s2.db s3.db; KS/CF/backups/ contains s1.db s2.db s3.db
[Whenever an SSTable is flushed, it is written to backups as well.]
Time t3 (after compaction ran): KS/CF/ contains s4.db; KS/CF/backups/ contains s1.db s2.db s3.db s4.db
Is this the existing behavior? The data has not changed here; it is simply
represented by s4. It is reasonable to keep s1, s2, s3, and s4 in backups so
that folks can go back to any point in time. However, if folks want to move
data from backups to somewhere outside C* and copy it back during recovery,
this adds the unnecessary burden of copying the same data multiple times
(copying back s4 alone would have been enough for recovery here).
Does this make sense? Please let me know if I have misunderstood something
here.
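To make the scenario concrete, here is a minimal Python sketch of which backup
files are actually needed for recovery. It is only a model of the example
above, not Cassandra code: the function name live_backup_files and the
(inputs, output) representation of a compaction are assumptions made for
illustration, and each SSTable is treated as a single .db file.

```python
def live_backup_files(flushed, compactions):
    """Return the set of backup files still needed for recovery.

    flushed:     names of SSTable files written by flushes (hypothetical input).
    compactions: list of (inputs, output) pairs, in the order they ran.
    """
    live = set(flushed)
    for inputs, output in compactions:
        live -= set(inputs)  # inputs are fully represented by the output
        live.add(output)     # the compaction output replaces them
    return live


# The t1..t3 timeline from the comment: three flushes, then one compaction.
backups = {"s1.db", "s2.db", "s3.db", "s4.db"}
live = live_backup_files(
    ["s1.db", "s2.db", "s3.db"],
    [(["s1.db", "s2.db", "s3.db"], "s4.db")],
)
redundant = backups - live

print(sorted(live))       # ['s4.db']
print(sorted(redundant))  # ['s1.db', 's2.db', 's3.db']
```

Under these assumptions, only s4.db has to be copied back for recovery, while
s1.db, s2.db and s3.db matter only for going back to an earlier point in time.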
> Compaction should delete old files from incremental backups folder
> ------------------------------------------------------------------
>
> Key: CASSANDRA-10960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10960
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Environment: PROD
> Reporter: Anubhav Kale
> Priority: Minor
>
> When compaction runs, the old flushed SSTables in the backups folder are not
> deleted. If folks need to move the backups folder somewhere outside the
> cluster, recovery becomes slower because unnecessary files need to be copied
> back.
> Is this behavior by design?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)