[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder
[ https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081772#comment-15081772 ]

Carl Yeksigian commented on CASSANDRA-10960:
--------------------------------------------

We can't delete the backups because we don't know where the backup process is. Also, since compactions don't just combine new sstables that are in the backups folder with each other, if we used the newly compacted SSTables, we would be including data that has already been backed up, so they wouldn't be incremental backup files.

> Compaction should delete old files from incremental backups folder
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-10960
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10960
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>         Environment: PROD
>            Reporter: Anubhav Kale
>            Priority: Minor
>
> When compaction runs the old flushed SS Tables from backups folder are not
> deleted. If folks need to move the backups folder somewhere outside the
> cluster, recovery becomes slower because unnecessary files need to be copied
> back.
> Is this behavior by design ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
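The second point in Carl Yeksigian's comment can be illustrated with a toy model. This is only a sketch: row keys stand in for SSTable contents, the file names and keys are made up for illustration, and nothing here is Cassandra code. It shows that once compaction merges already-shipped SSTables with a new one, the compacted output carries data that was backed up before, so it is no longer an incremental file.

```python
# Toy model, not Cassandra code. File names and row keys are hypothetical.

# SSTables that an external backup process has already copied off the node.
already_shipped = {"s1.db": {"a", "b"}, "s2.db": {"c"}}

# A newly flushed SSTable that has not been copied yet.
new_flush = {"s3.db": {"d"}}

# Compaction may merge old and new SSTables into a single output.
compacted = set().union(*already_shipped.values(), *new_flush.values())

shipped_rows = set().union(*already_shipped.values())
new_rows = set().union(*new_flush.values())

# Rows in the compacted output that were already backed up earlier.
redundant = compacted & shipped_rows

print("rows in compacted output:", sorted(compacted))
print("rows already backed up:  ", sorted(redundant))
print("rows that are truly new: ", sorted(new_rows))
```

Shipping the compacted output instead of s3.db would re-send every row in `redundant`, which is why the comment says such files would not be incremental backups.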
[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder
[ https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081741#comment-15081741 ]

Anubhav Kale commented on CASSANDRA-10960:
------------------------------------------

This is not about manually deleting old backup folders (that's okay). This is about C* not deleting files from backups when those same files were deleted from the data folder as part of compaction. Why is that by design? Can you please elaborate?
[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder
[ https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081833#comment-15081833 ]

Anubhav Kale commented on CASSANDRA-10960:
------------------------------------------

Here is a scenario:

Time t1:
  KS/CF/s1.db s2.db
  KS/CF/backups/s1.db s2.db

Time t2:
  KS/CF/s1.db s2.db s3.db
  KS/CF/backups/s1.db s2.db s3.db
  [Since any time an SSTable is flushed, it's written to backups as well]

Time t3 (compaction ran):
  KS/CF/s4.db
  KS/CF/backups/s1.db s2.db s3.db s4.db

This is existing behavior, correct? The data hasn't changed here; it's simply represented via s4. It is reasonable to keep s1, s2, s3 and s4 in backups so that folks can go back to any point in time. However, if folks want to move data from backups to somewhere outside C* and copy it back during recovery, it adds the unnecessary burden of copying the same data multiple times (copying back s4 should have been enough for recovery here).

Does this make sense? Please let me know if I did not understand something correctly here.
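The scenario above can be replayed with a small toy model. This is a sketch of the folder contents exactly as the comment describes them, not of Cassandra's code; the `flush` and `compact` helpers are hypothetical. In particular, placing the compaction output s4.db in backups is an assumption taken from the comment, which itself asks whether that is the existing behavior.

```python
# Toy model of the scenario in the comment; not Cassandra code.

live = set()      # stands in for KS/CF/
backups = set()   # stands in for KS/CF/backups/


def flush(name):
    """A flushed SSTable appears in both the live and the backups folder."""
    live.add(name)
    backups.add(name)


def compact(inputs, output):
    """Compaction replaces the inputs in the live folder only."""
    live.difference_update(inputs)
    live.add(output)
    # Assumption from the comment (unconfirmed): the output also lands in backups.
    backups.add(output)


flush("s1.db")
flush("s2.db")                                  # time t1
flush("s3.db")                                  # time t2
compact({"s1.db", "s2.db", "s3.db"}, "s4.db")   # time t3

print("live:   ", sorted(live))
print("backups:", sorted(backups))
# Files a restore from backups would copy back even though the live
# folder no longer holds them.
print("extra:  ", sorted(backups - live))
```

Under these assumptions the backups folder holds four files at t3 while the live folder holds one, which is the extra copying the comment is concerned about.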
[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder
[ https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082020#comment-15082020 ]

Anubhav Kale commented on CASSANDRA-10960:
------------------------------------------

Thanks for the explanation. While I don't want to continue the conversation here, IMHO C* needs to enable a behavior where "old" SSTables are deleted from backups whenever compaction deletes them from the actual data folders. Otherwise, too much duplicate data has to be moved back to the nodes at recovery time. The specific scenario is when backups need to be moved outside of Cassandra; otherwise the current behavior is good enough.