[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder

2016-01-04 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081772#comment-15081772
 ] 

Carl Yeksigian commented on CASSANDRA-10960:


We can't delete the backups because we don't know how far the backup process 
has gotten. Also, compactions don't only combine new SSTables that are in the 
backups folder with each other, so if we used the newly compacted SSTables 
instead, we would be including data that has already been backed up, and they 
would no longer be incremental backup files.
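
To illustrate the second point, here is a minimal toy model, not Cassandra 
code. It assumes each SSTable can be treated as a set of row keys; the helper 
names flush and compact and the "shipped" set (files an external job has 
already copied off the node) are hypothetical.

```python
# Toy model only: an SSTable is a set of row keys. All names are illustrative.
def flush(live, backups, name, rows):
    """A flush writes a new SSTable and, with incremental backups on, links it into backups."""
    live[name] = frozenset(rows)
    backups[name] = frozenset(rows)

def compact(live, inputs, output):
    """Compaction merges the inputs into one SSTable and drops them from the live set."""
    merged = frozenset().union(*(live.pop(n) for n in inputs))
    live[output] = merged
    return merged

live, backups = {}, {}
flush(live, backups, "s1", {"a", "b"})
shipped = set(backups)  # hypothetical: s1 has already been copied off the node
flush(live, backups, "s2", {"c"})
merged = compact(live, ["s1", "s2"], "s3")

# Rows in the compacted SSTable that an earlier backup run already shipped.
already_backed_up = set().union(*(backups[n] for n in shipped))
overlap = merged & already_backed_up
print(sorted(overlap))  # ['a', 'b']
```

In this model, shipping s3 as the next increment would resend rows a and b, 
which is why a compacted SSTable cannot stand in for the flushed files.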

> Compaction should delete old files from incremental backups folder
> --
>
> Key: CASSANDRA-10960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10960
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: PROD
>Reporter: Anubhav Kale
>Priority: Minor
>
> When compaction runs, the old flushed SSTables in the backups folder are not 
> deleted. If folks need to move the backups folder somewhere outside the 
> cluster, recovery becomes slower because unnecessary files need to be copied 
> back. 
> Is this behavior by design? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder

2016-01-04 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081741#comment-15081741
 ] 

Anubhav Kale commented on CASSANDRA-10960:
--

This is not about manually deleting old backup folders (that's okay). This is 
about C* not deleting files from backups when the corresponding SSTables were 
deleted as part of compaction. Why is that by design? Can you please elaborate?



[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder

2016-01-04 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081833#comment-15081833
 ] 

Anubhav Kale commented on CASSANDRA-10960:
--

Here is a scenario:

Time t1: KS/CF/ holds s1.db, s2.db; KS/CF/backups/ holds s1.db, s2.db
Time t2: KS/CF/ holds s1.db, s2.db, s3.db; KS/CF/backups/ holds s1.db, s2.db, 
s3.db [since any time an SSTable is flushed, it is written to backups as well]
Time t3 (compaction ran): KS/CF/ holds s4.db; KS/CF/backups/ holds s1.db, 
s2.db, s3.db, s4.db 

This is the existing behavior, correct? The data hasn't changed here; it's 
simply represented by s4. It is reasonable to keep s1, s2, s3, and s4 in 
backups so that folks can go back to any point in time. However, if folks want 
to move data from backups to somewhere outside C* and copy it back during 
recovery, it adds the unnecessary burden of copying the same data multiple 
times (copying back s4 should have been enough for recovery here). 

Does this make sense? Please let me know if I have misunderstood something 
here.
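
A rough sketch of the copy cost in this scenario, taking the backups folder at 
t3 exactly as described above. The sizes are hypothetical (the scenario gives 
only file names), and s4 is assumed to be the sum of its inputs, i.e. 
compaction dropped nothing.

```python
# Hypothetical sizes in MB; the scenario itself gives file names only.
sizes_mb = {"s1.db": 100, "s2.db": 100, "s3.db": 100}
sizes_mb["s4.db"] = sum(sizes_mb.values())  # assumption: compaction dropped nothing

backups_at_t3 = ["s1.db", "s2.db", "s3.db", "s4.db"]  # as described in the scenario
copy_all = sum(sizes_mb[f] for f in backups_at_t3)
copy_s4_only = sizes_mb["s4.db"]
print(copy_all, copy_s4_only)  # 600 300
```

Under these assumptions, restoring everything in backups copies twice as much 
data as restoring s4 alone.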



[jira] [Commented] (CASSANDRA-10960) Compaction should delete old files from incremental backups folder

2016-01-04 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082020#comment-15082020
 ] 

Anubhav Kale commented on CASSANDRA-10960:
--

Thanks for the explanation. While I don't want to continue the conversation 
here, IMHO C* needs to offer a behavior where "old" SSTables are deleted from 
backups whenever they are deleted from the actual data folders as part of 
compaction. Otherwise, too much duplicate data has to be moved back to nodes 
at the time of recovery.
This matters specifically when backups need to be moved outside of Cassandra; 
otherwise the current behavior is good enough.
