I have an idea for Cassandra backups: dump each column family to a directory, 
then use an offline process to compact the dumps into a single sstable (or 
into sstables capped at a maximum size). For restoration, I would stream 
through that sstable set and output only the data that falls within a given 
token range. The result is that I can store a single copy of the data that is 
effectively already repaired, and read back just the range owned by the node 
I wish to restore. My first look at this was somewhat frustrating, because the 
sstable code in current versions relies heavily on the system keyspace.
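
To make the two steps concrete, here is a rough sketch in Python rather than 
against the real sstable classes. Everything in it is an assumption for 
illustration: the record shape (token, key, write timestamp, value), the 
function names, and the idea that each dump can be read as a stream already 
sorted by (token, key). It resolves duplicates by keeping the newest write for 
a whole row, which is simpler than real compaction (no per-cell reconciliation, 
no tombstones). The range check treats a range as (start, end] and lets it wrap 
around the ring.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (token, key, write_timestamp, value).
Record = Tuple[int, str, int, bytes]


def in_token_range(token: int, start: int, end: int) -> bool:
    """True if token lies in (start, end], allowing the range to wrap."""
    if start < end:
        return start < token <= end
    # start >= end: the range wraps past the top of the ring
    # (start == end is treated here as the whole ring).
    return token > start or token <= end


def merge_newest(streams: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Merge streams sorted by (token, key); keep the newest write per key."""
    current = None
    for rec in heapq.merge(*streams, key=lambda r: (r[0], r[1])):
        if current is not None and rec[:2] == current[:2]:
            # Same row seen in another dump: keep the later timestamp.
            if rec[2] > current[2]:
                current = rec
            continue
        if current is not None:
            yield current
        current = rec
    if current is not None:
        yield current


def restore_range(records: Iterable[Record], start: int, end: int) -> Iterator[Record]:
    """Stream only the records whose token falls inside (start, end]."""
    return (r for r in records if in_token_range(r[0], start, end))
```

Usage would look something like 
restore_range(merge_newest([dump_a, dump_b]), start, end), where dump_a and 
dump_b are hypothetical per-node dumps of the same column family. Both steps 
are streaming, so neither needs the full data set in memory.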

Does anybody have thoughts on other tools that might already exist and 
fulfill this (particularly offline collective compaction), a desire for such 
tools, or any useful information for me before I attempt to build such 
beasts?

-Jeff
