For folks who are using or considering using Cassandra in their production systems, what do you use for backups?
With HBase, one could write a MapReduce job that scans every row of a table (restricted to some historical timestamp to get a consistent view) and exports the data to HDFS. With Cassandra, if you're using an ordered partitioner, a similar mechanism could be built on top of a key range scan. With a random partitioner, though, there's no API to iterate through all existing keys. Why not?

Edmond
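
To make the ordered-partitioner idea concrete, here is a rough sketch of the paging loop I have in mind. It does not use any real Cassandra client: `OrderedStore` and its `get_range(start_key, count)` method are hypothetical stand-ins for a store that returns rows in key order starting at an inclusive start key, and `scan_all` is just an illustrative name.

```python
import bisect


class OrderedStore:
    """Hypothetical in-memory stand-in for a store that keeps rows in key order."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows)

    def get_range(self, start_key, count):
        """Return up to `count` (key, value) pairs with key >= start_key, in key order."""
        i = bisect.bisect_left(self._keys, start_key)
        return [(k, self._rows[k]) for k in self._keys[i:i + count]]


def scan_all(store, page_size=100):
    """Yield every row by paging through the key range.

    The start key is inclusive, so each page after the first begins with the
    last row of the previous page; that row is dropped before yielding.
    """
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    start = ""  # the empty string sorts before every other string key
    last_seen = None
    while True:
        page = store.get_range(start, page_size)
        if last_seen is not None and page and page[0][0] == last_seen:
            page = page[1:]  # already emitted on the previous page
        if not page:
            return
        for key, value in page:
            yield key, value
        last_seen = page[-1][0]
        start = last_seen


if __name__ == "__main__":
    store = OrderedStore({"carol": 3, "alice": 1, "bob": 2, "dave": 4, "erin": 5})
    for key, value in scan_all(store, page_size=2):
        print(key, value)
```

A backup job would write each yielded row to its export target instead of printing it. Unlike the timestamp-restricted HBase scan, nothing in this loop gives a consistent point-in-time view.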
