efficiently generate complete database dump in text format

2014-10-09 Thread Gaurav Bhatnagar
Hi, We have a Cassandra column family containing 320 million rows, and each row contains about 15 columns. We want to take a monthly dump of this single column family in text format. We are planning to take the following approach to implement this functionality: 1.

Re: efficiently generate complete database dump in text format

2014-10-09 Thread Paulo Ricardo Motta Gomes
The best way to generate dumps from Cassandra is via the Hadoop integration (or Spark). You can find more info here: http://www.datastax.com/documentation/cassandra/2.1/cassandra/configuration/configHadoop.html and http://wiki.apache.org/cassandra/HadoopSupport
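For anyone who wants to see the general idea behind a parallel full scan without a cluster framework, here is a rough sketch. It is not how the Hadoop or Spark integration is implemented; it only illustrates splitting the token ring into slices that separate workers could dump independently. It assumes the Murmur3Partitioner, and the names `my_keyspace.my_table`, `id`, `token_ranges`, `range_query` and `to_text_line` are all made up for illustration.

```python
# Sketch only: "my_keyspace.my_table" and "id" are hypothetical names.
MIN_TOKEN = -2**63       # smallest Murmur3Partitioner token
MAX_TOKEN = 2**63 - 1    # largest Murmur3Partitioner token


def token_ranges(splits):
    """Cut the full token ring into `splits` contiguous (start, end) pairs."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // splits for i in range(splits)]
    bounds.append(MAX_TOKEN)
    return list(zip(bounds[:-1], bounds[1:]))


def range_query(table, key, start, end, first=False):
    """Build the CQL for one slice; the first slice includes its lower bound."""
    lower = ">=" if first else ">"
    return (
        f"SELECT * FROM {table} "
        f"WHERE token({key}) {lower} {start} AND token({key}) <= {end}"
    )


def to_text_line(row):
    """Render one row (any sequence of column values) as a tab-separated line."""
    return "\t".join("" if v is None else str(v) for v in row)


# One query per slice; each could be handed to a separate worker.
queries = [
    range_query("my_keyspace.my_table", "id", start, end, first=(i == 0))
    for i, (start, end) in enumerate(token_ranges(4))
]
```

Each query in `queries` could then be executed with whatever driver you use, writing every returned row through `to_text_line` into that worker's own output file. The number of slices (4 here) is arbitrary; for a table of this size you would likely want many more.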

Re: efficiently generate complete database dump in text format

2014-10-09 Thread Daniel Chia
You might also want to consider tools like https://github.com/Netflix/aegisthus for the last step, which can help you deal with tombstones and de-duplicate data. Thanks, Daniel