Juho Mäkinen created CASSANDRA-10757:
----------------------------------------

             Summary: Cluster migration with sstableloader requires significant 
compaction time
                 Key: CASSANDRA-10757
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
             Project: Cassandra
          Issue Type: Improvement
          Components: Compaction, Streaming and Messaging
            Reporter: Juho Mäkinen
            Priority: Minor
             Fix For: 2.1.11


When sstableloader is used to migrate data from one cluster into another, the 
load creates a lot more data and many more sstable files than the original 
cluster had.

For example, in my case a 62-node cluster with 16 TiB of data in 80000 
sstables was loaded with sstableloader into another cluster with 36 nodes, and 
this resulted in 42 TiB of used data in a whopping 350000 sstables.
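
To put those figures in proportion, here is a rough back-of-the-envelope 
sketch in Python. It uses only the numbers quoted above; the variable and 
function names are made up for illustration, and it says nothing about why 
the growth happens.

```python
# Figures quoted in the report (source cluster vs. destination cluster).
SRC_NODES, SRC_TIB, SRC_SSTABLES = 62, 16, 80_000
DST_NODES, DST_TIB, DST_SSTABLES = 36, 42, 350_000

MIB_PER_TIB = 1024 * 1024


def growth(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


def avg_sstable_mib(total_tib, sstable_count):
    """Average sstable size in MiB for a given total data size."""
    return total_tib * MIB_PER_TIB / sstable_count


data_growth = growth(SRC_TIB, DST_TIB)
sstable_growth = growth(SRC_SSTABLES, DST_SSTABLES)
src_tib_per_node = SRC_TIB / SRC_NODES
dst_tib_per_node = DST_TIB / DST_NODES

print(f"data on disk grew      {data_growth:.2f}x")
print(f"sstable count grew     {sstable_growth:.2f}x")
print(f"data per node (TiB)    {src_tib_per_node:.2f} -> {dst_tib_per_node:.2f}")
print(f"avg sstable size (MiB) {avg_sstable_mib(SRC_TIB, SRC_SSTABLES):.0f}"
      f" -> {avg_sstable_mib(DST_TIB, DST_SSTABLES):.0f}")
```

In other words, the destination ends up holding roughly 2.6 times the data in 
roughly 4.4 times as many files, and each destination node carries several 
times more data than a source node did.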

The sstableloader run itself was relatively fast (around 8 hours), but as a 
result the destination cluster needs approximately two weeks of compaction 
(these are C4.4xlarge instances, 16 cores each, compaction running on 14 
cores, 9000 IOPS, 250 MiB/sec sustained disk bandwidth).

Could the sstableloader process somehow be improved to make this kind of 
migration less painful and faster?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
