Juho Mäkinen created CASSANDRA-10757:
----------------------------------------
             Summary: Cluster migration with sstableloader requires significant compaction time
                 Key: CASSANDRA-10757
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
             Project: Cassandra
          Issue Type: Improvement
          Components: Compaction, Streaming and Messaging
            Reporter: Juho Mäkinen
            Priority: Minor
             Fix For: 2.1.11


When sstableloader is used to migrate data from one cluster to another, the load creates far more data and far more sstable files than the original cluster had. For example, in my case a 62-node cluster with 16 TiB of data in 80000 sstables was loaded with sstableloader into another cluster with 36 nodes, and this resulted in 42 TiB of data on disk in a whopping 350000 sstables.

The sstableloader run itself was relatively fast (around 8 hours), but as a result the destination cluster needs approximately two weeks of compaction (these are C4.4xlarge instances, 16 cores each, with compaction running on 14 cores, 9000 IOPS, and 250 MiB/sec sustained disk bandwidth).

Could the sstableloader process somehow be improved to make this kind of migration less painful and faster?
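To put the reported figures side by side, here is a minimal sketch that derives growth ratios and mean sstable sizes from the numbers quoted above. It assumes the 16 TiB and 42 TiB figures are cluster-wide totals. The names `SOURCE`, `DEST`, `summarize` and `growth` are made up for illustration and are not part of Cassandra or sstableloader.

```python
# Figures quoted in the report; everything else is derived from them.
# Assumption: data sizes are totals across the whole cluster.
SOURCE = {"nodes": 62, "data_tib": 16, "sstables": 80_000}
DEST = {"nodes": 36, "data_tib": 42, "sstables": 350_000}

MIB_PER_TIB = 1024 * 1024


def summarize(cluster):
    """Return per-node sstable count and mean sstable size in MiB."""
    return {
        "sstables_per_node": cluster["sstables"] / cluster["nodes"],
        "mean_sstable_mib": cluster["data_tib"] * MIB_PER_TIB / cluster["sstables"],
    }


def growth(source, dest):
    """Return how much data volume and sstable count grew during the load."""
    return {
        "data_growth": dest["data_tib"] / source["data_tib"],
        "sstable_growth": dest["sstables"] / source["sstables"],
    }


if __name__ == "__main__":
    for name, cluster in (("source", SOURCE), ("destination", DEST)):
        stats = summarize(cluster)
        print(
            f"{name}: {stats['sstables_per_node']:.0f} sstables/node, "
            f"mean sstable size {stats['mean_sstable_mib']:.1f} MiB"
        )
    ratios = growth(SOURCE, DEST)
    print(f"data grew {ratios['data_growth']:.3f}x")
    print(f"sstable count grew {ratios['sstable_growth']:.3f}x")
```

With these inputs the data volume grows 2.625x and the sstable count 4.375x, while the mean sstable size drops from roughly 210 MiB to roughly 126 MiB. The destination therefore holds more, smaller files on fewer nodes, which is consistent with the long compaction backlog described in the report.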