Juho Mäkinen created CASSANDRA-10757:
----------------------------------------
Summary: Cluster migration with sstableloader requires significant
compaction time
Key: CASSANDRA-10757
URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
Project: Cassandra
Issue Type: Improvement
Components: Compaction, Streaming and Messaging
Reporter: Juho Mäkinen
Priority: Minor
Fix For: 2.1.11
When sstableloader is used to migrate data from one cluster to another, the
load creates far more data and far more sstable files than the original
cluster had.
For example, in my case a 62-node cluster with 16 TiB of data in 80000
sstables was loaded with sstableloader into another cluster with 36 nodes, and
this resulted in 42 TiB of used data in a whopping 350000 sstables.
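To put those figures in proportion, here is a rough back-of-the-envelope
sketch in Python. It uses only the numbers quoted above; the function and
variable names are made up for illustration, and the per-node and per-sstable
values are plain averages that assume an even spread across nodes, which may
not hold in practice.

```python
# Rough arithmetic on the figures quoted in this report.
# All names are illustrative; nothing here calls Cassandra tooling.

MIB_PER_TIB = 1024 * 1024


def growth_factor(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


def avg_sstable_mib(total_tib: float, sstable_count: int) -> float:
    """Average sstable size in MiB, assuming sizes are spread evenly."""
    return total_tib * MIB_PER_TIB / sstable_count


# Source cluster: 62 nodes, 16 TiB, 80000 sstables.
src_nodes, src_tib, src_sstables = 62, 16, 80_000
# Destination cluster after loading: 36 nodes, 42 TiB, 350000 sstables.
dst_nodes, dst_tib, dst_sstables = 36, 42, 350_000

data_growth = growth_factor(src_tib, dst_tib)
sstable_growth = growth_factor(src_sstables, dst_sstables)

print(f"data volume grew {data_growth:.3f}x")
print(f"sstable count grew {sstable_growth:.3f}x")
print(f"avg sstable size: {avg_sstable_mib(src_tib, src_sstables):.1f} MiB "
      f"-> {avg_sstable_mib(dst_tib, dst_sstables):.1f} MiB")
print(f"avg sstables per node: {src_sstables / src_nodes:.0f} "
      f"-> {dst_sstables / dst_nodes:.0f}")
```

In other words, the load left the destination with more data in a larger
number of smaller files, concentrated on fewer nodes, which is the backlog
that compaction then has to work through.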
The sstableloader run itself was relatively fast (around 8 hours), but as a
result the destination cluster needs approximately two weeks of compaction
(these are C4.4xlarge instances, 16 cores each, with compaction running on 14
cores, 9000 IOPS, and 250 MiB/sec sustained disk bandwidth).
Could the sstableloader process somehow be improved to make this kind of
migration less painful and faster?