[ 
https://issues.apache.org/jira/browse/CASSANDRA-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10757:
------------------------------------------
    Fix Version/s:     (was: 2.1.11)

> Cluster migration with sstableloader requires significant compaction time
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10757
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Streaming and Messaging
>            Reporter: Juho Mäkinen
>            Priority: Minor
>              Labels: sstableloader
>
> When sstableloader is used to migrate data from one cluster into another, the 
> loading creates far more data and far more sstable files than the 
> original cluster had.
> For example, in my case a 62-node cluster with 16 TiB of data in 80000 sstables 
> was loaded with sstableloader into another cluster with 36 nodes, and this 
> resulted in 42 TiB of used data in a whopping 350000 sstables.
> The sstableloader run itself was relatively fast (around 8 hours), but 
> as a result the destination cluster needs approximately two weeks of 
> compaction to reduce the number of sstables back to the original 
> levels. (The instances are C4.4xlarge in EC2, 16 cores each, with compaction 
> running on 14 cores. The EBS disks in each instance provide 9000 IOPS and a 
> maximum of 250 MiB/sec disk bandwidth.)
> Could the sstableloader process somehow be improved to make this kind of 
> migration less painful and faster?
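
To put the figures in the description side by side, here is a rough sketch of the arithmetic. It uses only the numbers quoted in the report; the helper name `summarize` is made up for illustration, and it assumes the TiB figures are cluster-wide totals (the report does not say how replication is counted).

```python
# Figures quoted in the report: (nodes, total data in TiB, total sstables).
SRC_NODES, SRC_TIB, SRC_SSTABLES = 62, 16, 80_000
DST_NODES, DST_TIB, DST_SSTABLES = 36, 42, 350_000

MIB_PER_TIB = 1024 * 1024


def summarize(nodes, tib, sstables):
    """Hypothetical helper: per-node load and average sstable size."""
    return {
        "tib_per_node": tib / nodes,
        "sstables_per_node": sstables / nodes,
        "avg_sstable_mib": tib * MIB_PER_TIB / sstables,
    }


src = summarize(SRC_NODES, SRC_TIB, SRC_SSTABLES)
dst = summarize(DST_NODES, DST_TIB, DST_SSTABLES)

# Cluster-wide growth caused by the load.
data_growth = DST_TIB / SRC_TIB
sstable_growth = DST_SSTABLES / SRC_SSTABLES

print(f"data growth:         {data_growth:.3f}x")
print(f"sstable growth:      {sstable_growth:.3f}x")
print(f"sstables per node:   {src['sstables_per_node']:.0f} -> {dst['sstables_per_node']:.0f}")
print(f"avg sstable size:    {src['avg_sstable_mib']:.1f} MiB -> {dst['avg_sstable_mib']:.1f} MiB")
```

Under these assumptions the destination holds about 2.6 times the data in about 4.4 times as many sstables, so each destination node carries roughly 7.5 times as many sstables as a source node did, and the average sstable is smaller than before.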



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
