Jeff Jirsa commented on CASSANDRA-14210:

[~krummas] - you're probably best equipped to review, any chance you're 

> Optimize SSTables upgrade task scheduling
> -----------------------------------------
>                 Key: CASSANDRA-14210
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14210
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Oleksandr Shulgin
>            Assignee: Kurt Greaves
>            Priority: Major
>             Fix For: 4.x
> When starting the SSTable-rewrite process by running {{nodetool 
> upgradesstables --jobs N}}, with N > 1, not all of the provided N slots are 
> used.
> For example, we were testing with {{concurrent_compactors=5}} and {{N=4}}.  
> What we observed both for version 2.2 and 3.0, is that initially all 4 
> provided slots are used for "Upgrade sstables" compactions, but later when 
> some of the 4 tasks are finished, no new tasks are scheduled immediately.  It 
> takes the last of the 4 tasks to finish before new 4 tasks would be 
> scheduled.  This happens on every node we've observed.
> This doesn't utilize available resources to the full extent allowed by the 
> --jobs N parameter.  In the field, on a cluster of 12 nodes with 4-5 TiB data 
> each, we've seen that the whole process was taking more than 7 days, instead 
> of estimated 1.5-2 days (provided there would be close to full N slots 
> utilization).
> Instead, new tasks should be scheduled as soon as there is a free compaction 
> slot.
> Additionally, starting from the biggest SSTables could further reduce the 
> total time required for the whole process to finish on any given node.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to