> Kurt - We're on 3.7, and our approach was to try throttling compaction
> throughput as much as possible rather than the opposite. I had found some
> resources that suggested unthrottling to let it get it over with, but
> wasn't sure if this would really help in our situation since the I/O pipe
> was already fully saturated.
You should unthrottle during bootstrap, as the node won't receive read queries until it finishes streaming and joins the cluster. It seems unlikely that you'd be bottlenecked on I/O during the bootstrapping process; if you were, you'd certainly have bigger problems. The aim is to clear out the majority of compactions *before* the node joins and starts servicing reads.

You might also want to increase concurrent_compactors. The typical advice is to match the number of CPU cores, but you might want to raise it for the bootstrapping period.

sstableofflinerelevel could help, but I wouldn't count on it. Usage is pretty straightforward, but you may find that a lot of the existing SSTables in L0 just get put back into L0 anyway, which is where the main compaction backlog comes from. Plus, you have to take the node offline, which may not be ideal. In this case I'd suggest the strategy Lerh suggested as more viable.

Regardless, if the rest of your nodes are OK (and you're not running RF=1 or reading at CL=ALL), Cassandra should route around the slow node pretty effectively, so a single node backed up on compactions shouldn't be a big deal.
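As a rough sketch, the unthrottle-then-restore cycle might look like the following on a 3.x node. The value 16 for the restore step is an assumption (it's the stock default for compaction_throughput_mb_per_sec; use whatever your cluster normally runs with), and note that on 3.7 concurrent_compactors is a cassandra.yaml setting, so changing it needs a config edit and restart rather than a nodetool call:

```shell
# While the node is bootstrapping, lift the compaction throughput cap
# entirely (0 means unthrottled).
nodetool setcompactionthroughput 0

# Watch the backlog drain before the node joins and starts serving reads.
nodetool compactionstats

# concurrent_compactors lives in cassandra.yaml on 3.7; e.g. on an
# 8-core box you might set (restart required):
#   concurrent_compactors: 8

# Once the node has joined and compactions have caught up, restore the
# usual throttle (16 MB/s shown here as the stock default).
nodetool setcompactionthroughput 16
```

Since the change made via nodetool doesn't survive a restart, restoring the throttle explicitly only matters if the node stays up; after a restart it reverts to whatever cassandra.yaml says.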