[ 
https://issues.apache.org/jira/browse/CASSANDRA-16315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448304#comment-17448304
 ] 

Yury Vidineev commented on CASSANDRA-16315:
-------------------------------------------

+1. I have often seen increasing concurrent_compactors actually decrease the 
total compaction speed, even when disk IO wasn't the bottleneck.

> Remove bad advice on concurrent compactors from cassandra.yaml
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-16315
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16315
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Jeremy Hanna
>            Priority: Normal
>
> Since CASSANDRA-7551, we gave the following advice for setting 
> {{concurrent_compactors}}:
> {code}
> # If your data directories are backed by SSD, you should increase this
> # to the number of cores.
> {code}
> However, in practice there are a number of problems with this.  While it's 
> true that one can increase {{concurrent_compactors}} to improve the efficiency 
> of compactions on machines with more CPU cores, the context switching with 
> random IO and the GC associated with bringing compaction data into the heap 
> will work against the additional parallelism.
> This has caused problems for those who have taken this advice literally.
> I propose that we adjust this language to give a limit on the number of 
> {{concurrent_compactors}}, both in the 3.x line and in trunk, 
> so that new users do not stumble when reviewing whether to change defaults.
> See also CASSANDRA-7139 for a discussion on considerations.
> I see two short-term options to avoid new user pain:
> 1. Change the language to say something like this:
> {quote}
> When using SSD based storage, you can increase the number of 
> {{concurrent_compactors}}.  However, be aware that using too many concurrent 
> compactors can have detrimental effects such as GC pressure, more context 
> switching among compactors and realtime operations, and more random IO 
> pulling data for different compactions.  It's best to test and measure with 
> your workload and hardware.
> {quote}
> 2. Do some significant testing of compaction efficiency and read/write 
> latency/throughput targets to see where the tipping point is - considering 
> some constants around memory and heap size and configuration to keep it 
> simple.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

