[
https://issues.apache.org/jira/browse/CASSANDRA-16315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448304#comment-17448304
]
Yury Vidineev commented on CASSANDRA-16315:
-------------------------------------------
+1. I've often seen cases where increasing concurrent_compactors
actually decreased the total compaction speed, and disk IO wasn't the bottleneck.
> Remove bad advice on concurrent compactors from cassandra.yaml
> --------------------------------------------------------------
>
> Key: CASSANDRA-16315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16315
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Jeremy Hanna
> Priority: Normal
>
> Since CASSANDRA-7551, we gave the following advice for setting
> {{concurrent_compactors}}:
> {code}
> # If your data directories are backed by SSD, you should increase this
> # to the number of cores.
> {code}
> However, in practice there are a number of problems with this. While it's
> true that one can increase {{concurrent_compactors}} to improve efficiency of
> compactions on machines with more CPU cores, the context switching with
> random IO and GC associated with bringing compaction data into the heap will
> work against the additional parallelism.
> This has caused problems for those who have taken this advice literally.
> I propose that we adjust this language to give a limit on number of
> {{concurrent_compactors}} for this setting both in the 3.x line and in trunk
> so that new users do not stumble when reviewing whether to change defaults.
> See also CASSANDRA-7139 for a discussion on considerations.
> I see two short-term options to avoid new user pain:
> 1. Change the language to say something like this:
> {quote}
> When using SSD based storage, you can increase the number of
> {{concurrent_compactors}}. However be aware that using too many concurrent
> compactors can have a detrimental effect such as GC pressure, more context
> switching among compactors and realtime operations, and more random IO
> pulling data for different compactions. It's best to test and measure with
> your workload and hardware.
> {quote}
> 2. Do some significant testing of compaction efficiency and read/write
> latency/throughput targets to see where the tipping point is - considering
> some constants around memory and heap size and configuration to keep it
> simple.
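>
> As a rough illustration of option 1 (a sketch only, not committed wording),
> the comment in cassandra.yaml could read something like the following. The
> commented-out value is a placeholder, not a recommendation:
> {code}
> # When using SSD-based storage, you can increase concurrent_compactors.
> # However, too many concurrent compactors can be detrimental: GC pressure,
> # more context switching among compactors and realtime operations, and
> # more random IO pulling data for different compactions.
> # Test and measure with your workload and hardware.
> #concurrent_compactors: 1
> {code}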