Re: compaction throughput

PenguinWhispererThe . Thu, 21 Jan 2016 05:29:26 -0800

Thanks for that clarification, Sebastian! That's really good to know! I
never considered increasing this value because of my previous
experience.


In my case I had a table that was compacting over and over... and only one
CPU was used. That made me believe compaction was not multithreaded (I
believe I actually asked about this on IRC, but that was a few months ago,
so I might be wrong).

Have there been behavioral changes on this lately? (I was using 2.0.9 or
2.0.11 I believe).

2016-01-21 14:15 GMT+01:00 Sebastian Estevez <sebastian.este...@datastax.com>:

> >So compaction of one table will NOT spread over different cores.
>
> This is not exactly true. You actually can have multiple compactions
> running at the same time on the same table; it just doesn't happen all that
> often. You essentially would have to have two sets of sstables that are
> both eligible for compaction at the same time.
>
> all the best,
>
> Sebastián
> On Jan 21, 2016 7:41 AM, "PenguinWhispererThe ." <
> th3penguinwhispe...@gmail.com> wrote:
>
>> After having some issues with compaction myself, I think it's worth
>> stating explicitly that compaction of a table can only run on one CPU. So
>> compaction of one table will NOT spread over different cores.
>> To really make use of concurrent_compactors you need to have multiple
>> table compactions initiated at the same time. If those are small they'll
>> finish much earlier, leaving only one core at 100%, as compaction is
>> generally CPU bound (unless your disks can't keep up).
>> I believe it's better for compaction to be CPU bound on one core (or at
>> least not on all of them) than disk IO bound, as the latter would hurt
>> the performance of reads and writes.
>> Compaction is a maintenance task so it shouldn't be eating all your
>> resources.
>>
>>
>> 2016-01-16 0:18 GMT+01:00 Kai Wang <dep...@gmail.com>:
>>
>>> Jeff & Sebastian,
>>>
>>> Thanks for the reply. There are 12 cores but in my case C* only uses one
>>> core most of the time. *nodetool compactionstats* shows there's only
>>> one compactor running. I can see the C* process only uses one core. So I
>>> guess I should've asked the question more clearly:
>>>
>>> 1. Is ~25 MB/s a reasonable compaction throughput for one core?
>>> 2. Is there any configuration that affects single core compaction
>>> throughput?
>>> 3. Is concurrent_compactors the only option to parallelize compaction?
>>> If so, I guess it's the compaction strategy itself that decides when to
>>> parallelize and when to block on one core. Then there's not much we can do
>>> here.
>>>
>>> Thanks.
>>>
>>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>>> wrote:
>>>
>>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>>>> (depending on other load).  How many CPU cores do you have?
>>>>
>>>>
>>>> From: Kai Wang
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Friday, January 15, 2016 at 12:53 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: compaction throughput
>>>>
>>>> Hi,
>>>>
>>>> I am trying to figure out the bottleneck of compaction on my node. The
>>>> node runs CentOS 7 and has SSDs installed. The table is configured to use
>>>> LCS. Here are my compaction-related settings in cassandra.yaml:
>>>>
>>>> compaction_throughput_mb_per_sec: 160
>>>> concurrent_compactors: 4
>>>>
>>>> I insert about 10 GB of data and start observing compaction.
>>>>
>>>> *nodetool compactionstats* shows that most of the time there is one
>>>> compaction. Sometimes there are 3-4 (I suppose this is controlled by
>>>> concurrent_compactors). During the compaction, I see one CPU core at 100%.
>>>> At that point, disk IO is about 20-25 MB/s write, which is much lower than
>>>> the disk is capable of. Even when there are 4 compactions running, I see
>>>> CPU go to +400% but disk IO is still at 20-25 MB/s write. I use *nodetool
>>>> setcompactionthroughput 0* to disable the compaction throttle but
>>>> don't see any difference.
>>>>
>>>> Does this mean compaction is CPU bound? If so, 20 MB/s is kinda low. Is
>>>> there any way to improve the throughput?
>>>>
>>>> Thanks.
>>>>
>>>
>>>
>>
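
Editor's sketch (not part of the original messages): the rule of thumb quoted
in the thread, up to 0.8-1 compactor per core on SSDs, can be turned into a
small calculation. The function name `compactor_range` and its parameters are
hypothetical and exist only for this illustration; the result is a candidate
range for concurrent_compactors, not a guarantee of higher throughput (the
original poster saw disk IO stay at 20-25 MB/s even with 4 compactions).

```python
import math


def compactor_range(cores, low=0.8, high=1.0):
    """Return a (min, max) candidate range for concurrent_compactors.

    Assumption: applies the "0.8-1 compactor per core" rule of thumb
    from the thread, rounding down and never going below 1.
    """
    if cores < 1:
        raise ValueError("cores must be >= 1")
    lower = max(1, math.floor(cores * low))
    upper = max(lower, math.floor(cores * high))
    return lower, upper


# The node discussed in the thread has 12 cores.
print(compactor_range(12))
```

Where in that range to land depends on the other load on the node, as the
original recommendation notes.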
