I had to decrease the streaming throughput to 10 (from the default 200) in
order to avoid the effect of a rising number of SSTables and compaction
tasks while running repair. It's working very slowly, but it's stable and
doesn't hurt the whole cluster. I will try to adjust the configuration
gradually to see if I can make it any better. Thanks!
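
For reference, this is roughly what it boils down to (the value is in
megabits per second, matching stream_throughput_outbound_megabits_per_sec
in cassandra.yaml):

nodetool getstreamthroughput
nodetool setstreamthroughput 10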

On Thu, Feb 11, 2016 at 8:10 PM, Michał Łowicki <mlowi...@gmail.com> wrote:

>
>
> On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ <arodr...@gmail.com>
> wrote:
>
>> Also, are you using incremental repairs (not sure about the available
>> options in Spotify Reaper)? What command did you run?
>>
>>
> No.
>
>
>> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>
>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses
>>>
>>>
>>>
>>> What is your current compaction throughput? The current value of
>>> 'concurrent_compactors' (cassandra.yaml or through JMX)?
>>>
>>
>
> Compaction throughput was initially set to 1024 and I've gradually
> increased it to 2048, 4K and 16K but haven't seen any change. I tried
> changing it both via `nodetool` and in cassandra.yaml (restarting after
> changes).
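
For reference, these are roughly the knobs I touched (16384 is just the
last value I tried; the nodetool change applies immediately but doesn't
survive a restart, hence also editing cassandra.yaml):

nodetool setcompactionthroughput 16384
nodetool getcompactionthroughput

compaction_throughput_mb_per_sec: 16384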
>
>
>>
>>> nodetool getcompactionthroughput
>>>
>>>> How to speed up compaction? I increased compaction throughput and
>>>> concurrent compactors but saw no change. There seem to be plenty of
>>>> idle resources but I can't force C* to use them.
>>>>
>>>
>>> You might want to try un-throttling the compaction throughput with:
>>>
>>> nodetool setcompactionthroughput 0
>>>
>>> Choose a canary node. Monitor pending compactions and disk throughput
>>> (make sure the server is ok too - CPU...)
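
On the canary node I plan to watch something along these lines:

nodetool compactionstats   (pending / active compaction tasks)
iostat -dx 5               (SSD utilization and throughput)
top                        (CPU)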
>>>
>>
>
> Yes, I'll try it out, but since increasing it 16 times didn't help I'm a
> bit sceptical about it.
>
>
>>
>>> Some other information could be useful:
>>>
>>> What is your number of cores per machine and the compaction strategies
>>> for the 'most compacting' tables? What are the write/update patterns,
>>> any TTLs or tombstones? Do you use a high number of vnodes?
>>>
>>
> I'm using bare-metal boxes, 40 CPUs, 64GB RAM and 2 SSDs each. num_tokens
> is set to 256.
>
> Using LCS for all tables. Write/update heavy. No warnings about a large
> number of tombstones, but we're removing items frequently.
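
The strategy and SSTable counts can be double-checked per table like this
(keyspace and table names below are placeholders):

cqlsh -e "DESCRIBE TABLE my_ks.my_table"   (shows the compaction strategy)
nodetool cfstats my_ks.my_table            (SSTable count, tombstone stats)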
>
>
>
>>
>>> Also, what is your repair routine and your values for gc_grace_seconds?
>>> When was your last repair, and do you think your cluster is suffering
>>> from high entropy?
>>>
>>
> We've been having problems with repair for months (CASSANDRA-9935).
> gc_grace_seconds is set to 345600 now. Yes, as we haven't run it
> successfully for a long time, I guess the cluster is suffering from high
> entropy.
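
The per-table value can be verified in cqlsh (keyspace and table names are
placeholders):

cqlsh -e "DESCRIBE TABLE my_ks.my_table" | grep gc_grace_seconds

As far as I understand, deletes have to be repaired within gc_grace_seconds,
otherwise tombstones can be purged before reaching all replicas and deleted
data may reappear.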
>
>
>>
>>> You can lower the stream throughput to make sure nodes can cope with
>>> what repairs are feeding them.
>>>
>>> nodetool getstreamthroughput
>>> nodetool setstreamthroughput X
>>>
>>
> Yes, this sounds interesting. As we've been having problems with repair
> for months, it could be that lots of data is being transferred between
> nodes.
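
Active streaming sessions (files and bytes transferred per node) can be
watched with:

nodetool netstats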
>
> Thanks!
>
>
>>
>>> C*heers,
>>>
>>> -----------------
>>> Alain Rodriguez
>>> France
>>>
>>> The Last Pickle
>>> http://www.thelastpickle.com
>>>
>>> 2016-02-11 16:55 GMT+01:00 Michał Łowicki <mlowi...@gmail.com>:
>>>
>>>> Hi,
>>>>
>>>> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
>>>> using Cassandra Reaper, but after a couple of hours nodes are full of
>>>> pending compaction tasks (regular ones, not validation compactions).
>>>>
>>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>>>
>>>> How to speed up compaction? I increased compaction throughput and
>>>> concurrent compactors but saw no change. There seem to be plenty of
>>>> idle resources but I can't force C* to use them.
>>>>
>>>> Any clue where there might be a bottleneck?
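
The pending tasks mentioned above show up in:

nodetool compactionstats   (pending compaction tasks)
nodetool tpstats           (CompactionExecutor active / pending / blocked)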
>>>>
>>>>
>>>> --
>>>> BR,
>>>> Michał Łowicki
>>>>
>>>>
>>>
>>
>
>
> --
> BR,
> Michał Łowicki
>



-- 
BR,
Michał Łowicki
