Re: Increase compaction performance

2016-05-20 Thread Fabrice Facorat
@Alain:

Indeed, when repairing (or bootstrapping) all sstables end up in L0, as the
original level is not passed down to the node. So Cassandra ends up
compacting a lot of sstables in L0 before it can move them to upper
levels.

The issue still exists in 2.1 and is even worse as you have fewer concurrent
compactors available (we had 24 with our 24-core servers).

Presently (with Cassandra 2.1) we set concurrent compactors to 4 and
compaction throughput to 128 MB/s, as this is the setup that uses the least CPU.
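Roughly, that setup is (the cassandra.yaml change needs a restart):

~$ nodetool setcompactionthroughput 128   # MB/s, runtime change
# plus, in cassandra.yaml:
# concurrent_compactors: 4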

I will try to reduce the number of replicas sending sstables to the node
by switching to sequential repairs (we were using parallel repairs), and if
this is not enough, we will try to throttle stream throughput (and also
reduce the number of LCS tables).
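For the record, the plan is roughly the following on each node (the keyspace
name is just a placeholder; on 2.1 repair is sequential unless -par is passed):

~$ nodetool repair -pr my_keyspace      # sequential, primary-range repair
~$ nodetool setstreamthroughput 50      # throttle streaming (Mb/s) if still needed
~$ nodetool getstreamthroughput         # check the current value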

Thanks :)



2016-03-10 11:26 GMT+01:00 Alain RODRIGUEZ :

> Hi Michal,
>
> Sorry about the delay answering here.
>
> The value you gave (10) looks a lot like what I had to do in the past in
> the cluster I managed. I described the issue here:
> https://issues.apache.org/jira/browse/CASSANDRA-9509
>
> A few people hit this issue already. Hope you were able to complete the
> first repair successfully without harming the cluster too much.
>
> @Fabrice,
>
> People answering the list probably missed your post as it was on an open
> thread, and I have been out, so I missed it too.
>
> I would:
>
> Set concurrent compactors to 8 (max) - can be updated through JMX
> Set compaction throughput to 32, 64 or even 0 (go incrementally, and on one
> node first). Use 0 only if you have SSDs, otherwise you'll probably make disk
> throughput a bottleneck - can be updated through nodetool
> Set stream throughput to 10 Mb/s - can be updated through nodetool (by the
> way it is Mb and not MB)
>
> Monitor resources and the number of sstables, and see how it goes (rough
> command sketch below).
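> As a rough sketch of the above (the JMX attribute names are from memory, so
> double check them on your version):
>
> nodetool setcompactionthroughput 32   # MB/s - go incrementally, one node first
> nodetool setstreamthroughput 10       # Mb/s
> # Concurrent compactors: no nodetool command for this in 2.0/2.1, change it
> # through JMX (org.apache.cassandra.db:type=CompactionManager,
> # CoreCompactorThreads / MaximumCompactorThreads) or set concurrent_compactors
> # in cassandra.yaml and restart.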
>
> You're also probably hitting
> https://issues.apache.org/jira/browse/CASSANDRA-9509.
>
> Also, using LCS, I read (but was not able to find the reference or the fix
> version) that repaired data is put back in L0, inducing even more
> compactions. I have no more info about this, but upgrading to the latest 2.0
> is needed, and I would probably go to the latest 2.1, as a lot of stuff around
> repairs was fixed there and Cassandra 2.0 is no longer supported.
>
> Hope you will find a way to mitigate this though, or already have. Bonne
> chance ;-).
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
>
>
> 2016-03-04 16:55 GMT+01:00 Fabrice Facorat :
>
>> Any news on this ?
>>
>> We also have issues during repairs when using many LCS tables. We end
>> up with 8k sstables, many pending tasks and dropped mutations
>>
>> We are using Cassandra 2.0.10, on a 24 cores server, with
>> multithreaded compactions enabled.
>>
>> ~$ nodetool getstreamthroughput
>> Current stream throughput: 200 MB/s
>>
>> ~$ nodetool getcompactionthroughput
>> Current compaction throughput: 16 MB/s
>>
>> Most sstables are tiny 4K or 8K/12K sstables:
>>
>> ~$ ls -sh /var/lib/cassandra/data//xxx/*-Data.db | grep -Ev 'M' | wc
>> -l
>> 7405
>> ~$ ls -sh /var/lib/cassandra/data//xxx/*-Data.db | wc -l
>> 7440
>>
>> ~$ ls -sh /var/lib/cassandra/data//xxx/*-Data.db | grep -Ev 'M' |
>> cut -f1 -d" " | sort | uniq -c
>>  36
>>7003 4.0K
>> 396 8.0K
>>
>>
>> Pool NameActive   Pending  Completed   Blocked
>>  All time blocked
>> ReadStage 0 0  258098148 0
>> 0
>> RequestResponseStage  0 0  613994884 0
>> 0
>> MutationStage 0 0  332242206 0
>> 0
>> ReadRepairStage   0 03360040 0
>> 0
>> ReplicateOnWriteStage 0 0  0 0
>> 0
>> GossipStage   0 02471033 0
>> 0
>> CacheCleanupExecutor  0 0  0 0
>> 0
>> MigrationStage0 0  0 0
>> 0
>> MemoryMeter   0 0  25160 0
>> 0
>> FlushWriter   1 1 134083 0
>>   521
>> ValidationExecutor1 1  89514 0
>> 0
>> InternalResponseStage 0 0  0 0
>> 0
>> AntiEntropyStage  0 0 636471 0
>> 0
>> MemtablePostFlusher   1 1 334667 0
>> 0
>> MiscStage 0 0  0 0
>> 0
>> PendingRangeCalculator0 0  

Re: Increase compaction performance

2016-03-04 Thread Fabrice Facorat
Any news on this?

We also have issues during repairs when using many LCS tables. We end
up with 8k sstables, many pending tasks, and dropped mutations.

We are using Cassandra 2.0.10, on a 24-core server, with
multithreaded compaction enabled.

~$ nodetool getstreamthroughput
Current stream throughput: 200 MB/s

~$ nodetool getcompactionthroughput
Current compaction throughput: 16 MB/s

Most sstables are tiny, 4K or 8K/12K:

~$ ls -sh /var/lib/cassandra/data//xxx/*-Data.db | grep -Ev 'M' | wc -l
7405
~$ ls -sh /var/lib/cassandra/data//xxx/*-Data.db | wc -l
7440

~$ ls -sh /var/lib/cassandra/data//xxx/*-Data.db | grep -Ev 'M' | cut -f1 -d" " | sort | uniq -c
     36
   7003 4.0K
    396 8.0K


Pool Name                      Active   Pending    Completed   Blocked  All time blocked
ReadStage                           0         0    258098148         0                 0
RequestResponseStage                0         0    613994884         0                 0
MutationStage                       0         0    332242206         0                 0
ReadRepairStage                     0         0      3360040         0                 0
ReplicateOnWriteStage               0         0            0         0                 0
GossipStage                         0         0      2471033         0                 0
CacheCleanupExecutor                0         0            0         0                 0
MigrationStage                      0         0            0         0                 0
MemoryMeter                         0         0        25160         0                 0
FlushWriter                         1         1       134083         0               521
ValidationExecutor                  1         1        89514         0                 0
InternalResponseStage               0         0            0         0                 0
AntiEntropyStage                    0         0       636471         0                 0
MemtablePostFlusher                 1         1       334667         0                 0
MiscStage                           0         0            0         0                 0
PendingRangeCalculator              0         0          181         0                 0
commitlog_archiver                  0         0            0         0                 0
CompactionExecutor                        24245241768                0                 0
AntiEntropySessions                 0         0        15184         0                 0
HintedHandoff                       0         0          278         0                 0

Message type   Dropped
RANGE_SLICE  0
READ_REPAIR267
PAGED_RANGE  0
BINARY   0
READ 0
MUTATION150970
_TRACE   0
REQUEST_RESPONSE 0
COUNTER_MUTATION 0


2016-02-12 20:08 GMT+01:00 Michał Łowicki :
> I had to decrease streaming throughput to 10 (from default 200) in order to
> avoid effect or rising number of SSTables and number of compaction tasks
> while running repair. It's working very slow but it's stable and doesn't
> hurt the whole cluster. Will try to adjust configuration gradually to see if
> can make it any better. Thanks!
>
> On Thu, Feb 11, 2016 at 8:10 PM, Michał Łowicki  wrote:
>>
>>
>>
>> On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ 
>> wrote:
>>>
>>> Also, are you using incremental repairs (not sure about the available
>>> options in Spotify Reaper) what command did you run ?
>>>
>>
>> No.
>>
>>>
>>> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ :
>
> CPU load is fine, SSD disks below 30% utilization, no long GC pauses



 What is your current compaction throughput ?  The current value of
 'concurrent_compactors' (cassandra.yaml or through JMX) ?
>>
>>
>>
>> Throughput was initially set to 1024 and I've gradually increased it to
>> 2048, 4K and 16K but haven't seen any changes. Tried to change it both from
>> `nodetool` and also cassandra.yaml (with restart after changes).
>>


 nodetool getcompactionthroughput

> How to speed up compaction? Increased compaction throughput and
> concurrent compactors but no change. Seems there is plenty idle resources
> but can't force C* to use it.


 You might want to try un-throttling the compaction throughput:

 nodetool setcompactionthroughput 0

 Choose a canary node. Monitor pending compactions and disk throughput
 (make sure the server is OK too - CPU...)
>>
>>
>>
>> Yes, I'll try it out but if increasing it 16 times didn't help I'm a bit
>> sceptical about it.
>>


 Some other information could be useful:

 What is your number of cores per machine and the compaction strategies
 for the 'most compacting' tables. What are write/update patterns, any TTL 
 

Re: Increase compaction performance

2016-02-12 Thread Michał Łowicki
I had to decrease streaming throughput to 10 (from the default 200) in order to
avoid the effect of a rising number of SSTables and compaction tasks
while running repair. It's working very slowly, but it's stable and doesn't
hurt the whole cluster. Will try to adjust the configuration gradually to see
if I can make it any better. Thanks!
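For reference, the runtime way to do that is simply:

nodetool setstreamthroughput 10   # Mb/s, down from the default 200
nodetool getstreamthroughput      # confirm the new value

or stream_throughput_outbound_megabits_per_sec in cassandra.yaml to make it
permanent.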

On Thu, Feb 11, 2016 at 8:10 PM, Michał Łowicki  wrote:

>
>
> On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ 
> wrote:
>
>> Also, are you using incremental repairs (not sure about the available
>> options in Spotify Reaper) what command did you run ?
>>
>>
> No.
>
>
>> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ :
>>
>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses
>>>
>>>
>>>
>>> What is your current compaction throughput ?  The current value of
>>> 'concurrent_compactors' (cassandra.yaml or through JMX) ?
>>>
>>
>
> Throughput was initially set to 1024 and I've gradually increased it to
> 2048, 4K and 16K but haven't seen any changes. Tried to change it both from
> `nodetool` and also cassandra.yaml (with restart after changes).
>
>
>>
>>> nodetool getcompactionthroughput
>>>
>>> How to speed up compaction? Increased compaction throughput and
 concurrent compactors but no change. Seems there is plenty idle
 resources but can't force C* to use it.

>>>
>>> You might want to try un-throttling the compaction throughput:
>>>
>>> nodetool setcompactionthroughput 0
>>>
>>> Choose a canary node. Monitor pending compactions and disk throughput
>>> (make sure the server is OK too - CPU...)
>>>
>>
>
> Yes, I'll try it out but if increasing it 16 times didn't help I'm a bit
> sceptical about it.
>
>
>>
>>> Some other information could be useful:
>>>
>>> What is your number of cores per machine and the compaction strategies
>>> for the 'most compacting' tables. What are write/update patterns, any TTL
>>> or tombstones ? Do you use a high number of vnodes ?
>>>
>>
> I'm using bare-metal box, 40CPU, 64GB, 2 SSD each. num_tokens is set to
> 256.
>
> Using LCS for all tables. Write / update heavy. No warnings about large
> number of tombstones but we're removing items frequently.
>
>
>
>>
>>> Also what is your repair routine and your values for gc_grace_seconds ?
>>> When was your last repair and do you think your cluster is suffering of a
>>> high entropy ?
>>>
>>
> We're having problem with repair for months (CASSANDRA-9935).
> gc_grace_seconds is set to 345600 now. Yes, as we haven't launched it
> successfully for long time I guess cluster is suffering of high entropy.
>
>
>>
>>> You can lower the stream throughput to make sure nodes can cope with
>>> what repairs are feeding them.
>>>
>>> nodetool getstreamthroughput
>>> nodetool setstreamthroughput X
>>>
>>
> Yes, this sounds interesting. As we're having problem with repair for
> months it could that lots of things are transferred between nodes.
>
> Thanks!
>
>
>>
>>> C*heers,
>>>
>>> -
>>> Alain Rodriguez
>>> France
>>>
>>> The Last Pickle
>>> http://www.thelastpickle.com
>>>
>>> 2016-02-11 16:55 GMT+01:00 Michał Łowicki :
>>>
 Hi,

 Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
 using Cassandra Reaper but nodes after couple of hours are full of pending
 compaction tasks (regular not the ones about validation)

 CPU load is fine, SSD disks below 30% utilization, no long GC pauses.

 How to speed up compaction? Increased compaction throughput and
 concurrent compactors but no change. Seems there is plenty idle
 resources but can't force C* to use it.

 Any clue where there might be a bottleneck?


 --
 BR,
 Michał Łowicki


>>>
>>
>
>
> --
> BR,
> Michał Łowicki
>



-- 
BR,
Michał Łowicki


Re: Increase compaction performance

2016-02-11 Thread Michał Łowicki
On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ  wrote:

> Also, are you using incremental repairs (not sure about the available
> options in Spotify Reaper) what command did you run ?
>
>
No.


> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ :
>
>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses
>>
>>
>>
>> What is your current compaction throughput ?  The current value of
>> 'concurrent_compactors' (cassandra.yaml or through JMX) ?
>>
>

The throughput was initially set to 1024 and I've gradually increased it to
2048, 4K and 16K, but haven't seen any change. I tried to change it both from
`nodetool` and in cassandra.yaml (with a restart after changes).
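That is, something along the lines of:

nodetool setcompactionthroughput 2048
# or in cassandra.yaml, followed by a restart:
# compaction_throughput_mb_per_sec: 2048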


>
>> nodetool getcompactionthroughput
>>
>> How to speed up compaction? Increased compaction throughput and
>>> concurrent compactors but no change. Seems there is plenty idle
>>> resources but can't force C* to use it.
>>>
>>
>> You might want to try un-throttling the compaction throughput:
>>
>> nodetool setcompactionthroughput 0
>>
>> Choose a canary node. Monitor pending compactions and disk throughput
>> (make sure the server is OK too - CPU...)
>>
>

Yes, I'll try it out, but if increasing it 16 times didn't help, I'm a bit
sceptical about it.


>
>> Some other information could be useful:
>>
>> What is your number of cores per machine and the compaction strategies
>> for the 'most compacting' tables. What are write/update patterns, any TTL
>> or tombstones ? Do you use a high number of vnodes ?
>>
>
I'm using bare-metal boxes, 40 CPUs, 64 GB RAM, 2 SSDs each. num_tokens is set
to 256.

Using LCS for all tables. Write/update heavy. No warnings about a large
number of tombstones, but we're removing items frequently.



>
>> Also what is your repair routine and your values for gc_grace_seconds ?
>> When was your last repair and do you think your cluster is suffering of a
>> high entropy ?
>>
>
We've been having problems with repair for months (CASSANDRA-9935).
gc_grace_seconds is set to 345600 now. Yes, as we haven't run it
successfully for a long time, I guess the cluster is suffering from high entropy.


>
>> You can lower the stream throughput to make sure nodes can cope with what
>> repairs are feeding them.
>>
>> nodetool getstreamthroughput
>> nodetool setstreamthroughput X
>>
>
Yes, this sounds interesting. As we've been having problems with repair for
months, it could be that lots of data is being transferred between nodes.

Thanks!


>
>> C*heers,
>>
>> -
>> Alain Rodriguez
>> France
>>
>> The Last Pickle
>> http://www.thelastpickle.com
>>
>> 2016-02-11 16:55 GMT+01:00 Michał Łowicki :
>>
>>> Hi,
>>>
>>> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
>>> using Cassandra Reaper but nodes after couple of hours are full of pending
>>> compaction tasks (regular not the ones about validation)
>>>
>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>>
>>> How to speed up compaction? Increased compaction throughput and
>>> concurrent compactors but no change. Seems there is plenty idle
>>> resources but can't force C* to use it.
>>>
>>> Any clue where there might be a bottleneck?
>>>
>>>
>>> --
>>> BR,
>>> Michał Łowicki
>>>
>>>
>>
>


-- 
BR,
Michał Łowicki


Increase compaction performance

2016-02-11 Thread Michał Łowicki
Hi,

Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair using
Cassandra Reaper, but after a couple of hours nodes are full of pending
compaction tasks (regular ones, not the ones about validation).

CPU load is fine, SSD disks below 30% utilization, no long GC pauses.

How to speed up compaction? I increased compaction throughput and concurrent
compactors but saw no change. It seems there are plenty of idle resources but
I can't force C* to use them.

Any clue where there might be a bottleneck?


-- 
BR,
Michał Łowicki


Re: Increase compaction performance

2016-02-11 Thread Alain RODRIGUEZ
Also, are you using incremental repairs (not sure about the available
options in Spotify Reaper)? What command did you run?

2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ :

> CPU load is fine, SSD disks below 30% utilization, no long GC pauses
>
>
>
> What is your current compaction throughput ?  The current value of
> 'concurrent_compactors' (cassandra.yaml or through JMX) ?
>
> nodetool getcompactionthroughput
>
> How to speed up compaction? Increased compaction throughput and concurrent
>> compactors but no change. Seems there is plenty idle resources but can't
>> force C* to use it.
>>
>
> You might want to try un-throttling the compaction throughput:
>
> nodetool setcompactionthroughput 0
>
> Choose a canary node. Monitor pending compactions and disk throughput (make
> sure the server is OK too - CPU...)
>
> Some other information could be useful:
>
> What is your number of cores per machine and the compaction strategies for
> the 'most compacting' tables. What are write/update patterns, any TTL or
> tombstones ? Do you use a high number of vnodes ?
>
> Also what is your repair routine and your values for gc_grace_seconds ?
> When was your last repair and do you think your cluster is suffering of a
> high entropy ?
>
> You can lower the stream throughput to make sure nodes can cope with what
> repairs are feeding them.
>
> nodetool getstreamthroughput
> nodetool setstreamthroughput X
>
> C*heers,
>
> -
> Alain Rodriguez
> France
>
> The Last Pickle
> http://www.thelastpickle.com
>
> 2016-02-11 16:55 GMT+01:00 Michał Łowicki :
>
>> Hi,
>>
>> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
>> using Cassandra Reaper but nodes after couple of hours are full of pending
>> compaction tasks (regular not the ones about validation)
>>
>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>
>> How to speed up compaction? Increased compaction throughput and
>> concurrent compactors but no change. Seems there is plenty idle
>> resources but can't force C* to use it.
>>
>> Any clue where there might be a bottleneck?
>>
>>
>> --
>> BR,
>> Michał Łowicki
>>
>>
>


Re: Increase compaction performance

2016-02-11 Thread Alain RODRIGUEZ
>
> CPU load is fine, SSD disks below 30% utilization, no long GC pauses



What is your current compaction throughput? And the current value of
'concurrent_compactors' (in cassandra.yaml or through JMX)?

nodetool getcompactionthroughput
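For concurrent_compactors, check cassandra.yaml or the CompactionManager MBean
(attribute names from memory, verify them in your JMX console):

# cassandra.yaml (commented out means the default is derived from the hardware)
# concurrent_compactors: 8
# JMX: org.apache.cassandra.db:type=CompactionManager -> CoreCompactorThreads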

> How to speed up compaction? Increased compaction throughput and concurrent
> compactors but no change. Seems there is plenty idle resources but can't
> force C* to use it.
>

You might want to try un-throttling the compaction throughput:

nodetool setcompactionthroughput 0

Choose a canary node. Monitor pending compactions and disk throughput (make
sure the server is OK too - CPU...)
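A minimal way to monitor that node could be:

nodetool compactionstats   # pending compactions and what is currently running
nodetool tpstats           # pending / blocked stages
iostat -xm 5               # disk throughput and utilisation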

Some other information could be useful:

What is your number of cores per machine, and the compaction strategies for
the 'most compacting' tables? What are the write/update patterns, any TTLs or
tombstones? Do you use a high number of vnodes?

Also, what is your repair routine and your value for gc_grace_seconds?
When was your last repair, and do you think your cluster is suffering from
high entropy?

You can lower the stream throughput to make sure nodes can cope with what
repairs are feeding them.

nodetool getstreamthroughput
nodetool setstreamthroughput X

C*heers,

-
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-02-11 16:55 GMT+01:00 Michał Łowicki :

> Hi,
>
> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair using
> Cassandra Reaper but nodes after couple of hours are full of pending
> compaction tasks (regular not the ones about validation)
>
> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>
> How to speed up compaction? Increased compaction throughput and concurrent
> compactors but no change. Seems there is plenty idle resources but can't
> force C* to use it.
>
> Any clue where there might be a bottleneck?
>
>
> --
> BR,
> Michał Łowicki
>
>