Re: disable compaction in bootstrap process

Alain RODRIGUEZ Fri, 23 Mar 2018 05:58:20 -0700

>
> I mean to disable compaction during the bootstrapping process, then enable
> it after the bootstrapping.



That's how I understood it :-). Bootstrap can take a relatively long time
and could affect all the nodes when using vnodes. Disabling compactions for
hours is risky, even more so if the cluster is already somewhat under
pressure. My point is, it might work for you, but it might also bring a
whole lot of other issues, starting with increasing latencies. Plus, all
the compaction work put on hold will have to be performed later, at some
point.

You asked if it is 'reasonable'. I would say no, unless you know for sure
the cluster will handle it properly. Here is what I think would be a
reasonable approach:

Before going for solutions, and especially this solution, it is important
to understand the limitations, to find the bottleneck or the root cause of
the troubles. In a healthy cluster, a node can handle streaming the data,
compacting and answering client requests. Once what is wrong is clear, it
will be way easier to think about possible solutions and pick the best one.
For now, we are only making guesses. Taking quick actions after making
wrong guesses and without fully understanding the consequences is where I
have seen the most damage done to Cassandra clusters. I did that too, and I
don't recommend it :-).

> bootstrap/rebuild/remove node operations are painful for us.


As you say the cluster is having trouble with streaming operations
('bootstrap/rebuild/remove node'), you can try reducing the streaming
throughput. There is no rush in adding the new nodes as long as the other
nodes stay healthy in the meantime. Reducing the speed will reduce the
pressure, mostly on the disks. This change should not do any harm, it just
makes things slower. It is a reasonable thing to try (keeping in mind it is
a workaround and there is probably an underlying issue if you are using
defaults).

'nodetool getstreamthroughput'
'nodetool setstreamthroughput x'

The default for x is 200 (I believe), in megabits per second. If using
vnodes, do not be afraid to lower this quite a lot, as all the nodes are
probably involved in the streaming process.
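
As a minimal sketch of the above, assuming nodetool is on the PATH of the
node you are logged in to: the function below prints the current cap,
applies a new one, and prints it again. The NODETOOL variable and the
function name are hypothetical, only there for this illustration, and 50 is
an arbitrary example value, not a recommendation.

```shell
#!/bin/sh
# Sketch only. NODETOOL is a hypothetical override for this example;
# by default the plain 'nodetool' found on the PATH is used.
NODETOOL="${NODETOOL:-nodetool}"

# Show the current streaming cap, set a new one, then show it again.
# $1 is the new cap in megabits per second.
lower_stream_throughput() {
    new_cap="$1"
    "$NODETOOL" getstreamthroughput
    "$NODETOOL" setstreamthroughput "$new_cap"
    "$NODETOOL" getstreamthroughput
}

# Example call (arbitrary value):
#   lower_stream_throughput 50
```

The command only changes the node it is run against, so it has to be
repeated on each node taking part in the streaming.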

But again, until we know more about the metrics or the context, we are
mostly guessing. With the following information, we could probably help
more efficiently:

- What are the values of 'concurrent_compactors' and
'compaction_throughput_mb_per_sec' in use? (cassandra.yaml)
- Is the cluster CPU-bound or disk-bound? (System tools such as htop,
charts, etc. What is the CPU load, the % of CPU used, is there some
io_wait?)
- Is compaction keeping up? ('nodetool compactionstats -H')
- What compaction strategy are you using? (Table definition, i.e. 'echo
"DESCRIBE TABLE keyspace.table;" | cqlsh | grep -i compaction')
- 'nodetool tpstats' might also give information on pending and dropped
tasks.
- 'nodetool cfstats' could help as well.
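
Most of the items above can be gathered in one go. Here is a rough sketch,
assuming nodetool is on the PATH and that the configuration lives in
/etc/cassandra/cassandra.yaml (an assumption, adjust to your install). The
NODETOOL and CASSANDRA_YAML variables and the function name are
hypothetical, and the throughput key is assumed to be spelled
'compaction_throughput_mb_per_sec' in cassandra.yaml.

```shell
#!/bin/sh
# Sketch only. Both variables are hypothetical overrides for this example.
NODETOOL="${NODETOOL:-nodetool}"
CASSANDRA_YAML="${CASSANDRA_YAML:-/etc/cassandra/cassandra.yaml}"

# Print the compaction settings and the nodetool outputs asked for above.
collect_compaction_diagnostics() {
    echo "== compaction settings in cassandra.yaml =="
    # Also match commented-out lines, which mean the default is in use.
    grep -E '^#? *(concurrent_compactors|compaction_throughput_mb_per_sec):' \
        "$CASSANDRA_YAML"
    echo "== nodetool compactionstats -H =="
    "$NODETOOL" compactionstats -H
    echo "== nodetool tpstats =="
    "$NODETOOL" tpstats
    echo "== nodetool cfstats =="
    "$NODETOOL" cfstats
}

# Example call, saving the output to share on the list:
#   collect_compaction_diagnostics > diagnostics.txt
```

This does not cover the CPU / disk question or the table definition, which
still need the system tools and cqlsh.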

In any case, good luck ;-)

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-03-23 2:59 GMT+00:00 Peng Xiao <2535...@qq.com>:

> Sorry Alain, maybe there is some misunderstanding here. I mean to disable
> compaction during the bootstrapping process, then enable it after the
> bootstrapping.
>
>
> ------------------ Original message ------------------
> *From:* "My own mailbox"<2535...@qq.com>;
> *Sent:* Friday, 23 March 2018, 10:54 AM
> *To:* "user"<user@cassandra.apache.org>;
> *Subject:* Re: disable compaction in bootstrap process
>
> Thanks Alain. We are using C* 2.1.18, 7 cores / 30 GB / 1.5 TB SSD. As the
> cluster is growing too fast, bootstrap/rebuild/remove node operations are
> painful for us.
>
> Thanks,
> Peng Xiao
>
> ------------------ Original message ------------------
> *From:* "Alain RODRIGUEZ"<arodr...@gmail.com>;
> *Sent:* Thursday, 22 March 2018, 7:31 PM
> *To:* "user cassandra.apache.org"<user@cassandra.apache.org>;
> *Subject:* Re: disable compaction in bootstrap process
>
> Hello,
>
>
>> Is it reasonable to disable compaction on all the source nodes?
>
>
> I would say no, as a short answer.
>
> You can, I did it for some operations in the past. Technically there is no
> problem, you can do that. It will most likely improve the response time of
> the queries immediately, as it seems that in your cluster compactions are
> impacting the transactions.
>
> That being said, the impact in the middle/long term will be substantially
> worse. Compactions allow fragments of rows to be merged so the reads can be
> more efficient, hitting the disk just once ideally (or at least a
> reasonably low number of times). Also, when re-enabling compactions you
> might have trouble again, as compaction will have to catch up.
>
> Imho, disabling compaction should not be an action to take unless your
> understanding of compaction is good enough and you are in a very
> specific case that requires it.
> In any case, I would recommend you stay away from using this solution
> as a quick workaround. It could lead to really bad situations, not to
> mention the tombstones that would pile up. Plus, doing this on all the
> nodes at once is really asking for trouble, as the performance of all the
> nodes might degrade at roughly the same pace.
>
> I would suggest troubleshooting why compactions are actually impacting
> the read/write performance.
>
> We can probably help with this here, as I believe all Cassandra users
> have had to deal with this at some point (at least people running with
> 'limited' hardware compared to the needs).
>
> Here are some questions that I believe might be useful for us to help you
> or even for you to troubleshoot.
>
> - Is Cassandra the limiting factor, or are resources reaching a limit?
> - Is the cluster CPU-bound or disk-bound?
> - What are the number of concurrent compactors and the compaction
> throughput in use?
> - What hardware are you relying on?
> - What version are you using?
> - Is compaction keeping up? What compactions strategy are you using?
> - 'nodetool tpstats' might also give information on pending and dropped
> tasks. It might be useful.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2018-03-22 9:09 GMT+00:00 Peng Xiao <2535...@qq.com>:
>
>> Dear All,
>>
>> We noticed that when bootstrapping a new node, the source nodes are also
>> quite busy doing compactions, which impacts the response time severely.
>> Is it reasonable to disable compaction on all the source nodes?
>>
>> Thanks,
>> Peng Xiao
>>
>
>
