> What machine size?
> m1.large

If you are seeing high CPU, move to an m1.xlarge; that's the sweet spot.
> That's normally ok. How many are waiting?
> > I have seen 4 this morning

That's not really abnormal. The pending task count goes up when a file *may* be eligible for compaction, not when there is a compaction task waiting. If you suddenly create a number of new SSTables for a CF the pending count will rise; however, one of the tasks may compact all the sstables waiting for compaction, so the count will suddenly drop as well.

> Just to make sure I understand you correctly, you suggest that I change
> throughput to 12 regardless of whether repair is ongoing or not. I will do it
> using nodetool and change the yaml file in case a restart will occur in the
> future?

Yes. If you are seeing performance degrade during compaction or repair, try reducing the throughput.

I would attribute most of the problems you have described to using m1.large.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/02/2013, at 9:16 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:

> Hi!
> Thanks for the response.
> See my answers and questions below.
> Thanks!
> Tamar
>
> Tamar Fraenkel
> Senior Software Engineer, TOK Media
>
> <tokLogo.png>
>
> ta...@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
> On Sun, Feb 10, 2013 at 10:04 PM, aaron morton <aa...@thelastpickle.com>
> wrote:
>> During repair I see high CPU consumption,
> Repair reads the data and computes a hash; this is a CPU intensive operation.
> Is the CPU overloaded or is it just under load?
> Usually just load, but in the past two weeks I have seen CPU of over 90%!
>> I run Cassandra version 1.0.11, on 3 node setup on EC2 instances.
>
> What machine size?
> m1.large
>
>> there are compactions waiting.
> That's normally ok. How many are waiting?
> > I have seen 4 this morning
>> I thought of adding a call to my repair script, before repair starts to do:
>> nodetool setcompactionthroughput 0
>> and then when repair finishes call
>> nodetool setcompactionthroughput 16
> That will remove throttling on compaction and the validation compaction used
> for the repair. Which may in turn add additional IO load, CPU load and GC
> pressure. You probably do not want to do this.
>
> Try reducing the compaction throughput to say 12 normally and see the effect.
>
> Just to make sure I understand you correctly, you suggest that I change
> throughput to 12 regardless of whether repair is ongoing or not. I will do it
> using nodetool and change the yaml file in case a restart will occur in the
> future?
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 11/02/2013, at 1:01 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>
>> Hi!
>> I run repair weekly, using a scheduled cron job.
>> During repair I see high CPU consumption, and messages in the log file
>> "INFO [ScheduledTasks:1] 2013-02-10 11:48:06,396 GCInspector.java (line 122)
>> GC for ParNew: 208 ms for 1 collections, 1704786200 used; max is 3894411264"
>> From time to time, there are also messages of the form
>> "INFO [ScheduledTasks:1] 2012-12-04 13:34:52,406 MessagingService.java (line
>> 607) 1 READ messages dropped in last 5000ms"
>>
>> Using opscenter, jmx and nodetool compactionstats I can see that during the
>> time the CPU consumption is high, there are compactions waiting.
>>
>> I run Cassandra version 1.0.11, on 3 node setup on EC2 instances.
>> I have the default settings:
>> compaction_throughput_mb_per_sec: 16
>> in_memory_compaction_limit_in_mb: 64
>> multithreaded_compaction: false
>> compaction_preheat_key_cache: true
>>
>> I am thinking on the following solution, and wanted to ask if I am on the
>> right track:
>> I thought of adding a call to my repair script, before repair starts to do:
>> nodetool setcompactionthroughput 0
>> and then when repair finishes call
>> nodetool setcompactionthroughput 16
>>
>> Is this a right solution?
>> Thanks,
>> Tamar
>>
>> Tamar Fraenkel
>> Senior Software Engineer, TOK Media
>>
>> <tokLogo.png>
>>
>> ta...@tok-media.com
>> Tel: +972 2 6409736
>> Mob: +972 54 8356490
>> Fax: +972 2 5612956
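For reference, the change Aaron suggests (12 MB/s via nodetool on the running node, plus the matching yaml edit so it survives a restart) can be sketched roughly as below. This is a sketch, not a definitive procedure: the yaml path varies by install, so a stand-in file is written here just to make the edit visible; on a real node you would point sed at your actual cassandra.yaml.

```shell
#!/bin/sh
# 1) Apply the lower throughput at runtime (takes effect immediately,
#    no restart needed):
#
#     nodetool setcompactionthroughput 12
#
# 2) Persist it across restarts by rewriting the yaml line.
#    YAML below is a stand-in file (assumption), not a real install path.
YAML=cassandra.yaml
printf 'compaction_throughput_mb_per_sec: 16\n' > "$YAML"

sed -i 's/^compaction_throughput_mb_per_sec:.*/compaction_throughput_mb_per_sec: 12/' "$YAML"
cat "$YAML"
```

The runtime call and the yaml edit are both needed because `nodetool setcompactionthroughput` only changes the live JMX value; the node falls back to whatever the yaml says on the next restart.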