Re: MR against snapshot causes High CPU usage on Datanodes

Anil Gupta Wed, 13 May 2015 10:26:35 -0700

How many mappers/reducers are running per node for this job?
Also, how many of the mappers are running as data-local mappers?
Is your load/data equally distributed?


Your disk-to-CPU ratio looks OK.
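
To confirm whether the high CPU is really I/O wait, you could compare two samples of /proc/stat on a datanode while the snapshot MR is running. Below is a minimal sketch, assuming a Linux host with the usual /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal). The function names are made up for illustration and are not part of Hadoop or HBase.

```python
import time


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two /proc/stat samples."""
    b = cpu_times(before)
    a = cpu_times(after)
    if len(a) < 5 or len(b) < 5:
        raise ValueError("expected at least 5 counters on the 'cpu' line")
    # Only the first 8 counters are summed: guest time (fields 9 and 10)
    # is already included in user/nice, so adding it would double count.
    deltas = [y - x for x, y in zip(b[:8], a[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return iowait_percent(before, after)
```

Calling sample_iowait() once with the snapshot job running and once with the regular job should show whether the extra "CPU" is mostly time spent waiting on the disks.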

Sent from my iPhone

> On May 13, 2015, at 10:12 AM, rahul malviya <[email protected]> 
> wrote:
> 
> *The High CPU may be WAIT IOs, which would mean that your CPU is waiting
> for reads from the local disks.*
> 
> Yes, I think that's what is going on, but I am trying to understand why it
> happens only with the snapshot MR. If I run the same job without using a
> snapshot, everything is normal. What is different in the snapshot version
> that could cause such a spike? I am looking through the code for the
> snapshot version to see if I can find something.
> 
> cores / disks == 24 / 12 or 40 / 12.
> 
> We are using 10K SATA drives on our datanodes.
> 
> Rahul
> 
> On Wed, May 13, 2015 at 10:00 AM, Michael Segel <[email protected]>
> wrote:
> 
>> Without knowing your exact configuration…
>> 
>> The High CPU may be WAIT IOs, which would mean that your CPU is waiting
>> for reads from the local disks.
>> 
>> What’s the ratio of cores (physical) to disks?
>> What type of disks are you using?
>> 
>> That’s going to be the most likely culprit.
>>>> On May 13, 2015, at 11:41 AM, rahul malviya <[email protected]>
>>> wrote:
>>> 
>>> Yes.
>>> 
>>>> On Wed, May 13, 2015 at 9:40 AM, Ted Yu <[email protected]> wrote:
>>>> 
>>>> Have you enabled short-circuit reads?
>>>> 
>>>> Cheers
>>>> 
>>>> On Wed, May 13, 2015 at 9:37 AM, rahul malviya <[email protected]>
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I have recently started running MR on HBase snapshots, but when the MR
>>>>> is running there is pretty high CPU usage on the datanodes and I start
>>>>> seeing IO wait messages in the datanode logs. As soon as I kill the MR
>>>>> on the snapshot, everything comes back to normal.
>>>>> 
>>>>> What could be causing this ?
>>>>> 
>>>>> I am running cdh5.2.0 distribution.
>>>>> 
>>>>> Thanks,
>>>>> Rahul
>> 
>>
