So … First, you’re wasting money on 10K drives. But that could be your company’s standard.
Yes, you’re going to see red.

24 / 12: so is that 12 physical cores or 24 physical cores? I suspect those are dual-chipped with 6 physical cores per chip. That’s 12 cores to 12 disks, which is ok. The 40 or 20 cores to 12 drives… that’s going to cause you trouble.

Note: Seeing high levels of CPU may not be a bad thing.

7-8 mappers per node? Not a lot of work for the number of cores…

> On May 13, 2015, at 12:31 PM, rahul malviya <[email protected]> wrote:
>
> *How many mapper/reducers are running per node for this job?*
> I am running 7-8 mappers per node. The spike is seen in the mapper phase, so no
> reducers were running at that point of time.
>
> *Also how many mappers are running as data local mappers?*
> How to determine this ?
>
> *Is your load/data equally distributed?*
> Yes, as we use presplit hash keys in our hbase cluster and the data is pretty
> evenly distributed.
>
> Thanks,
> Rahul
>
> On Wed, May 13, 2015 at 10:25 AM, Anil Gupta <[email protected]> wrote:
>
>> How many mapper/reducers are running per node for this job?
>> Also how many mappers are running as data local mappers?
>> Is your load/data equally distributed?
>>
>> Your disk/cpu ratio looks ok.
>>
>> Sent from my iPhone
>>
>>> On May 13, 2015, at 10:12 AM, rahul malviya <[email protected]> wrote:
>>>
>>> *The High CPU may be WAIT IOs, which would mean that your cpu is waiting
>>> for reads from the local disks.*
>>>
>>> Yes, I think that’s what is going on, but I am trying to understand why it
>>> happens only in the case of snapshot MR. If I run the same job without using
>>> a snapshot, everything is normal. What is the difference in the snapshot
>>> version which can cause such a spike? I am looking through the code for the
>>> snapshot version to see if I can find something.
>>>
>>> cores / disks == 24 / 12 or 40 / 12.
>>>
>>> We are using 10K SATA drives on our datanodes.
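[Editor's note: the cores-to-disks arithmetic being discussed above can be sketched as a quick back-of-the-envelope calculation. The numbers are the ones quoted in this thread (24- or 40-core nodes, 12 spindles, 7-8 mappers per node); the helper function is illustrative, not something from the posters' setup.]

```python
# Rough estimate of concurrent sequential readers each spindle must serve
# when snapshot MR mappers read HFiles directly off the local disks.
# Figures taken from this thread: 12 disks per node, 7-8 mappers per node.

def readers_per_disk(mappers_per_node: int, disks_per_node: int) -> float:
    """Concurrent readers per spindle, assuming reads spread evenly."""
    return mappers_per_node / disks_per_node

for cores, mappers in [(24, 8), (40, 8)]:
    ratio = readers_per_disk(mappers, 12)
    print(f"{cores} cores, {mappers} mappers, 12 disks -> "
          f"{ratio:.2f} readers per disk")
```

With only 8 mappers against 12 spindles the per-disk contention is under 1, which is why the reply above notes that 7-8 mappers is "not a lot of work for the number of cores" and points at the core-to-disk ratio instead.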
>>>
>>> Rahul
>>>
>>> On Wed, May 13, 2015 at 10:00 AM, Michael Segel <[email protected]> wrote:
>>>
>>>> Without knowing your exact configuration…
>>>>
>>>> The high CPU may be WAIT IOs, which would mean that your cpu is waiting
>>>> for reads from the local disks.
>>>>
>>>> What’s the ratio of cores (physical) to disks?
>>>> What type of disks are you using?
>>>>
>>>> That’s going to be the most likely culprit.
>>>>
>>>>> On May 13, 2015, at 11:41 AM, rahul malviya <[email protected]> wrote:
>>>>>
>>>>> Yes.
>>>>>
>>>>>> On Wed, May 13, 2015 at 9:40 AM, Ted Yu <[email protected]> wrote:
>>>>>>
>>>>>> Have you enabled short circuit read?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Wed, May 13, 2015 at 9:37 AM, rahul malviya <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have recently started running MR on hbase snapshots, but when the MR
>>>>>>> is running there is pretty high CPU usage on the datanodes, and I start
>>>>>>> seeing IO wait messages in the datanode logs. As soon as I kill the MR
>>>>>>> on the snapshot, everything comes back to normal.
>>>>>>>
>>>>>>> What could be causing this?
>>>>>>>
>>>>>>> I am running the cdh5.2.0 distribution.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Rahul
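[Editor's note: for reference, the short-circuit local reads Ted asks about are enabled via an hdfs-site.xml fragment along these lines. The socket path is site-specific (the value shown is a common default), so treat this as a sketch rather than the poster's actual configuration.]

```xml
<!-- hdfs-site.xml: enable HDFS short-circuit local reads, so a client
     colocated with the DataNode reads block files directly instead of
     going through the DataNode's data transfer protocol. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <!-- Must be a path the DataNode user can create; _PORT is
       substituted by HDFS. -->
  <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
```

With short-circuit reads on, snapshot-based MR mappers read HFiles straight off the local disks, which shifts the read and decompression work onto the map tasks on the datanodes and is consistent with the IO-wait symptoms described in this thread.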
