Re: MR against snapshot causes High CPU usage on Datanodes

rahul malviya Wed, 13 May 2015 10:12:45 -0700

*The high CPU may be I/O wait, which would mean that your CPU is waiting
for reads from the local disks.*


Yes, I think that's what is going on, but I am trying to understand why it
happens only with the snapshot MR job. If I run the same job without the
snapshot, everything is normal. What is different in the snapshot version
that could cause such a spike? I am looking through the snapshot code to
see if I can find something.

cores / disks == 24 / 12 or 40 / 12.

We are using 10K SATA drives on our datanodes.
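
To check whether the high CPU figure is really I/O wait, the iowait share can be computed from two samples of the aggregate "cpu" line in /proc/stat. The following is only a minimal sketch, assuming a Linux datanode; the function names (parse_cpu_line, iowait_percent, sample_iowait) are hypothetical helpers written for illustration and are not part of Hadoop or HBase.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Return the percentage of CPU time spent in iowait between two samples.

    Counter order in /proc/stat: user, nice, system, idle, iowait, irq,
    softirq, steal, guest, guest_nice. Guest time is already included in
    user time, so only the first eight counters are summed.
    """
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read the 'cpu' line twice, `interval` seconds apart (Linux only)."""
    def read_cpu_line():
        with open(path) as handle:
            return handle.readline()

    before = read_cpu_line()
    time.sleep(interval)
    after = read_cpu_line()
    return iowait_percent(before, after)
```

Calling sample_iowait(5.0) on a datanode while the snapshot job runs, and again while the non-snapshot job runs, would show whether the difference is in iowait rather than in user or system time.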

Rahul

On Wed, May 13, 2015 at 10:00 AM, Michael Segel <[email protected]>
wrote:

> Without knowing your exact configuration…
>
> The high CPU may be I/O wait, which would mean that your CPU is waiting
> for reads from the local disks.
>
> What’s the ratio of cores (physical) to disks?
> What type of disks are you using?
>
> That’s going to be the most likely culprit.
> > On May 13, 2015, at 11:41 AM, rahul malviya <[email protected]> wrote:
> >
> > Yes.
> >
> > On Wed, May 13, 2015 at 9:40 AM, Ted Yu <[email protected]> wrote:
> >
> >> Have you enabled short-circuit reads?
> >>
> >> Cheers
> >>
> >> On Wed, May 13, 2015 at 9:37 AM, rahul malviya <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> I have recently started running MR on HBase snapshots, but while the MR
> >>> job is running there is pretty high CPU usage on the datanodes and I
> >>> start seeing IO wait messages in the datanode logs. As soon as I kill
> >>> the MR job on the snapshot, everything comes back to normal.
> >>>
> >>> What could be causing this?
> >>>
> >>> I am running the CDH 5.2.0 distribution.
> >>>
> >>> Thanks,
> >>> Rahul
> >>>
> >>
>
>
