How many mappers/reducers are running per node for this job? Also, how many mappers are running as data-local mappers? Is your load/data equally distributed?
Your disk-to-CPU ratio looks OK.

Sent from my iPhone

> On May 13, 2015, at 10:12 AM, rahul malviya <[email protected]>
> wrote:
>
> *The High CPU may be WAIT IOs, which would mean that your CPU is waiting
> for reads from the local disks.*
>
> Yes, I think that's what is going on, but I am trying to understand why it
> happens only in the case of snapshot MR; if I run the same job without using
> a snapshot, everything is normal. What is the difference in the snapshot
> version that can cause such a spike? I am looking through the code for the
> snapshot version to see if I can find something.
>
> cores / disks == 24 / 12 or 40 / 12.
>
> We are using 10K SATA drives on our datanodes.
>
> Rahul
>
> On Wed, May 13, 2015 at 10:00 AM, Michael Segel <[email protected]>
> wrote:
>
>> Without knowing your exact configuration…
>>
>> The High CPU may be WAIT IOs, which would mean that your CPU is waiting
>> for reads from the local disks.
>>
>> What’s the ratio of cores (physical) to disks?
>> What type of disks are you using?
>>
>> That’s going to be the most likely culprit.
>>
>>>> On May 13, 2015, at 11:41 AM, rahul malviya <[email protected]>
>>> wrote:
>>>
>>> Yes.
>>>
>>>> On Wed, May 13, 2015 at 9:40 AM, Ted Yu <[email protected]> wrote:
>>>>
>>>> Have you enabled short-circuit reads?
>>>>
>>>> Cheers
>>>>
>>>> On Wed, May 13, 2015 at 9:37 AM, rahul malviya <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have recently started running MR on HBase snapshots, but when the MR
>>>>> is running there is pretty high CPU usage on the datanodes and I start
>>>>> seeing IO wait messages in the datanode logs. As soon as I kill the MR
>>>>> on the snapshot, everything comes back to normal.
>>>>>
>>>>> What could be causing this?
>>>>>
>>>>> I am running the cdh5.2.0 distribution.
>>>>>
>>>>> Thanks,
>>>>> Rahul
