So … First, you’re wasting money on 10K drives. But that could be your company’s standard.
Yes, you’re going to see red.

24 / 12: so is that 12 physical cores or 24 physical cores? I suspect those are dual-chipped with 6 physical cores per chip. That’s 12 cores to 12 disks, which is ok. The 40 or 20 cores to 12 drives… that’s going to cause you trouble.

Note: Seeing high levels of CPU may not be a bad thing.

7-8 mappers per node? Not a lot of work for the number of cores…

> On May 13, 2015, at 12:31 PM, rahul malviya <[email protected]> wrote:
>
> *How many mapper/reducers are running per node for this job?*
> I am running 7-8 mappers per node. The spike is seen in the mapper phase, so no
> reducers were running at that point of time.
>
> *Also how many mappers are running as data local mappers?*
> How to determine this ?
>
> *Is your load/data equally distributed?*
> Yes, as we use presplit hash keys in our hbase cluster and the data is pretty
> evenly distributed.
>
> Thanks,
> Rahul
>
> On Wed, May 13, 2015 at 10:25 AM, Anil Gupta <[email protected]> wrote:
>
>> How many mapper/reducers are running per node for this job?
>> Also how many mappers are running as data local mappers?
>> Is your load/data equally distributed?
>>
>> Your disk/cpu ratio looks ok.
>>
>> Sent from my iPhone
>>
>>> On May 13, 2015, at 10:12 AM, rahul malviya <[email protected]> wrote:
>>>
>>> *The High CPU may be WAIT IOs, which would mean that your cpu is waiting
>>> for reads from the local disks.*
>>>
>>> Yes, I think that’s what is going on, but I am trying to understand why it
>>> happens only in the case of snapshot MR. If I run the same job without using
>>> a snapshot, everything is normal. What is the difference in the snapshot
>>> version which can cause such a spike? I am looking through the code for the
>>> snapshot version to see if I can find something.
>>>
>>> cores / disks == 24 / 12 or 40 / 12.
>>>
>>> We are using 10K SATA drives on our datanodes.
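[Editor's note: the cores-to-disks arithmetic being discussed above can be sketched as a quick back-of-the-envelope calculation. The numbers are the ones quoted in this thread (24- or 40-core nodes, 12 spindles, 7-8 mappers per node); the helper function is illustrative, not something from the posters' setup.]

```python
# Rough estimate of concurrent sequential readers each spindle must serve
# when snapshot MR mappers read HFiles directly off the local disks.
# Figures taken from this thread: 12 disks per node, 7-8 mappers per node.

def readers_per_disk(mappers_per_node: int, disks_per_node: int) -> float:
    """Concurrent readers per spindle, assuming reads spread evenly."""
    return mappers_per_node / disks_per_node

for cores, mappers in [(24, 8), (40, 8)]:
    ratio = readers_per_disk(mappers, 12)
    print(f"{cores} cores, {mappers} mappers, 12 disks -> "
          f"{ratio:.2f} readers per disk")
```

With only 8 mappers against 12 spindles the per-disk contention is under 1, which is why the reply above notes that 7-8 mappers is "not a lot of work for the number of cores" and points at the core-to-disk ratio instead.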
>>>
>>> Rahul
>>>
>>> On Wed, May 13, 2015 at 10:00 AM, Michael Segel <[email protected]> wrote:
>>>
>>>> Without knowing your exact configuration…
>>>>
>>>> The high CPU may be WAIT IOs, which would mean that your cpu is waiting
>>>> for reads from the local disks.
>>>>
>>>> What’s the ratio of cores (physical) to disks?
>>>> What type of disks are you using?
>>>>
>>>> That’s going to be the most likely culprit.
>>>>
>>>>> On May 13, 2015, at 11:41 AM, rahul malviya <[email protected]> wrote:
>>>>>
>>>>> Yes.
>>>>>
>>>>>> On Wed, May 13, 2015 at 9:40 AM, Ted Yu <[email protected]> wrote:
>>>>>>
>>>>>> Have you enabled short circuit read?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Wed, May 13, 2015 at 9:37 AM, rahul malviya <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have recently started running MR on hbase snapshots, but when the MR
>>>>>>> is running there is pretty high CPU usage on the datanodes, and I start
>>>>>>> seeing IO wait messages in the datanode logs. As soon as I kill the MR
>>>>>>> on the snapshot, everything comes back to normal.
>>>>>>>
>>>>>>> What could be causing this?
>>>>>>>
>>>>>>> I am running the cdh5.2.0 distribution.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Rahul
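[Editor's note: for reference, the short-circuit local reads Ted asks about are enabled via an hdfs-site.xml fragment along these lines. The socket path is site-specific (the value shown is a common default), so treat this as a sketch rather than the poster's actual configuration.]

```xml
<!-- hdfs-site.xml: enable HDFS short-circuit local reads, so a client
     colocated with the DataNode reads block files directly instead of
     going through the DataNode's data transfer protocol. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <!-- Must be a path the DataNode user can create; _PORT is
       substituted by HDFS. -->
  <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
```

With short-circuit reads on, snapshot-based MR mappers read HFiles straight off the local disks, which shifts the read and decompression work onto the map tasks on the datanodes and is consistent with the IO-wait symptoms described in this thread.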
