This is an interesting point. I am inserting into HBase using M/R, and I see:
1) the map phase consumes 100% CPU
2) the reduce phase (HBase insertions) consumes ~15% CPU
As I understand it, there is a lot of room for improvement here. Can someone recommend a good tool for monitoring map/reduce jobs?

Thanks,
Oleg.

On Fri, Nov 12, 2010 at 8:56 PM, Jean-Daniel Cryans <[email protected]> wrote:

> The most important:
>
> - no swap, as in zero, none, nada
> - near 0 io wait
>
> Then it's about making sure that you can drive your user CPU to near
> 100%. If you can't, then you have a bottleneck somewhere and there's
> no magical way of finding it out. It usually starts by understanding
> what you're doing (is your job mostly just mapping or is it inserting
> aggressively?) and then figuring out via debugging or log reading what
> seems to be the holdup.
>
> J-D
>
> On Thu, Nov 11, 2010 at 9:05 PM, Hari Sreekumar
> <[email protected]> wrote:
> > Hi,
> >
> > I am quite new to hadoop and hbase, and I am having a hard time here
> > figuring out some issues with my cluster, and I am pretty sure many of you
> > have gone through many of the problems I am facing right now. I need some
> > help in figuring out what exactly are the bottlenecks in my system. I have
> > set up regular ganglia on my cluster (simple ganglia, not able to track
> > hadoop/hbase metrics yet.. that's another issue). What are the stats that
> > matter the most? How to go about making inferences from these reports? I
> > know that swapping is a very important parameter to monitor. What are the
> > other important parameters, what is their significance, and what should
> > their values ideally be, approximately? Mainly memory cached, cpu loads,
> > memory buffered, Total memory, Network usage etc. and also any other
> > parameter that you found to be useful in these cases. I think this would be
> > very helpful for many people in figuring out many issues. Thanks a ton,
> >
> > hari
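
For anyone who wants to turn J-D's checklist (no swap, near 0 io wait, user CPU near 100%) into a quick check on a node, here is a minimal sketch in Python. It assumes Linux and works from two samples of the aggregate `cpu` line in `/proc/stat`. The function names `cpu_shares` and `diagnose` and the thresholds (5% io wait, 90% user CPU) are made up for illustration; nothing in this thread prescribes them, so tune them for your own cluster.

```python
# Sketch only: helper names and thresholds are hypothetical, not from the thread.

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    # zip() stops at the shorter input, so older kernels with fewer fields still parse.
    return dict(zip(CPU_FIELDS, (int(v) for v in fields[1:1 + len(CPU_FIELDS)])))


def cpu_shares(before, after):
    """Return user and iowait time as percentages of all ticks between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: b[name] - a[name] for name in a if name in b}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times, oldest first")
    return {
        "user": 100.0 * delta["user"] / total,
        "iowait": 100.0 * delta.get("iowait", 0) / total,
    }


def diagnose(user_pct, iowait_pct, swap_used_kb, max_iowait_pct=5.0, min_user_pct=90.0):
    """Apply the three checks from the thread; thresholds are assumed defaults."""
    findings = []
    if swap_used_kb > 0:
        findings.append("swap in use")
    if iowait_pct > max_iowait_pct:
        findings.append("high io wait")
    if user_pct < min_user_pct:
        findings.append("user CPU below target")
    return findings


if __name__ == "__main__":
    # Two made-up samples of the "cpu" line, taken some seconds apart.
    first = "cpu 100 0 50 800 50 0 0 0"
    second = "cpu 250 0 100 1400 250 0 0 0"
    shares = cpu_shares(first, second)
    print(shares, diagnose(shares["user"], shares["iowait"], swap_used_kb=0))
```

On a real node you would read the first line of `/proc/stat` twice, a few seconds apart, and take swap usage as `SwapTotal` minus `SwapFree` from `/proc/meminfo`. A reduce phase sitting at ~15% user CPU, as in Oleg's case, would be flagged as below target, which only tells you that a bottleneck exists, not where it is.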
