Re: Stats to look out for while running mapreduce jobs with HBase

Oleg Ruchovets Fri, 12 Nov 2010 12:51:29 -0800

That is an interesting point.
I am loading data into HBase using M/R:

1) the map phase consumes 100% CPU
2) the reducer phase (HBase insertions) consumes ~15%


As I understand it, there is a lot of room for improvement here.

Can someone recommend a good tool for monitoring
map/reduce jobs?

Thanks,
Oleg.

On Fri, Nov 12, 2010 at 8:56 PM, Jean-Daniel Cryans <[email protected]> wrote:

> The most important:
>
>  - no swap, as in zero, none, nada
>  - near 0 io wait
>
> Then it's about making sure that you can drive your user CPU to near
> 100%. If you can't, then you have a bottleneck somewhere and there's
> no magical way of finding it out. It usually starts by understanding
> what you're doing (is your job mostly just mapping, or is it inserting
> aggressively?) and then figuring out via debugging or log reading what
> seems to be the holdup.
>
> J-D
>
> On Thu, Nov 11, 2010 at 9:05 PM, Hari Sreekumar
> <[email protected]> wrote:
> > Hi,
> >
> >       I am quite new to hadoop and hbase, and I am having a hard time here
> > figuring out some issues with my cluster, and I am pretty sure many of you
> > have gone through many of the problems I am facing right now. I need some
> > help in figuring out what exactly are the bottlenecks in my system. I have
> > set up regular ganglia on my cluster (simple ganglia, not able to track
> > hadoop/hbase metrics yet.. that's another issue). What are the stats that
> > matter the most? How to go about making inferences from these reports? I
> > know that swapping is a very important parameter to monitor. What are the
> > other important parameters, what is their significance, and what should
> > their values ideally be, approximately? Mainly memory cached, cpu loads,
> > memory buffered, Total memory, Network usage etc. and also any other
> > parameter that you found to be useful in these cases. I think this would be
> > very helpful for many people in figuring out many issues. Thanks a ton,
> >
> > hari
> >
>
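
For anyone who wants to check the three signals J-D lists (no swap, near-zero
io wait, user CPU near 100%) without a full monitoring stack, here is a rough
Python sketch. It assumes a Linux node and works on two samples of the
aggregate "cpu" line from /proc/stat taken a few seconds apart while the job
runs. The function names and the thresholds (2% io wait, 90% user CPU) are my
own illustrative choices, not values from this thread, and the swap count is
assumed to be the change in pswpin + pswpout from /proc/vmstat over the same
interval.

```python
# Rough sketch only: names and thresholds here are illustrative assumptions.

CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line (the first line of /proc/stat on Linux)."""
    with open(path) as f:
        return f.readline()


def parse_cpu_line(line):
    """Turn a 'cpu ...' line into a dict of cumulative jiffies per field."""
    values = [int(v) for v in line.split()[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two samples."""
    deltas = {k: after[k] - before[k] for k in before if k in after}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in deltas}
    return {k: 100.0 * v / total for k, v in deltas.items()}


def check_node(before_line, after_line, swap_pages_delta,
               max_iowait=2.0, min_user=90.0):
    """List which of the three signals look bad for this sampling interval.

    swap_pages_delta: assumed to be the change in pswpin + pswpout
    (from /proc/vmstat) over the same interval.
    """
    pct = cpu_percentages(parse_cpu_line(before_line), parse_cpu_line(after_line))
    problems = []
    if swap_pages_delta > 0:
        problems.append("swapping")
    if pct.get("iowait", 0.0) > max_iowait:
        problems.append("io wait")
    if pct.get("user", 0.0) + pct.get("nice", 0.0) < min_user:
        problems.append("cpu not saturated")
    return problems
```

To use it, call read_cpu_line() once, sleep for a few seconds while the job is
running, call it again, and pass both lines to check_node(). A reduce phase
sitting at ~15% CPU, like the one described above, would be reported as
"cpu not saturated", which says that a bottleneck exists but not where it is.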
