Meanwhile, I was able to roughly estimate which table is getting traffic by executing the following commands:
1) Store ngrep output in a file (for a few seconds):

   ngrep -W byline port 60020 > temp.out

2) Find out all the tables that the region server has from the HBase user
interface. For each table, execute the following command:

   grep 'TableName,' temp.out | wc -l

This was enough for us, since just knowing which table was getting hit is
very useful information. I am guessing there should be a way to grep the
region name too.

Regards,
Vaibhav,
GumGum

On Wed, Nov 17, 2010 at 9:22 AM, Jean-Daniel Cryans <[email protected]> wrote:

> AFAIK most monitoring systems don't like dynamically-named metrics,
> for example in ganglia you would end up with an ever growing number of
> metrics for req/regions (one for each region that the region server
> ever had). At the very least it should be included in the region
> server report so that the master can take action and plan accordingly,
> the new master has better facilities for that.
>
> J-D
>
> On Wed, Nov 17, 2010 at 8:15 AM, Lars George <[email protected]> wrote:
> > JD,
> >
> > Should we create a metric for it so that it dynamically counts per
> > region its usage? That can then be exposed via Ganglia context or JMX.
> > Just wondering.
> >
> > Lars
> >
> > On Wed, Nov 17, 2010 at 5:04 PM, Vaibhav Puranik <[email protected]> wrote:
> >> hi,
> >>
> >> Thanks for the suggestions JD & Michael.
> >> The region servers serving ROOT & META regions are fine.
> >>
> >> I will try analysing tcpdump output.
> >>
> >> Regards,
> >> Vaibhav
> >> GumGum
> >>
> >> On Tue, Nov 16, 2010 at 7:15 AM, Michael Segel <[email protected]> wrote:
> >>
> >>> Beyond this... which region is serving your ROOT and meta data?
> >>>
> >>> That node will probably get a higher load.
> >>> Also, how many disks do you have and how many nodes?
> >>> You could see higher CPU loads if you're I/O bound.
> >>>
> >>> > Date: Mon, 15 Nov 2010 18:24:31 -0800
> >>> > Subject: Re: Correlating traffic with regions
> >>> > From: [email protected]
> >>> > To: [email protected]
> >>> >
> >>> > Yeah this is one area where HBase could do a much better job...
> >>> > because there's not really a way to do it within the database. One
> >>> > thing you can do is to tcpdump a few seconds of traffic on that node
> >>> > and decipher which tables (shown in the region name) are being used.
> >>> >
> >>> > J-D
> >>> >
> >>> > On Mon, Nov 15, 2010 at 5:17 PM, Vaibhav Puranik <[email protected]> wrote:
> >>> > > Hi all,
> >>> > >
> >>> > > We are running 0.20.6 in production.
> >>> > >
> >>> > > On one of our nodes, we are seeing CPU (all 8 CPUs) hovering near
> >>> > > 60%. But the node has many tables and many regions on it.
> >>> > >
> >>> > > Is there an easy way to find out which of these regions or tables
> >>> > > are getting most of the traffic?
> >>> > >
> >>> > > Regards,
> >>> > > Vaibhav Puranik
> >>> > > GumGum

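The two steps at the top of this message (capture with ngrep, then grep once per table) could be wrapped in a small shell function so that all tables are counted and ranked in one pass. This is only a rough sketch: the function name count_table_hits and the table names in the usage line are made up for illustration, and it assumes the capture file was already produced by the ngrep command above. Like the original grep, it counts matching lines, so it gives a rough ranking and not an exact request count.

```shell
#!/bin/sh
# Hypothetical helper (not part of HBase or ngrep).
# Usage: count_table_hits CAPTURE_FILE TABLE [TABLE...]
# Prints "<matching lines> <table>" for each table, busiest first.
count_table_hits() {
    capture=$1
    shift
    for table in "$@"; do
        # -F matches the table name literally, -c counts matching lines
        # (equivalent to "grep ... | wc -l"). The trailing comma follows
        # the original command: the table name is followed by a comma in
        # the region name. grep exits non-zero on zero matches, so the
        # "|| true" keeps the function usable under "set -e".
        count=$(grep -F -c -- "${table}," "$capture" || true)
        printf '%s %s\n' "$count" "$table"
    done | sort -rn
}
```

For example, after capturing a few seconds of traffic into temp.out, something like `count_table_hits temp.out users events` (placeholder table names) would print one line per table, with the most frequently seen table first.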