Ted,

It's hard to tell without having full logs, but it seems that this is caught
in a reassignment loop.  I see things getting unassigned and reassigned over
and over.
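
One rough way to check for such a loop from a regionserver log is to count close messages per region. This is only a sketch. It assumes each entry sits on a single line in the actual log file, in the same format as the MSG_REGION_CLOSE lines quoted further down this thread. The function names and the threshold are made up for illustration.

```python
import re
from collections import Counter

# Matches entries shaped like the ones quoted in this thread, e.g.
#   ... regionserver.HRegionServer(493): MSG_REGION_CLOSE: domaincrawltable,,1268098564908: Overloaded
# Assumption: one log entry per line in the real log file.
CLOSE_RE = re.compile(r"MSG_REGION_CLOSE: (?P<region>.+?)(?:: Overloaded)?\s*$")


def count_region_closes(lines):
    """Return a Counter mapping region name -> number of close messages seen."""
    counts = Counter()
    for line in lines:
        match = CLOSE_RE.search(line)
        if match:
            counts[match.group("region")] += 1
    return counts


def repeatedly_closed(counts, threshold=3):
    """Return regions closed at least `threshold` times (threshold is arbitrary)."""
    return {region: n for region, n in counts.items() if n >= threshold}
```

Feeding it the open log file (for example `count_region_closes(open("hbase-hbaseadmin-regionserver.log"))`) and looking at which regions come back from `repeatedly_closed` should show whether the same regions keep getting closed.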

Are there any weird network or DNS issues on this cluster?  The logs show avg
load at 7, not 13, so it seems there could be an extra regionserver or
something else wonky.
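
To make the arithmetic concrete, here is a small sketch. It assumes avg load is simply total regions divided by the number of regionservers the master knows about, and that the 3 x 13 region count you reported is the full total. The helper name is made up.

```python
def implied_server_count(total_regions, avg_load):
    """Number of regionservers implied by a reported average load.

    Assumes avg_load == total_regions / server_count.
    """
    if avg_load <= 0:
        raise ValueError("avg_load must be positive")
    return total_regions / avg_load


# 3 regionservers with 13 regions each, as reported in this thread.
total_regions = 3 * 13

# An average load of 13 corresponds to the 3 expected servers; an average
# load of 7 would correspond to noticeably more than 3.
print(implied_server_count(total_regions, 13))
print(implied_server_count(total_regions, 7))
```

If the second number comes out well above 3, that would be consistent with the master counting more regionservers than you actually run.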

Have you tried restarting your cluster?

> -----Original Message-----
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: Saturday, March 13, 2010 1:59 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: slow response in hbase shell
> 
> Here is region server log:
> http://pastebin.com/6d1QgNXR
> 
> There was only one exception around the time I performed get:
> 2010-03-13 03:38:30,325 INFO  [regionserver/10.10.31.137:60020.worker] regionserver.HRegion(342): region domaincrawltable,com.intensedebate.s:http\x2Fsignup,1268098564908/602293480 available; sequence id is 225656738
> 2010-03-13 03:38:30,476 ERROR [IPC Server handler 23 on 60020] regionserver.HRegionServer(844):
> org.apache.hadoop.hbase.NotServingRegionException: ruletable,com.hoovers.www/companyindex/New_Mexico/Corrales/Telecommunications-1.html,1268084284374
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 
> Here is master server log which doesn't have exception at all:
> http://pastebin.com/80949RK2
> 
> On Sat, Mar 13, 2010 at 10:10 AM, Jonathan Gray <jl...@streamy.com>
> wrote:
> 
> > Ted,
> >
> > Your attachments didn't come through.  Try putting them up on the web or
> > pastebin somewhere.
> >
> > What's happening in the RegionServer logs between the time that the get
> > works and the get doesn't work?
> >
> > Also, I recommend upgrading to 0.20.3; there are critical fixes.
> >
> > From: Ted Yu [mailto:yuzhih...@gmail.com]
> > Sent: Saturday, March 13, 2010 3:48 AM
> > To: hbase-user@hadoop.apache.org
> > Subject: Re: slow response in hbase shell
> >
> >
> >
> > We use hbase 0.20.1
> > There are 3 region servers. 13 regions on each server.
> >
> > I don't see any exception in master log.
> >
> > I was able to run 10 successful get commands before hitting the following:
> > I am attaching (partial) master log and region server log from 10.10.31.137
> >
> > hbase(main):014:0> get 'ruletable', 'com.about.acne'
> > COLUMN                       CELL
> >  lpm_1.0:category            timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> > 1 row(s) in 0.0120 seconds
> > hbase(main):015:0> get 'ruletable', 'com.about.acne'
> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 10.10.31.137:60020 for region ruletable,,1268083966723, row 'com.about.acne', but failed after 5 attempts.
> > Exceptions:
> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: ruletable,,1268083966723
> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
> >        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >        at java.lang.reflect.Method.invoke(Method.java:597)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> >
> >
> >
> > On Fri, Mar 12, 2010 at 9:08 PM, Jonathan Gray <jl...@streamy.com> wrote:
> >
> > Seems like something weird is going on with your regionservers and
> > balancing.
> >
> > Can you post big snippets from the regionserver and master logs?  Have you
> > checked to see what's going on?  Is there repetitive balancing going on that
> > never seems to reach steady-state?
> >
> > How many regions and how many nodes on which version of HBase?
> >
> >
> > > -----Original Message-----
> > > From: Ted Yu [mailto:yuzhih...@gmail.com]
> > > Sent: Friday, March 12, 2010 8:24 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: slow response in hbase shell
> > >
> > > Hi,
> > >
> > > > We sometimes saw delays of over 5 seconds running get in the hbase 0.20.1 shell:
> > > >
> > > > hbase(main):002:0> get 'ruletable', 'ca.tsn.www'
> > > > 0 row(s) in 10.1330 seconds
> > > >
> > > > From our 3 region servers there are a lot of such messages:
> > > >
> > > > 2010-03-12 00:00:00,996 INFO  [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: crawltable,com.pandora.www:http\x2Finclude\x2FlyricsAdEmbed.html\x3Fgenre\x3Delectronica\x26artist\x3DR273847\x26webname\x3D\x26sz\x3D2000x8\x26ord\x3D125823029226371645\x26tile\x3D3\x26_id\x3Dbottom_leaderboard_container,1266944566406: Overloaded
> > > > 2010-03-12 00:00:00,997 INFO  [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: domaincrawltable,,1268098564908: Overloaded
> > > >
> > > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
> > > >   40428  343638 11523230
> > > >
> > > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
> > > >   40430  343655 11307703
> > > >
> > > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
> > > >   40466  343961 11340379
> > > >
> > > > Was the slow response due to the load balancing?
> > > >
> > >
> > > The strange thing was that after several quick responses I would see:
> > >
> > > hbase(main):004:0> get 'ruletable', 'com.about.acne'
> > > COLUMN                       CELL
> > >  lpm_1.0:category            timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> > > 1 row(s) in 0.0040 seconds
> > > hbase(main):005:0> get 'ruletable', 'com.about.acne'
> > > COLUMN                       CELL
> > >  lpm_1.0:category            timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> > > 1 row(s) in 0.0040 seconds
> > > hbase(main):006:0> get 'ruletable', 'com.about.acne'
> > > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 10.10.31.136:60020 for region ruletable,,1268083966723, row 'com.about.acne', but failed after 5 attempts.
> > > Exceptions:
> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: ruletable,,1268083966723
> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
> > >         at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
> > >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> > >
> > > But 006:60010/master.jsp refreshes quickly and shows all three
> > > regionservers.
> > > Don't know why hbase shell encountered NSRE.
> > >
> > > >
> > > > Thanks
> >
> >
> >
> >
