Here is the region server log: http://pastebin.com/6d1QgNXR
There was only one exception around the time I performed the get:

2010-03-13 03:38:30,325 INFO [regionserver/10.10.31.137:60020.worker] regionserver.HRegion(342): region domaincrawltable,com.intensedebate.s:http\x2Fsignup,1268098564908/602293480 available; sequence id is 225656738
2010-03-13 03:38:30,476 ERROR [IPC Server handler 23 on 60020] regionserver.HRegionServer(844): org.apache.hadoop.hbase.NotServingRegionException: ruletable,com.hoovers.www/companyindex/New_Mexico/Corrales/Telecommunications-1.html,1268084284374
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

Here is the master server log, which doesn't have any exception at all: http://pastebin.com/80949RK2

On Sat, Mar 13, 2010 at 10:10 AM, Jonathan Gray <jl...@streamy.com> wrote:

> Ted,
>
> Your attachments didn't come through. Try putting them up on the web or
> pastebin somewhere.
>
> What's happening in the RegionServer logs between the time that the get
> works and the get doesn't work?
>
> Also, I recommend upgrading to 0.20.3, there are critical fixes.
>
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: Saturday, March 13, 2010 3:48 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: slow response in hbase shell
>
> We use hbase 0.20.1
> There are 3 region servers. 13 regions on each server.
>
> I don't see any exception in master log.
>
> I was able to run 10 successful get commands before hitting the following:
> I am attaching (partial) master log and region server log from 10.10.31.137
>
> hbase(main):014:0> get 'ruletable', 'com.about.acne'
> COLUMN                       CELL
>  lpm_1.0:category            timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> 1 row(s) in 0.0120 seconds
> hbase(main):015:0> get 'ruletable', 'com.about.acne'
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Trying to contact region server 10.10.31.137:60020 for region
> ruletable,,1268083966723, row 'com.about.acne', but failed after 5
> attempts.
> Exceptions:
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: ruletable,,1268083966723
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
> On Fri, Mar 12, 2010 at 9:08 PM, Jonathan Gray <jl...@streamy.com> wrote:
>
> Seems like something weird is going on with your regionservers and
> balancing.
>
> Can you post big snippets from the regionserver and master logs? Have you
> checked to see what's going on? Is there repetitive balancing going on
> that never seems to reach steady-state?
>
> How many regions and how many nodes on which version of HBase?
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:yuzhih...@gmail.com]
> > Sent: Friday, March 12, 2010 8:24 PM
> > To: hbase-user@hadoop.apache.org
> > Subject: slow response in hbase shell
> >
> > Hi,
> >
> > > We sometimes saw over 5 second delay running get in hbase 0.20.1 shell:
> > >
> > > hbase(main):002:0> get 'ruletable', 'ca.tsn.www'
> > > 0 row(s) in 10.1330 seconds
> > >
> > > From our 3 region servers there are a lot of such messages:
> > >
> > > 2010-03-12 00:00:00,996 INFO [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: crawltable,com.pandora.www:http\x2Finclude\x2FlyricsAdEmbed.html\x3Fgenre\x3Delectronica\x26artist\x3DR273847\x26webname\x3D\x26sz\x3D2000x8\x26ord\x3D125823029226371645\x26tile\x3D3\x26_id\x3Dbottom_leaderboard_container,1266944566406: Overloaded
> > > 2010-03-12 00:00:00,997 INFO [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: domaincrawltable,,1268098564908: Overloaded
> > >
> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
> > > 40428 343638 11523230
> > >
> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
> > > 40430 343655 11307703
> > >
> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
> > > 40466 343961 11340379
> > >
> > > Was the slow response due to the load balancing ?
> >
> > The strange thing was that after several quick responses I would see:
> >
> > hbase(main):004:0> get 'ruletable', 'com.about.acne'
> > COLUMN                       CELL
> >  lpm_1.0:category            timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> > 1 row(s) in 0.0040 seconds
> > hbase(main):005:0> get 'ruletable', 'com.about.acne'
> > COLUMN                       CELL
> >  lpm_1.0:category            timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> > 1 row(s) in 0.0040 seconds
> > hbase(main):006:0> get 'ruletable', 'com.about.acne'
> > NativeException:
> > org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > Trying to contact region server 10.10.31.136:60020 for region
> > ruletable,,1268083966723, row 'com.about.acne', but failed after 5
> > attempts.
> > Exceptions:
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException:
> > ruletable,,1268083966723
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
> >         at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> >
> > But 006:60010/master.jsp refreshes quickly and shows all three
> > regionservers.
> > Don't know why hbase shell encountered NSRE.
> >
> > Thanks
> >
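
On the question of whether the balancing ever reaches steady state: the repeated `grep Overloaded ... | wc` counts only give a running total. A rough sketch like the one below could bucket the "Overloaded" close messages by minute and by table instead. It assumes each log event sits on one line in the actual log file (not wrapped as in this mail) and has the same shape as the two MSG_REGION_CLOSE lines quoted in this thread. The function name `overloaded_closes` and the regular expression are my own, not anything HBase ships.

```python
import re
from collections import Counter

# Assumed line shape, based only on the two sample lines quoted in this thread:
#   <date> <time>,<ms> INFO [...] ...: MSG_REGION_CLOSE: <region name>: Overloaded
CLOSE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} .*"
    r"MSG_REGION_CLOSE: (?P<region>.+): Overloaded\s*$"
)


def overloaded_closes(lines):
    """Count 'Overloaded' region closes per minute and per table."""
    per_minute = Counter()
    per_table = Counter()
    for line in lines:
        match = CLOSE_RE.match(line)
        if match is None:
            continue  # not an Overloaded close message
        per_minute[match.group("minute")] += 1
        # Region names look like "<table>,<start key>,<region id>";
        # the table is taken to be everything before the first comma.
        per_table[match.group("region").split(",", 1)[0]] += 1
    return per_minute, per_table


# The two sample lines from this thread, unwrapped onto one line each.
sample = [
    r"2010-03-12 00:00:00,996 INFO [regionserver/10.10.31.135:60020] "
    r"regionserver.HRegionServer(493): MSG_REGION_CLOSE: "
    r"crawltable,com.pandora.www:http\x2Finclude\x2FlyricsAdEmbed.html,1266944566406: Overloaded",
    r"2010-03-12 00:00:00,997 INFO [regionserver/10.10.31.135:60020] "
    r"regionserver.HRegionServer(493): MSG_REGION_CLOSE: "
    r"domaincrawltable,,1268098564908: Overloaded",
]

per_minute, per_table = overloaded_closes(sample)
for minute, count in sorted(per_minute.items()):
    print(minute, count)
for table, count in per_table.most_common():
    print(table, count)
```

For a real log, passing an open file works the same way, e.g. `overloaded_closes(open("hbase-hbaseadmin-regionserver.log"))`. If the per-minute counts never drop to zero, that would suggest the regions keep getting reassigned, which might line up with the NotServingRegionException seen from the shell.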