Thanks for your help, Stack. I reran the test and found something a little
different.
com.yahoo.ycsb.DBException:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server Some server, retryOnlyOne=true, index=0, islastrow=false,
tries=9, numtries=10, i=1416, listsize=4696,
region=usertable,user1569171403,1292206949699 for region
usertable,user1569171403,1292206949699, row 'user1569963955', but failed
after 10 attempts.
Exceptions:
at com.yahoo.ycsb.db.HBaseClient.cleanup(Unknown Source)
at com.yahoo.ycsb.DBWrapper.cleanup(Unknown Source)
at com.yahoo.ycsb.ClientThread.run(Unknown Source)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying
to contact region server Some server, retryOnlyOne=true, index=0,
islastrow=false, tries=9, numtries=10, i=1416, listsize=4696,
region=usertable,user1569171403,1292206949699 for region
usertable,user1569171403,1292206949699, row 'user1569963955', but failed
after 10 attempts.
Exceptions:
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1157)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1238)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
... 3 more
I tried a Get in the hbase shell for row user1569963955, and it says:
org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException:
usertable,user1569171403,1292206949699
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2269)
at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1732)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
I grepped for the region id 1292206949699 on the master node. Here is the log:
http://pastebin.com/frRpUg92
And the corresponding entries in the regionserver log:
http://pastebin.com/eUx92J5r
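
In case it helps anyone following the greps: as far as I understand, a region name in this version has the form table,startKey,regionId, and the trailing id is (I believe) a creation timestamp in milliseconds. Below is a small sketch that splits the two region names from this thread that way. The class RegionName and its methods are my own hypothetical helpers, not part of HBase, and the timestamp reading is an assumption.

```java
import java.time.Instant;

/** Hypothetical helper (not an HBase class): splits "table,startKey,regionId". */
public class RegionName {
    public final String table;
    public final String startKey;
    public final long regionId;

    private RegionName(String table, String startKey, long regionId) {
        this.table = table;
        this.startKey = startKey;
        this.regionId = regionId;
    }

    /** The table name ends at the first comma; the region id starts after the last one. */
    public static RegionName parse(String name) {
        int first = name.indexOf(',');
        int last = name.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("not a region name: " + name);
        }
        return new RegionName(
                name.substring(0, first),
                name.substring(first + 1, last),   // empty for the first region of a table
                Long.parseLong(name.substring(last + 1)));
    }

    /** Assumption: the region id is a creation time in epoch milliseconds. */
    public Instant createdAt() {
        return Instant.ofEpochMilli(regionId);
    }

    public static void main(String[] args) {
        String[] names = {
            "usertable,user1569171403,1292206949699",
            "usertable,,1291950247501",
        };
        for (String n : names) {
            RegionName r = parse(n);
            System.out.println(r.table + " startKey='" + r.startKey + "' id=" + r.regionId
                    + " created(UTC)=" + r.createdAt());
        }
    }
}
```

The printed time is in UTC, so it has to be shifted to the servers' local time zone before comparing it with the timestamps in the master and regionserver logs.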
I see that region "usertable,user1569171403,1292206949699" is a daughter
region, and I found that there was a very long compaction on this region. I
do not know whether this was the root cause.
Besides, I checked the GC log and found nothing abnormal.
10.1.0.18:
/home/tao/hbase/logs/hbase-tao-regionserver-sr118.log:3549:2010-12-13
10:22:25,846 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction
completed on region usertable,user1569171403,1292206949699 in 17sec
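
To check my understanding of the retry behaviour Stack describes in the quoted reply below, here is a minimal, self-contained sketch of it: the call is retried with a pause while the region is not being served, and it fails once the attempts run out. This is only a simulation under my own assumptions, not the HBase client code. RetrySketch, RegionNotServing, RetriesExhausted and RegionCall are hypothetical stand-ins. In the real client I believe the corresponding knobs are hbase.client.retries.number and hbase.client.pause.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulation only: hypothetical stand-ins, not the HBase client classes. */
public class RetrySketch {

    /** Stand-in for NotServingRegionException. */
    public static class RegionNotServing extends Exception {
        public RegionNotServing(String msg) { super(msg); }
    }

    /** Stand-in for RetriesExhaustedException; keeps the per-attempt failures. */
    public static class RetriesExhausted extends RuntimeException {
        public final List<Exception> failures;
        public RetriesExhausted(int attempts, List<Exception> failures) {
            super("failed after " + attempts + " attempts");
            this.failures = failures;
        }
    }

    /** One attempt against whichever server is believed to hold the region. */
    public interface RegionCall<T> {
        T call(int attempt) throws RegionNotServing;
    }

    public static <T> T callWithRetries(RegionCall<T> op, int numRetries, long pauseMs) {
        List<Exception> failures = new ArrayList<>();
        for (int tries = 0; tries < numRetries; tries++) {
            try {
                return op.call(tries);
            } catch (RegionNotServing e) {
                // A real client would drop its cached location here and go back
                // to .META. to find where the region is now.
                failures.add(e);
                try {
                    Thread.sleep(pauseMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
            }
        }
        throw new RetriesExhausted(numRetries, failures);
    }

    public static void main(String[] args) {
        // The region "comes back online" on the fourth attempt (attempt index 3).
        String result = callWithRetries(attempt -> {
            if (attempt < 3) {
                throw new RegionNotServing("region offline, attempt " + attempt);
            }
            return "row served on attempt " + attempt;
        }, 10, 10L);
        System.out.println(result);
    }
}
```

If this picture is right, then a daughter region that takes longer to come online than the retry count times the pause would produce exactly the RetriesExhaustedException above, so raising either value on the client might hide the symptom but would not explain why the region was offline for so long.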
2010/12/12 Stack <[email protected]>
> That region was offline a while probably because it was taking a while
> for split daughters to come online. If you do a Get in shell for the
> start row, is it available now? Grep in master for 1291950247501.
> See if you can figure out a history of regions carrying this row. See
> if you see something like a region split that perhaps was taking a
> while. Go take a look at the regionserver log where split was
> happening (The master in its logs should report the location of the
> split). What was going on on this region? Was it struggling? GC
> pause? Swapping?
>
> NotServingRegionException can be part of 'normal' operation. The
> server will throw the client this exception as signal that it should
> recalibrate -- i.e. go back to .META. to find a region's new location
> -- because the region has moved because of split or crash, etc. If
> the NSREs go on for too long, they turn from DEBUG into ERRORs and
> client fails. If loading on cluster is intensive, it can take a while
> for regions to re-online. There could be another issue in the way of
> the region re-onlining. Grepping around in the logs as per above
> should give a clue.
>
> St.Ack
>
> On Thu, Dec 9, 2010 at 10:00 PM, Tao Xie <[email protected]> wrote:
> > hi, all
> >
> > I hit this exception while doing intensive insertions using YCSB. Can
> > anybody give me some clues on this? I use hbase 0.20.6.
> >
> > com.yahoo.ycsb.DBException:
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> > region server -- nothing found, no 'location' returned,
> > tableName=usertable, reload=true -- for region , row 'user1001412274', but
> > failed after 11 attempts.
> > Exceptions:
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> >
> > at com.yahoo.ycsb.db.HBaseClient.cleanup(Unknown Source)
> > at com.yahoo.ycsb.DBWrapper.cleanup(Unknown Source)
> > at com.yahoo.ycsb.ClientThread.run(Unknown Source)
> > Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying
> > to contact region server -- nothing found, no 'location' returned,
> > tableName=usertable, reload=true -- for region , row 'user1001412274', but
> > failed after 11 attempts.
> > Exceptions:
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region usertable,,1291950247501
> >
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:1095)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.access$200(HConnectionManager.java:240)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.getRegionName(HConnectionManager.java:1191)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1168)
> > at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1238)
> > at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
> > ... 3 more
> >
>