Jonathan:
We saw a similar issue using HBase 0.20.6 with HBASE-2473:

Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region HB_INC_POST_0818-ERROR_SAMPLES-1282193650093,,1282193650831
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:726)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
        at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:244)
        at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:206)
        at com.carrieriq.m2m.platform.mmp2.input.StripedHBaseTable.createIfNeeded(StripedHBaseTable.java:470)
        ... 11 more

I assume the region has to be open, otherwise the locateRegionInMeta() call would fail.

After restarting HBase, I see:

2010-08-19 05:08:00,565 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: HB_INC_POST_0818-ERROR_SAMPLES-1282193650093,,1282193650831

Cheers

On Mon, Jun 14, 2010 at 4:25 PM, Jonathan Gray <[email protected]> wrote:

> Can you post the log from the regionserver that did not ever open the
> region (from 12:57 to 13:14)? And actually grab it from a couple minutes
> before 12:57.
>
> Most likely this is not a bug as much as a current limitation of handling
> open/close messages sequentially. It's possible that a long-running close
> (flush) held up processing of the open. The logs will say more.
>
> This should be much improved with the major release of HBase.
>
> JG
>
> > -----Original Message-----
> > From: Jinsong Hu [mailto:[email protected]]
> > Sent: Monday, June 14, 2010 11:24 AM
> > To: [email protected]
> > Subject: bug report: opening hbase region takes too long, making the
> > region not available for more than 10 minutes.
> >
> > Hi, There:
> >
> > I have found an hbase bug where opening a region takes too long. The
> > client reported an error of no server address. For the region
> > MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773,
> > here is the sequence:
> >
> > Around 12:57, all 8 region servers closed this region.
> > On machine2037, at 12:57:45,812, it received a request to open this
> > region. Usually, a worker thread will immediately honor the request and
> > open this region within seconds, but in this case, the region wasn't
> > open until 13:14:43,341.
> > Around 13:16, all other regionservers received requests to open this
> > region, and the worker thread immediately opened them.
> >
> > So during this time gap from 12:57 to 13:14, the region is not
> > available. And the client logs errors while trying to insert the
> > records.
> >
> > I have read the hbase code. The way hbase solves this problem is by
> > retrying 10 times, waiting 10 seconds in between. Essentially it tries
> > for 100 seconds.
> >
> > In this case, even that 100-second retry won't work at 12:10, because
> > the region was opened way beyond the 100-second interval.
> >
> > This is clearly an hbase bug.
> >
> > Jimmy
> >
> > Here is the client side log:
> >
> > 13:10:03,441 INFO [ClientCnxn] Attempting connection to server zookeeper2.cloud.mydomain.net/10.110.8 52:2181: No server address listed in .META. for region MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > 13:10:03,451 INFO [ClientCnxn] Server connection successful
> >
> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > Here are the regionserver-side logs related to this issue.
> >
> > machine2035:
> > 2010-06-14 12:57:23,452 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:16:37,333 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:16:37,333 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > machine2036:
> > 2010-06-14 12:57:29,312 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:16:05,107 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:16:05,107 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > machine2037:
> > 2010-06-14 12:57:09,986 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 12:57:45,812 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:14:43,341 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > machine2038:
> > 2010-06-14 12:57:25,562 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:15:53,356 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:15:53,356 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > machine2040:
> > 2010-06-14 12:57:14,214 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:15:01,266 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:15:01,266 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > machine2041:
> > 2010-06-14 12:57:44,877 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:15:48,955 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:15:48,955 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> >
> > machine2042:
> > 2010-06-14 12:57:12,500 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:14:58,719 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
> > 2010-06-14 13:14:58,719 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: MyOwnEventTable,2010-06-13 10:33:31\x0922f3563bd43a3c3c044bd1db885f1523,1276457581773
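
To put numbers on the retry window described in the quoted report, here is a minimal sketch that compares the client's total retry time with the open delay seen on machine2037. The retry count and pause (10 retries, 10 seconds apart) are the figures Jinsong quotes from his reading of the client code; they are assumptions here, not values read from any HBase configuration. The function names are hypothetical, and the two timestamps are copied from the machine2037 log lines.

```python
from datetime import datetime, timedelta

# Figures quoted in the report (assumed, not read from an HBase config).
RETRIES = 10
PAUSE_SECONDS = 10

# Timestamp layout used by the regionserver log lines in this thread.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse a timestamp such as '2010-06-14 12:57:45,812'."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


def retry_window(retries: int, pause_seconds: int) -> timedelta:
    """Total time a client keeps trying, given a fixed pause between retries."""
    return timedelta(seconds=retries * pause_seconds)


def open_delay(requested: str, opened: str) -> timedelta:
    """Time between the open request and the worker actually opening the region."""
    return parse_log_time(opened) - parse_log_time(requested)


window = retry_window(RETRIES, PAUSE_SECONDS)
# MSG_REGION_OPEN received vs. Worker: MSG_REGION_OPEN on machine2037.
delay = open_delay("2010-06-14 12:57:45,812", "2010-06-14 13:14:43,341")

print(f"client retry window: {window.total_seconds():.0f} s")
print(f"region open delay:   {delay.total_seconds():.3f} s")
print("retries sufficient: ", delay <= window)
```

With these inputs the open delay comes to roughly 17 minutes, about ten times the 100-second window, which is consistent with the client giving up long before the region came back.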
