It was my fault: I was running 0.20.5 on the same cluster as 0.20.6. Switching back to the 0.20.6 jar on all servers solved the problem.
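
A version mix like this could be caught before a restart by comparing the HBase jar present on every node. Below is a minimal Python sketch, not something taken from HBase itself. It assumes the jars are named like hbase-0.20.6.jar and that the per-host file listings have already been collected by some other means; the host names and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: the HBase jar is named "hbase-<version>.jar", e.g. "hbase-0.20.6.jar".
JAR_PATTERN = re.compile(r"^hbase-(\d+(?:\.\d+)*)\.jar$")


def hbase_versions_by_host(jars_by_host):
    """Return {version: sorted list of hosts} for every HBase jar found.

    jars_by_host maps a host name to the jar file names seen in its lib
    directory (hypothetical input, gathered however is convenient).
    """
    hosts_by_version = defaultdict(list)
    for host, jar_names in jars_by_host.items():
        for name in jar_names:
            match = JAR_PATTERN.match(name)
            if match:
                hosts_by_version[match.group(1)].append(host)
    return {version: sorted(hosts) for version, hosts in hosts_by_version.items()}


def is_mixed(jars_by_host):
    """True when more than one HBase version is present across the hosts."""
    return len(hbase_versions_by_host(jars_by_host)) > 1


if __name__ == "__main__":
    # Hypothetical listing: one node still carries the older jar.
    listing = {
        "grid10": ["hbase-0.20.6.jar", "zookeeper-3.2.2.jar"],
        "grid11": ["hbase-0.20.5.jar", "zookeeper-3.2.2.jar"],
    }
    print(hbase_versions_by_host(listing))
    print("mixed versions:", is_mixed(listing))
```

Any host that appears under a version other than the intended one is the place to swap the jar before starting the master and region servers.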

On Tue, Nov 23, 2010 at 10:52 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:

> So before that first line... everything was fine? What happened there?
>
> On Tue, Nov 23, 2010 at 10:48 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > Here is relevant portion of master log:
> > http://pastebin.com/HiembAxc
> >
> > On Tue, Nov 23, 2010 at 10:37 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> >
> >> Can you dig up to where it started doing that?
> >>
> >> On Tue, Nov 23, 2010 at 10:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> > http://pastebin.com/E86iPnK4
> >> >
> >> > On Tue, Nov 23, 2010 at 10:23 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> >> >
> >> >> Is that really from the master log? Can we get the full log in a pastebin?
> >> >>
> >> >> J-D
> >> >>
> >> >> On Tue, Nov 23, 2010 at 7:40 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> >> > I backed up zookeeper dataDir to another location.
> >> >> > After clearing zookeeper dataDir, HMaster still couldn't start:
> >> >> >
> >> >> > 2010-11-23 15:30:56,095 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 10.202.50.100:60000
> >> >> > 2010-11-23 15:30:56,119 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to read: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/root-region-server
> >> >> > 2010-11-23 15:30:56,120 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Sleeping 5000ms, waiting for root region.
> >> >> > 2010-11-23 15:31:01,125 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to read: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/root-region-server
> >> >> > 2010-11-23 15:31:01,125 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Sleeping 5000ms, waiting for root region.
> >> >> >
> >> >> > Disk isn't full:
> >> >> > /dev/md2 2786058952 186234928 2456017408 8% /
> >> >> >
> >> >> > Comment is appreciated.
> >> >> >
> >> >> > On Tue, Nov 23, 2010 at 5:34 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> >> >
> >> >> >> I tried to restart hbase. But the region server identified by ZNode
> >> >> >> /hbase/root-region-server declared that it is not serving root region:
> >> >> >>
> >> >> >> 2010-11-23 13:26:49,617 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: locateRegionInMeta attempt 1 of 3 failed; retrying after sleep of 5000 because: Timed out trying to locate root region because: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> >> >> >>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2274)
> >> >> >>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1711)
> >> >> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >> >> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >> >>         at java.lang.reflect.Method.invoke(Method.java:597)
> >> >> >>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
> >> >> >>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:998)
> >> >> >>
> >> >> >> 2010-11-23 13:26:54,622 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.111:60020
> >> >> >> 2010-11-23 13:26:54,624 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Root region location changed. Sleeping.
> >> >> >> 2010-11-23 13:26:59,626 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Wake. Retry finding root region.
> >> >> >> 2010-11-23 13:26:59,629 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.111:60020
> >> >> >> 2010-11-23 13:26:59,630 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Root region location changed. Sleeping.
> >> >> >> 2010-11-23 13:27:04,632 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Wake. Retry finding root region.
> >> >> >> 2010-11-23 13:27:04,635 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.111:60020
> >> >> >>
> >> >> >> What should I do next ?
> >> >> >>
> >> >> >> Thanks
> >> >> >>
> >> >> >> On Tue, Nov 23, 2010 at 1:37 AM, Lars George <lars.geo...@gmail.com> wrote:
> >> >> >>
> >> >> >>> Hi Ted,
> >> >> >>>
> >> >> >>> So one of the regions is not being released? Could you try and see
> >> >> >>> from .META. which is still deployed and use the shell's "close_region"
> >> >> >>> to close it while looking at the master and region server logs to see
> >> >> >>> what is going on? Maybe best if you switch the RS to DEBUG level
> >> >> >>> logging first to get some info?
> >> >> >>>
> >> >> >>> Lars
> >> >> >>>
> >> >> >>> On Tue, Nov 23, 2010 at 8:25 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> >> >>> > Hi
> >> >> >>> > We use 0.20.6
> >> >> >>> >
> >> >> >>> > I tried to disable packageindex table. From master log:
> >> >> >>> >
> >> >> >>> > 2010-11-23 07:21:06,326 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,CC7E6FEA4CDCF19C6F4AC9BB51EF6A33,1290230596786 to setClosing list for us01-ciqps1-grid10.carrieriq.com,60020,1290493641949
> >> >> >>> > 2010-11-23 07:21:06,326 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,F2A18967F48C9FDA9C23BF9A8210ED17,1290230394345 to setClosing list for us01-ciqps1-grid11.carrieriq.com,60020,1290493641228
> >> >> >>> > 2010-11-23 07:21:06,326 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,E8FA713B2F030EF012E5AB0A641CB1DB,1290230356969 to setClosing list for us01-ciqps1-grid11.carrieriq.com,60020,1290493641228
> >> >> >>> > 2010-11-23 07:21:06,327 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,5B10CA26DCAEFBFF4A63DB7D0432D628,1290229869191 to setClosing list for us01-ciqps1-grid12.carrieriq.com,60020,1290493641232
> >> >> >>> > 2010-11-23 07:21:20,178 INFO org.apache.hadoop.hbase.master.ServerManager: 15 region servers, 0 dead, average load 123.66666666666667
> >> >> >>> > 2010-11-23 07:21:20,252 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 10.202.50.111:60020, regionname: -ROOT-,,0, startKey: <>}
> >> >> >>> > 2010-11-23 07:21:20,257 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 10.202.50.111:60020, regionname: -ROOT-,,0, startKey: <>} complete
> >> >> >>> > 2010-11-23 07:21:22,838 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 10.202.50.101:60020, regionname: .META.,,1, startKey: <>}
> >> >> >>> > 2010-11-23 07:21:24,731 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 2086 row(s) of meta region {server: 10.202.50.101:60020, regionname: .META.,,1, startKey: <>} complete
> >> >> >>> > 2010-11-23 07:21:24,731 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
> >> >> >>> >
> >> >> >>> > But I always got:
> >> >> >>> > hbase(main):004:0> disable 'packageindex'
> >> >> >>> > NativeException: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table packageindex to be disabled.
> >> >> >>> >
> >> >> >>> > What should I do to disable the table ?
> >> >> >>> >
> >> >> >>> > Thanks