Your DNS is set up wrong:

org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo found for datanode001,60020,1322010465849
That's what that line means. The master did detect that a region server went down, but the name was unknown to it.

J-D

On Mon, Dec 5, 2011 at 10:45 AM, sagar naik <[email protected]> wrote:
> Scenario:
> - regionserver machine was rebooted . AWS random reboot .
> Regionserver logs show shutdown
>
> - Datanode which is on same machine also recvd a kill command
>
> - Master did not migrate the regions. It did detect the node down
>
> - I checked after 6 hours , the hbase was in inconsistent state . (hbck)
>
> - When I restarted the regionserver it was giving me a
> UnknownScannerException
>
> - I ended up, restart master only and then regionserver would start fine
>
> - After about 1 hr of regionserver going down, major compaction
> (croned) kicked in
>
> Question:
>
> Why did regions on the regionserver did not migrate ? am I missing
> something, some config params.
> Most of the config is default except for compaction interval
>
> Thanks
>
>
> REGIONSERVER LOGS:
>
> 2011-12-04 22:19:09,586 INFO
> org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file
> /user/nileus/hbase-storage/.logs/ip-X-X-X-X,60020,1322010465849/ip-10-174-43-151.us-west-1.compute.internal%3A60020.1322967102356
> whose highest sequenceid is 127636589 to
> /user/nileus/hbase-storage/.oldlogs/ip-X-X-x-X%3A60020.1322967102356
> 2011-12-04 22:43:47,291 INFO
> org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
> starting; hbase.shutdown.hook=true;
> fsShutdownHook=Thread[Thread-15,5,main]
> 2011-12-04 22:43:47,291 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown
> hook
>
>
> MASTER LOG:
> 2011-12-04 22:46:39,133 INFO
> org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer
> ephemeral node deleted, processing expiration
> [datanode001,60020,1322010465849]
> 2011-12-04 22:46:39,133 INFO
> org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo
> found for datanode001,60020,1322010465849
> 2011-12-04 22:47:20,828 INFO
> org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
> servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
> 2011-12-04 22:52:20,893 INFO
> org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
> servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
> 2011-12-04 22:57:20,959 INFO
> org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
> servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
> 2
>
>
> POST RESTART REGIONSERVER LOGS
>
> 2011-12-05 07:21:04,589 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -1
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1809)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> 2011-12-05 07:21:14,469 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.UnknownScannerException: Name: -1
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1809)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> Mon Dec 5 07:21:25 PST 2011 Killing regionserver
> 2011-12-05 07:21:25,647 INFO
> org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
> starting; hbase.shutdown.hook=true;
> fsShutdownHook=Thread[Thread-15,5,main]
> 2011-12-05 07:21:25,647 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown
> hook
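
The name mismatch behind the "No HServerInfo found" line can be seen by putting the two server names from the quoted logs side by side. The following is only a minimal sketch: it assumes the usual "host,port,startcode" form of the names as they appear in the logs, and ServerName, parse_server_name and same_process_different_host are hypothetical helpers written for this illustration, not HBase APIs. The region server's own name is taken from its .logs directory path, with the host part redacted by the original poster.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    """Hypothetical holder for the parts of a 'host,port,startcode' string."""
    host: str
    port: int
    startcode: int


def parse_server_name(raw: str) -> ServerName:
    """Split a 'host,port,startcode' string, as printed in the logs, into parts."""
    host, port, startcode = raw.strip().rsplit(",", 2)
    return ServerName(host, int(port), int(startcode))


def same_process_different_host(a: ServerName, b: ServerName) -> bool:
    """True when port and startcode match but the hostname does not."""
    return (a.port, a.startcode) == (b.port, b.startcode) and a.host != b.host


# Name the master logged when the ephemeral node was deleted.
master_view = parse_server_name("datanode001,60020,1322010465849")
# Name the region server used for its .logs directory (host redacted in the post).
regionserver_view = parse_server_name("ip-X-X-X-X,60020,1322010465849")

if same_process_different_host(master_view, regionserver_view):
    print("hostname mismatch:", master_view.host, "vs", regionserver_view.host)
```

Port and startcode are identical in both names and only the hostname differs, which is consistent with one region server process being known under two different names. The sketch does not query DNS itself; checking forward and reverse resolution on each node is a separate step.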
