Scenario:
- The regionserver machine was rebooted (a random AWS reboot).
The regionserver logs show a shutdown.
- The datanode on the same machine also received a kill command.
- The master did not migrate the regions, although it did detect that the node was down.
- When I checked after 6 hours, HBase was in an inconsistent state (according to hbck).
- When I restarted the regionserver, it gave me an UnknownScannerException.
- I ended up restarting only the master, after which the regionserver started fine.
- About 1 hour after the regionserver went down, the major compaction
(run from cron) kicked in.
Question:
Why did the regions on the regionserver not migrate? Am I missing
something, such as some config params?
Most of the config is default, except for the compaction interval.
Thanks
REGIONSERVER LOGS:
2011-12-04 22:19:09,586 INFO
org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file
/user/nileus/hbase-storage/.logs/ip-X-X-X-X,60020,1322010465849/ip-10-174-43-151.us-west-1.compute.internal%3A60020.1322967102356
whose highest sequenceid is 127636589 to
/user/nileus/hbase-storage/.oldlogs/ip-X-X-x-X%3A60020.1322967102356
2011-12-04 22:43:47,291 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
starting; hbase.shutdown.hook=true;
fsShutdownHook=Thread[Thread-15,5,main]
2011-12-04 22:43:47,291 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown
hook
MASTER LOG:
2011-12-04 22:46:39,133 INFO
org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer
ephemeral node deleted, processing expiration
[datanode001,60020,1322010465849]
2011-12-04 22:46:39,133 INFO
org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo
found for datanode001,60020,1322010465849
2011-12-04 22:47:20,828 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
2011-12-04 22:52:20,893 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
2011-12-04 22:57:20,959 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
POST RESTART REGIONSERVER LOGS:
2011-12-05 07:21:04,589 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -1
at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1809)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
2011-12-05 07:21:14,469 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -1
at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1809)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
Mon Dec 5 07:21:25 PST 2011 Killing regionserver
2011-12-05 07:21:25,647 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
starting; hbase.shutdown.hook=true;
fsShutdownHook=Thread[Thread-15,5,main]
2011-12-05 07:21:25,647 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown
hook
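
DETECTION DELAY:
To put a number on how long the master took to notice the dead regionserver, the two timestamps from the logs above can be subtracted. This is only a rough sketch: it assumes the clocks on the regionserver and master hosts are reasonably in sync, and the function names are my own. The result could then be compared against the configured zookeeper.session.timeout.

```python
from datetime import datetime

# Timestamps copied from the logs above (log4j style, comma before milliseconds).
RS_SHUTDOWN = "2011-12-04 22:43:47,291"     # regionserver: "Shutdown hook starting"
MASTER_EXPIRED = "2011-12-04 22:46:39,133"  # master: "ephemeral node deleted"

LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_ts(text):
    """Parse a timestamp in the format used by the HBase logs above."""
    return datetime.strptime(text, LOG_TS_FORMAT)


def detection_delay_seconds(shutdown_ts, expired_ts):
    """Seconds between regionserver shutdown and the master seeing the expiry.

    Assumes both hosts have synchronized clocks; any skew shifts the result.
    """
    return (parse_log_ts(expired_ts) - parse_log_ts(shutdown_ts)).total_seconds()


delay = detection_delay_seconds(RS_SHUTDOWN, MASTER_EXPIRED)
print(f"master noticed the regionserver going away after {delay:.3f} s")
```

So the master did see the ephemeral node disappear a little under three minutes after the shutdown hook ran, which fits the observation that it detected the node going down but did not reassign the regions.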