bq. 2016-05-13 11:56:52,763 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] installed = 1 but only 0 done
Looks like WAL splitting was slow or stalled. Please check region server log to see why. Cheers On Fri, May 13, 2016 at 8:45 AM, Gunnar Tapper <[email protected]> wrote: > Some more info. > > I remove /hbase using hbase zkcli rmr /hbaase. The log messages I provided > occurred after that. This is a HA configuration with two HMasters. > > After sitting in an initializing state for a long time, I end up with: > > hbase(main):001:0> list > TABLE > > > ERROR: Can't get master address from ZooKeeper; znode data == null > > Here is some help for this command: > List all tables in hbase. Optional regular expression parameter could > be used to filter the output. Examples: > > hbase> list > hbase> list 'abc.*' > hbase> list 'ns:abc.*' > hbase> list 'ns:.*' > > > HMaster log node 1: > > 2016-05-13 11:56:36,646 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:41,647 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 
cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:47,647 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:52,648 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > 
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:52,712 FATAL org.apache.hadoop.hbase.master.HMaster: > Failed to become active master > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) > at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) > at > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster: > Master server abort: loaded coprocessors are: [] > 2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown. > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) > at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) > at > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,720 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled > exception. Starting shutdown. 
> 2016-05-13 11:56:52,720 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer > 2016-05-13 11:56:52,722 INFO org.mortbay.log: Stopped > [email protected]:60010 > 2016-05-13 11:56:52,759 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,759 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,760 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 
4 more > 2016-05-13 11:56:52,761 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,761 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,761 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,761 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,763 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,763 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,763 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,763 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 
4 more > 2016-05-13 11:56:52,764 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,764 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,763 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 
4 more > 2016-05-13 11:56:52,763 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 4 more > 2016-05-13 11:56:52,766 INFO org.apache.hadoop.hbase.master.CatalogJanitor: > CatalogJanitor-ip-172-31-50-109:60000 exiting > 2016-05-13 11:56:52,765 INFO > org.apache.hadoop.hbase.master.balancer.ClusterStatusChore: > ip-172-31-50-109.ec2.internal,60000,1463139946544-ClusterStatusChore > exiting > 2016-05-13 11:56:52,765 INFO > org.apache.hadoop.hbase.master.balancer.BalancerChore: > ip-172-31-50-109.ec2.internal,60000,1463139946544-BalancerChore exiting > 2016-05-13 11:56:52,765 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,822 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server > ip-172-31-50-109.ec2.internal,60000,1463139946544 > 2016-05-13 11:56:52,822 INFO > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: > Closing zookeeper sessionid=0x254a9ee1aab0007 > 2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x254a9ee1aab0007 closed > 2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,824 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server > ip-172-31-50-109.ec2.internal,60000,1463139946544; all regions closed. 
> 2016-05-13 11:56:52,825 INFO > org.apache.hadoop.hbase.master.cleaner.LogCleaner: > ip-172-31-50-109:60000.oldLogCleaner exiting > 2016-05-13 11:56:52,825 INFO > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: > ip-172-31-50-109:60000.archivedHFileCleaner exiting > 2016-05-13 11:56:52,825 INFO > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping > replicationLogCleaner-0x154a9ee1aab002c, > > quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, > baseZNode=/hbase > 2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x154a9ee1aab002c closed > 2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,828 INFO > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: > Closing zookeeper sessionid=0x354a9ee1ab10012 > 2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x354a9ee1ab10012 closed > 2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,830 INFO > org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: > > ip-172-31-50-109.ec2.internal,60000,1463139946544.splitLogManagerTimeoutMonitor > exiting > 2016-05-13 11:56:52,830 INFO > org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: > stop: server shutting down. > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > Stopping server on 60000 > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > RpcServer.listener,port=60000: stopping > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > RpcServer.responder: stopped > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > RpcServer.responder: stopping > 2016-05-13 11:56:52,833 INFO > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node > /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463139946544 already > deleted, retry=false > 2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x254a9ee1aab0005 closed > 2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,834 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server > ip-172-31-50-109.ec2.internal,60000,1463139946544; zookeeper connection > closed. 
> 2016-05-13 11:56:52,841 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting > [trafodion@ip-172-31-50-109 hbase]$ clear > > [trafodion@ip-172-31-50-109 hbase]$ tail -n 150 *MASTER*.log.out > 2016-05-13 11:56:21,643 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:26,644 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done 
= 0 error = 0} > 2016-05-13 11:56:31,645 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:36,646 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:41,647 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > 
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:47,647 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:52,648 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 0 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140497694 last_version = 11 cur_worker_name = > 
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = 1463140498292 last_version = 9 cur_worker_name = > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = 1463140498292 last_version = 8 cur_worker_name = > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140497663 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:56:52,712 FATAL org.apache.hadoop.hbase.master.HMaster: > Failed to become active master > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) > at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) > at > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster: > Master server abort: loaded coprocessors are: [] > 2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown. > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) > at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) > at > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,720 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled > exception. Starting shutdown. 
> 2016-05-13 11:56:52,720 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer > 2016-05-13 11:56:52,722 INFO org.mortbay.log: Stopped > [email protected]:60010 > 2016-05-13 11:56:52,759 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,759 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,760 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 
4 more > 2016-05-13 11:56:52,761 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,761 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,761 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,761 WARN > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for > log splits to be completed > 2016-05-13 11:56:52,763 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,763 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,763 WARN > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] > installed = 1 but only 0 done > 2016-05-13 11:56:52,763 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 
4 more > 2016-05-13 11:56:52,764 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,764 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,763 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 
4 more > 2016-05-13 11:56:52,763 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: failed log splitting for > ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: error or interrupted while splitting logs > in > > [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] > Task = installed = 1 done = 0 error = 0 > at > > org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) > at > > org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) > ... 4 more > 2016-05-13 11:56:52,766 INFO org.apache.hadoop.hbase.master.CatalogJanitor: > CatalogJanitor-ip-172-31-50-109:60000 exiting > 2016-05-13 11:56:52,765 INFO > org.apache.hadoop.hbase.master.balancer.ClusterStatusChore: > ip-172-31-50-109.ec2.internal,60000,1463139946544-ClusterStatusChore > exiting > 2016-05-13 11:56:52,765 INFO > org.apache.hadoop.hbase.master.balancer.BalancerChore: > ip-172-31-50-109.ec2.internal,60000,1463139946544-BalancerChore exiting > 2016-05-13 11:56:52,765 ERROR > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while > processing event M_SERVER_SHUTDOWN > java.io.IOException: Server is stopped > at > > org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) > at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:56:52,822 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server > ip-172-31-50-109.ec2.internal,60000,1463139946544 > 2016-05-13 11:56:52,822 INFO > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: > Closing zookeeper sessionid=0x254a9ee1aab0007 > 2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x254a9ee1aab0007 closed > 2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,824 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server > ip-172-31-50-109.ec2.internal,60000,1463139946544; all regions closed. 
> 2016-05-13 11:56:52,825 INFO > org.apache.hadoop.hbase.master.cleaner.LogCleaner: > ip-172-31-50-109:60000.oldLogCleaner exiting > 2016-05-13 11:56:52,825 INFO > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: > ip-172-31-50-109:60000.archivedHFileCleaner exiting > 2016-05-13 11:56:52,825 INFO > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping > replicationLogCleaner-0x154a9ee1aab002c, > > quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, > baseZNode=/hbase > 2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x154a9ee1aab002c closed > 2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,828 INFO > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: > Closing zookeeper sessionid=0x354a9ee1ab10012 > 2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x354a9ee1ab10012 closed > 2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,830 INFO > org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: > > ip-172-31-50-109.ec2.internal,60000,1463139946544.splitLogManagerTimeoutMonitor > exiting > 2016-05-13 11:56:52,830 INFO > org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: > stop: server shutting down. > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > Stopping server on 60000 > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > RpcServer.listener,port=60000: stopping > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > RpcServer.responder: stopped > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: > RpcServer.responder: stopping > 2016-05-13 11:56:52,833 INFO > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node > /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463139946544 already > deleted, retry=false > 2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ZooKeeper: Session: > 0x254a9ee1aab0005 closed > 2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ClientCnxn: EventThread > shut down > 2016-05-13 11:56:52,834 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server > ip-172-31-50-109.ec2.internal,60000,1463139946544; zookeeper connection > closed. 
> 2016-05-13 11:56:52,841 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: > master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting > > > HMaster log node 2: > > 2016-05-13 11:51:16,362 INFO > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned > = 2 > > tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update > = 1463140223415 last_version = 8 cur_worker_name = > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress > incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update > = -1 last_version = 5 cur_worker_name = null status = in_progress > incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update > = -1 last_version = 4 cur_worker_name = null status = in_progress > incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0, > > /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update > = 1463140222405 last_version = 5 cur_worker_name = > ip-172-31-61-36.ec2.internal,60020,1463139946328 status = in_progress > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} > 2016-05-13 11:51:17,050 FATAL org.apache.hadoop.hbase.master.HMaster: > Failed to become active master > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) > at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) > at > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:51:17,057 FATAL org.apache.hadoop.hbase.master.HMaster: > Master server abort: loaded coprocessors are: [] > 2016-05-13 11:51:17,058 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown. > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) > at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) > at > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) > at java.lang.Thread.run(Thread.java:745) > 2016-05-13 11:51:17,058 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled > exception. Starting shutdown. 
> 2016-05-13 11:51:17,058 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2016-05-13 11:51:17,059 INFO org.mortbay.log: Stopped [email protected]:60010
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] installed = 1 but only 0 done
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] installed = 1 but only 0 done
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] installed = 1 but only 0 done
> 2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] installed = 1 but only 0 done
> 2016-05-13 11:51:17,124 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] Task = installed = 1 done = 0 error = 0
>   at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
>   ... 4 more
> 2016-05-13 11:51:17,126 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: Server is stopped
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2016-05-13 11:51:17,125 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] Task = installed = 1 done = 0 error = 0
>   at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
>   ... 4 more
> 2016-05-13 11:51:17,125 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] Task = installed = 1 done = 0 error = 0
>   at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
>   ... 4 more
> 2016-05-13 11:51:17,124 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] Task = installed = 1 done = 0 error = 0
>   at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
>   ... 4 more
> 2016-05-13 11:51:17,128 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: Server is stopped
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2016-05-13 11:51:17,127 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: Server is stopped
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2016-05-13 11:51:17,127 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: Server is stopped
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2016-05-13 11:51:17,141 INFO org.apache.hadoop.hbase.master.balancer.BalancerChore: ip-172-31-54-241.ec2.internal,60000,1463139946494-BalancerChore exiting
> 2016-05-13 11:51:17,141 INFO org.apache.hadoop.hbase.master.balancer.ClusterStatusChore: ip-172-31-54-241.ec2.internal,60000,1463139946494-ClusterStatusChore exiting
> 2016-05-13 11:51:17,143 INFO org.apache.hadoop.hbase.master.CatalogJanitor: CatalogJanitor-ip-172-31-54-241:60000 exiting
> 2016-05-13 11:51:17,160 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-54-241.ec2.internal,60000,1463139946494
> 2016-05-13 11:51:17,160 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x254a9ee1aab0006
> 2016-05-13 11:51:17,162 INFO org.apache.zookeeper.ZooKeeper: Session: 0x254a9ee1aab0006 closed
> 2016-05-13 11:51:17,162 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2016-05-13 11:51:17,162 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-54-241.ec2.internal,60000,1463139946494; all regions closed.
> 2016-05-13 11:51:17,163 INFO org.apache.hadoop.hbase.master.cleaner.HFileCleaner: ip-172-31-54-241:60000.archivedHFileCleaner exiting
> 2016-05-13 11:51:17,163 INFO org.apache.hadoop.hbase.master.cleaner.LogCleaner: ip-172-31-54-241:60000.oldLogCleaner exiting
> 2016-05-13 11:51:17,163 INFO org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x154a9ee1aab0021, quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, baseZNode=/hbase
> 2016-05-13 11:51:17,165 INFO org.apache.zookeeper.ZooKeeper: Session: 0x154a9ee1aab0021 closed
> 2016-05-13 11:51:17,165 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2016-05-13 11:51:17,166 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x154a9ee1aab0020
> 2016-05-13 11:51:17,167 INFO org.apache.zookeeper.ZooKeeper: Session: 0x154a9ee1aab0020 closed
> 2016-05-13 11:51:17,167 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2016-05-13 11:51:17,167 INFO org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: ip-172-31-54-241.ec2.internal,60000,1463139946494.splitLogManagerTimeoutMonitor exiting
> 2016-05-13 11:51:17,167 INFO org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: stop: server shutting down.
> 2016-05-13 11:51:17,167 INFO org.apache.hadoop.hbase.ipc.RpcServer: Stopping server on 60000
> 2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.listener,port=60000: stopping
> 2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: stopped
> 2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: stopping
> 2016-05-13 11:51:17,170 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/rs/ip-172-31-54-241.ec2.internal,60000,1463139946494 already deleted, retry=false
> 2016-05-13 11:51:17,172 INFO org.apache.zookeeper.ZooKeeper: Session: 0x354a9ee1ab10005 closed
> 2016-05-13 11:51:17,172 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2016-05-13 11:51:17,172 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-54-241.ec2.internal,60000,1463139946494; zookeeper connection closed.
> 2016-05-13 11:51:17,172 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: master/ip-172-31-54-241.ec2.internal/172.31.54.241:60000 exiting
>
> On Fri, May 13, 2016 at 1:17 AM, Gunnar Tapper <[email protected]> wrote:
>
> > Hi,
> >
> > I'm doing some development testing with Apache Trafodion running HBase Version 1.0.0-cdh5.4.5.
> >
> > All of a sudden, HBase has started to crash. First, it could not be recovered until I changed hbase_master_distributed_log_splitting to false. At that point, HBase restarted and sat happily idling for 1 hour. Then, I started Trafodion letting it sit idling for 1 hour.
> >
> > I then started a workload and all RegionServers came crashing down. Looking at the log files, I suspected ZooKeeper issues so I restarted ZooKeeper and then HBase.
> > Now, the HMaster fails with:
> >
> > 2016-05-13 07:13:52,521 INFO org.apache.hadoop.hbase.master.RegionStates: Transition {a33adb83f77095913adb4701b01c09a0 state=PENDING_OPEN, ts=1463123333157, server=ip-172-31-50-109.ec2.internal,60020,1463122925684} to {a33adb83f77095913adb4701b01c09a0 state=OPENING, ts=1463123632517, server=ip-172-31-50-109.ec2.internal,60020,1463122925684}
> > 2016-05-13 07:13:52,527 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x354a8eaea3e007d, quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, baseZNode=/hbase Unable to list children of znode /hbase/region-in-transition
> > java.lang.InterruptedException
> >   at java.lang.Object.wait(Native Method)
> >   at java.lang.Object.wait(Object.java:503)
> >   at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> >   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1466)
> >   at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getChildren(RecoverableZooKeeper.java:296)
> >   at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:518)
> >   at org.apache.hadoop.hbase.master.AssignmentManager$5.run(AssignmentManager.java:1420)
> >   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >   at java.lang.Thread.run(Thread.java:745)
> > 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: stop: server shutting down.
> > 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.ipc.RpcServer: Stopping server on 60000
> > 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.listener,port=60000: stopping
> > 2016-05-13 07:13:52,528 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: stopped
> > 2016-05-13 07:13:52,528 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: stopping
> > 2016-05-13 07:13:52,532 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
> > java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@33d4a2bd rejected from java.util.concurrent.ThreadPoolExecutor@4d0840e0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 38681]
> >   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> >   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> >   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
> >   at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
> >   at org.apache.hadoop.hbase.master.AssignmentManager.zkEventWorkersSubmit(AssignmentManager.java:1285)
> >   at org.apache.hadoop.hbase.master.AssignmentManager.handleAssignmentEvent(AssignmentManager.java:1479)
> >   at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:1244)
> >   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:458)
> >   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> >   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> > 2016-05-13 07:13:52,533 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463122925543 already deleted, retry=false
> > 2016-05-13 07:13:52,534 INFO org.apache.zookeeper.ZooKeeper: Session: 0x354a8eaea3e007d closed
> > 2016-05-13 07:13:52,534 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-50-109.ec2.internal,60000,1463122925543; zookeeper connection closed.
> > 2016-05-13 07:13:52,534 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting
> > 2016-05-13 07:13:52,534 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> >
> > Suggestions on how to move forward so that I can recover this system?
> >
> > --
> > Thanks,
> >
> > Gunnar
> > *If you think you can you can, if you think you can't you're right.*
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
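
For reference, the setting described above as hbase_master_distributed_log_splitting corresponds to the HBase property hbase.master.distributed.log.splitting; the underscore form appears to be how Cloudera Manager names the same knob. With the property set to false, the active HMaster splits WALs itself instead of handing splitWAL tasks to region servers, which is why turning it off can unblock a master that is stuck waiting on region servers to finish split tasks. A minimal sketch of checking and overriding it outside of CM, assuming a CDH-style layout with the client configuration under /etc/hbase/conf (the paths are typical defaults and may differ on your cluster):

    # Check whether distributed log splitting is currently overridden.
    grep -A1 'hbase.master.distributed.log.splitting' /etc/hbase/conf/hbase-site.xml

    # To disable it directly in hbase-site.xml, the property would look like:
    #   <property>
    #     <name>hbase.master.distributed.log.splitting</name>
    #     <value>false</value>
    #   </property>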
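
The warnings above also name the exact -splitting directories the master is waiting on, and the corresponding distributed split tasks live under /hbase/splitWAL in ZooKeeper, so both can be inspected directly. A minimal sketch, assuming the hbase and hdfs CLIs are on the PATH and taking the directory name verbatim from the log:

    # List the outstanding distributed split tasks the master is blocked on.
    hbase zkcli ls /hbase/splitWAL

    # Look at the WAL files behind one of the stalled tasks; the directory
    # name is copied from the "error while splitting logs in [...]" warning.
    hdfs dfs -ls 'hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting'

If the directories still contain WAL files, the splits really are outstanding; if they are empty, what remains is most likely just stale task state in ZooKeeper.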
