Clay B. created HBASE-17287:
-------------------------------

             Summary: Master becomes a zombie if filesystem object closes
                 Key: HBASE-17287
                 URL: https://issues.apache.org/jira/browse/HBASE-17287
             Project: HBase
          Issue Type: Bug
          Components: master
            Reporter: Clay B.

We have seen an issue where, if HDFS is unstable and the HBase master's HDFS client cannot stabilize before {{dfs.client.failover.max.attempts}} is exhausted, the master's filesystem object closes. The result appears to be an HBase master that keeps running (the process and its znode still exist) but can do no meaningful work (e.g. assigning meta).

What we saw in our HBase master logs was:
{code}
2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
	at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Filesystem closed
{code}
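
To illustrate how the condition in the log above could be recognized, here is a minimal sketch in plain Java. It walks an exception's cause chain looking for an IOException whose message is "Filesystem closed", as in the "Caused by" line. The class FilesystemClosedDetector and its method are hypothetical names invented for this example; they are not part of HBase or Hadoop, and this is not a proposed patch. Matching on the message string is an assumption based only on the log shown here and may be brittle.

```java
import java.io.IOException;

// Hypothetical helper for illustration only; not an HBase or Hadoop class.
final class FilesystemClosedDetector {
    // Message taken from the "Caused by" line in the master log.
    static final String CLOSED_MESSAGE = "Filesystem closed";

    private FilesystemClosedDetector() {}

    // Returns true if t, or any of its causes, is an IOException whose
    // message is exactly "Filesystem closed".
    static boolean isFilesystemClosed(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof IOException && CLOSED_MESSAGE.equals(t.getMessage())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Mirrors the shape of the logged exception: an outer "will retry"
        // failure wrapping the closed-filesystem cause.
        IOException wrapped = new IOException(
                "failed log splitting, will retry",
                new IOException(CLOSED_MESSAGE));
        System.out.println(isFilesystemClosed(wrapped));
        System.out.println(isFilesystemClosed(new IOException("connection reset")));
    }
}
```

A caller that retries on failure could use such a check to tell a retryable error from one where the filesystem object is gone and retrying cannot succeed. What the master should do in that case (for example, abort so a backup master can take over) is left open here.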