force regionserver to halt
--------------------------
Key: HBASE-3814
URL: https://issues.apache.org/jira/browse/HBASE-3814
Project: HBase
Issue Type: Bug
Reporter: Prakash Khemani
Once abort() on a regionserver is called, we should have a timeout thread that
calls Runtime.halt() if the RS gets stuck somewhere during abort processing.
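A minimal sketch of such a watchdog thread (the class name, the timeout, and
the exit code are illustrative assumptions, not part of this issue):
{code}
// Hypothetical sketch of the proposed abort watchdog; the class name,
// the timeout value, and the exit code are assumptions for illustration.
public class AbortTimeoutThread extends Thread {
  private final long timeoutMillis;

  public AbortTimeoutThread(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
    setDaemon(true);  // must not keep an otherwise-dead JVM alive
    setName("AbortTimeoutThread");
  }

  @Override
  public void run() {
    try {
      Thread.sleep(timeoutMillis);  // give abort() this long to finish
    } catch (InterruptedException e) {
      return;  // abort completed in time; watchdog cancelled
    }
    // Abort is stuck (e.g. blocked flushing logs against a wedged
    // filesystem); skip shutdown hooks and kill the process outright.
    Runtime.getRuntime().halt(1);
  }
}
{code}
Runtime.halt(), unlike System.exit(), does not run shutdown hooks, so the
watchdog itself cannot block on the same wedged resources.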
===
Pumahbase132 has the following logs: the DFSClient is not able to set up a
write pipeline successfully, so the RS tries to abort, but while aborting it
gets stuck. I know there is a check that skips flushing the logs during abort
when we are aborting because the filesystem is closed. But in this case the fs
is up and running; it just isn't functioning.
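The existing guard is roughly of this shape (a hedged sketch; names like
isFilesystemClosed() and syncWal() are illustrative assumptions), and it only
helps when the filesystem is actually closed:
{code}
// Hedged sketch of the abort-path check described above; the method
// names are assumptions for illustration, not the actual HBase code.
void abort(String why, Throwable cause) {
  if (!isFilesystemClosed(cause)) {
    // The fs looks "up", so we try to flush the WAL before shutting
    // down -- but if HDFS is reachable yet not functioning (as in the
    // logs below), this call can block indefinitely and wedge abort().
    syncWal();
  }
  stopAllServices();
}
{code}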
2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
createBlockOutputStream 10.38.131.53:50010 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
java.io.IOException: Bad connect ack with firstBadLink 10.38.133.33:50010
2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block
blk_-8967376451767492285_6537229 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
createBlockOutputStream 10.38.131.53:50010 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
java.io.IOException: Bad connect ack with firstBadLink 10.38.134.59:50010
2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block
blk_7172251852699100447_6537229 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
createBlockOutputStream 10.38.131.53:50010 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
java.io.IOException: Bad connect ack with firstBadLink 10.38.134.53:50010
2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block
blk_-9153204772467623625_6537229 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
createBlockOutputStream 10.38.131.53:50010 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
java.io.IOException: Bad connect ack with firstBadLink 10.38.134.49:50010
2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block
blk_-2513098940934276625_6537229 for file
/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3560)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2720)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2977)
2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery
for block blk_-2513098940934276625_6537229 bad datanode[1] nodes == null
2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Could not get
block locations. Source file
"/PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280"
- Aborting...
2011-04-21 23:48:07,216 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog:
Could not append. Requesting close of hlog
And then the RS gets stuck trying to roll the logs ...
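With the proposed watchdog armed at the top of abort(), a hang like that log
roll is bounded (a sketch of the wiring; the helper names and the 60-second
budget are assumptions):
{code}
// Hypothetical wiring inside HRegionServer.abort(); names are assumptions.
AbortTimeoutThread watchdog = new AbortTimeoutThread(60 * 1000L); // 60s budget
watchdog.start();
try {
  closeWal();           // may wedge, as in the log-roll hang above
  closeUserRegions();
} finally {
  watchdog.interrupt(); // cleanup finished in time; cancel the halt
}
{code}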