[ https://issues.apache.org/jira/browse/HDFS-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420919#comment-13420919 ]

Eli Collins commented on HDFS-3703:
-----------------------------------

Btw the formula for determining whether a DN is dead is {{2 * dfs.namenode.heartbeat.recheck-interval + 10 * 1000 * dfs.heartbeat.interval}}; the defaults are 5 * 60 * 1000 (milliseconds) and 3 (seconds) respectively. Note that it's a large value by default for a reason, e.g. so a network hiccup doesn't introduce a replication storm that results in cascading failures. Would a "decommission DN immediately" NN API suffice for your use case?

> Decrease the datanode failure detection time
> --------------------------------------------
>
>                 Key: HDFS-3703
>                 URL: https://issues.apache.org/jira/browse/HDFS-3703
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, name-node
>    Affects Versions: 1.0.3, 2.0.0-alpha
>            Reporter: nkeywal
>
> By default, if a box dies, the datanode will be marked as dead by the
> namenode after 10:30 minutes. In the meantime, this datanode will still be
> proposed by the namenode to write blocks or to read replicas. The same
> happens if the datanode crashes: there are no shutdown hooks to tell the
> namenode we're not there anymore.
> It is especially an issue with HBase. The HBase regionserver timeout for
> production is often 30s. So with these configs, when a box dies HBase starts
> to recover after 30s and, for 10 minutes, the namenode will consider the
> blocks on the same box as available. Beyond the write errors, this will
> trigger a lot of missed reads:
> - during the recovery, HBase needs to read the blocks used on the dead box
> (the ones in the 'HBase Write-Ahead-Log')
> - after the recovery, reading these data blocks (the 'HBase region') will
> fail 33% of the time with the default number of replicas, slowing down data
> access, especially when the errors are socket timeouts (i.e. around 60s most
> of the time).
> Globally, it would be ideal if the HDFS timeout settings could be lower than
> the HBase ones.
> As a side note, HBase relies on ZooKeeper to detect regionserver issues.
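
As a quick sanity check, the formula in the comment can be evaluated against the 10:30 figure in the issue description. The sketch below is only illustrative arithmetic, not NameNode code: the function names are made up for this example, and the second set of values is a hypothetical tuning, not a recommendation.

```python
# Defaults quoted in the comment: recheck interval in ms, heartbeat interval in s.
DEFAULT_RECHECK_INTERVAL_MS = 5 * 60 * 1000  # dfs.namenode.heartbeat.recheck-interval
DEFAULT_HEARTBEAT_INTERVAL_S = 3             # dfs.heartbeat.interval


def dead_node_timeout_ms(recheck_interval_ms, heartbeat_interval_s):
    """Apply the formula from the comment; the result is in milliseconds."""
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


def format_min_sec(timeout_ms):
    """Render a millisecond duration as M:SS."""
    minutes, seconds = divmod(timeout_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"


default_timeout = dead_node_timeout_ms(
    DEFAULT_RECHECK_INTERVAL_MS, DEFAULT_HEARTBEAT_INTERVAL_S
)
print(default_timeout, format_min_sec(default_timeout))  # 630000 10:30

# Hypothetical lower settings (assumed values, for illustration only).
tuned_timeout = dead_node_timeout_ms(15 * 1000, 1)
print(tuned_timeout, format_min_sec(tuned_timeout))  # 40000 0:40
```

With the defaults, the result is 630000 ms, which is where the 10:30 in the description comes from. The recheck interval dominates the total, so lowering the heartbeat interval alone changes the detection time very little.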