nkeywal created HDFS-3703:
-----------------------------

             Summary: Decrease the datanode failure detection time
                 Key: HDFS-3703
                 URL: https://issues.apache.org/jira/browse/HDFS-3703
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: data-node, name-node
    Affects Versions: 2.0.0-alpha, 1.0.3
            Reporter: nkeywal


By default, if a box dies, the namenode marks its datanode as dead only after 
10 minutes 30 seconds. In the meantime, the namenode still proposes this 
datanode for block writes and replica reads. The same happens if the datanode 
process crashes: there are no shutdown hooks to tell the namenode we're not 
there anymore.
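
For reference, a minimal sketch of where the 10:30 figure comes from. It 
assumes the namenode expiry rule is 2 x recheck interval + 10 x heartbeat 
interval, with defaults of 5 minutes and 3 seconds. The property names in the 
comments are the 2.x ones as I recall them and should be checked against the 
version in use; the class and method names are made up for illustration.

```java
public class HeartbeatExpiry {
    // Assumed defaults (not stated in this report):
    // dfs.namenode.heartbeat.recheck-interval = 300000 ms (5 minutes)
    static final long RECHECK_INTERVAL_MS = 5 * 60 * 1000L;
    // dfs.heartbeat.interval = 3 seconds
    static final long HEARTBEAT_INTERVAL_S = 3L;

    // Assumed expiry rule: 2 * recheck interval + 10 * heartbeat interval.
    static long expiryMillis(long recheckMs, long heartbeatSeconds) {
        return 2 * recheckMs + 10 * 1000L * heartbeatSeconds;
    }

    public static void main(String[] args) {
        long ms = expiryMillis(RECHECK_INTERVAL_MS, HEARTBEAT_INTERVAL_S);
        // Prints the delay as minutes:seconds.
        System.out.printf("%d:%02d%n", ms / 60000, (ms / 1000) % 60);
    }
}
```

Under these assumptions, lowering either interval shortens the detection 
time, at the cost of more heartbeat traffic or more false positives.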
This is especially an issue with HBase. The HBase regionserver timeout in 
production is often 30s. So with these configs, when a box dies HBase starts 
to recover after 30s and, for 10 minutes, the namenode still considers the 
blocks on that box as available. Beyond the write errors, this will trigger a 
lot of missed reads:
- during the recovery, HBase needs to read the blocks used on the dead box (the 
ones in the 'HBase Write-Ahead-Log')
- after the recovery, reading these data blocks (the 'HBase region') will fail 
33% of the time with the default number of replicas, slowing down data access, 
especially when the errors are socket timeouts (i.e. around 60s most of the 
time).
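
The numbers above can be reproduced with a small sketch. It assumes a 
namenode expiry of 630s, a 30s regionserver timeout, 3 replicas with one on 
the dead box, and a reader that picks a replica uniformly at random (a 
simplification, since the real ordering depends on locality). All names are 
illustrative only.

```java
public class StaleWindow {
    // Seconds during which HBase is already recovering while the namenode
    // still treats the dead datanode as alive.
    static long staleWindowSeconds(long namenodeExpirySeconds,
                                   long regionserverTimeoutSeconds) {
        return Math.max(0L, namenodeExpirySeconds - regionserverTimeoutSeconds);
    }

    // Fraction of first read attempts that hit a dead replica, assuming
    // the replica is chosen uniformly at random (a simplification).
    static double deadReplicaFraction(int replicas, int deadReplicas) {
        return (double) deadReplicas / replicas;
    }

    public static void main(String[] args) {
        long window = staleWindowSeconds(630L, 30L);
        System.out.println("stale window: " + window / 60 + " minutes");
        System.out.printf("failed first attempts: %.0f%%%n",
                100 * deadReplicaFraction(3, 1));
    }
}
```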

Overall, it would be ideal if the HDFS failure detection time could be set 
below the HBase one. 
As a side note, HBase relies on ZooKeeper to detect regionserver failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
