[ https://issues.apache.org/jira/browse/HDFS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HDFS-1234.
-------------------------------

    Resolution: Duplicate

Resolved by HDFS-630

> Datanode 'alive' but with its disk failed, Namenode thinks it's alive
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1234
>                 URL: https://issues.apache.org/jira/browse/HDFS-1234
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
>
> - Summary: Datanode 'alive' but with its disk failed, Namenode still thinks it's alive
>  
> - Setup:
> + Replication = 1
> + # available datanodes = 2
> + # disks / datanode = 1
> + # failures = 1
> + Failure type = bad disk
> + When/where failure happens = first phase of the pipeline
>  
> - Details:
> In this experiment we have two datanodes, each with one disk.
> If one datanode's disk fails while the node itself stays alive, the
> datanode does not keep track of the failure. From the namenode's
> perspective that datanode is still alive, so the namenode keeps handing
> the same datanode back to the client. The client retries 3 times, each
> time asking the namenode for a new set of datanodes, and always gets the
> same datanode. Every time the client tries to write there, it gets an
> exception.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
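
The retry behavior described in the report can be sketched as a small
stand-alone simulation. Everything in it is hypothetical: DataNode,
NameNode, chooseTarget and write are illustrative names, not Hadoop
classes or APIs, and the excluded-node list is only an assumption about
the kind of client-side remedy that avoids the loop, not a description of
the HDFS-630 patch. The retry count of 3 and the two-datanode setup come
from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation; none of these types are Hadoop classes.
class PipelineRetrySketch {

    static final int RETRIES = 3; // retry count taken from the report

    static final class DataNode {
        final String name;
        final boolean diskFailed;

        DataNode(String name, boolean diskFailed) {
            this.name = name;
            this.diskFailed = diskFailed;
        }
    }

    // Tracks liveness only: a node with a failed disk still counts as alive.
    static final class NameNode {
        private final List<DataNode> liveNodes;

        NameNode(List<DataNode> liveNodes) {
            this.liveNodes = liveNodes;
        }

        // Returns the first live node not in the excluded list, or null.
        DataNode chooseTarget(List<DataNode> excluded) {
            for (DataNode dn : liveNodes) {
                if (!excluded.contains(dn)) {
                    return dn;
                }
            }
            return null;
        }
    }

    // Returns the name of the node that accepted the write, or null if the
    // initial attempt and every retry failed.
    static String write(NameNode nn, boolean excludeFailedNodes) {
        List<DataNode> excluded = new ArrayList<>();
        for (int attempt = 0; attempt <= RETRIES; attempt++) {
            DataNode target = nn.chooseTarget(excluded);
            if (target == null) {
                return null; // no candidate left
            }
            if (!target.diskFailed) {
                return target.name; // write succeeded
            }
            // The write failed on a bad disk; optionally remember the node.
            if (excludeFailedNodes) {
                excluded.add(target);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        NameNode nn = new NameNode(Arrays.asList(
                new DataNode("dn1", true),    // alive, disk failed
                new DataNode("dn2", false))); // healthy
        System.out.println("no exclusion:   " + write(nn, false)); // null
        System.out.println("with exclusion: " + write(nn, true));  // dn2
    }
}
```

Without exclusion the client is handed dn1 on every attempt and the write
never succeeds, which matches the report. With exclusion the second
attempt lands on dn2.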

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
