[
https://issues.apache.org/jira/browse/HDFS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dhruba borthakur resolved HDFS-1384.
------------------------------------
Resolution: Duplicate
This bug has been fixed in trunk because the client sends the excluded list to
the namenode with the addBlock RPC. The NN ensures that it does not return a
datanode from the excluded list.
This bug is still present in the 0.20-append branch.
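To illustrate the trunk behaviour described above, here is a minimal sketch of a block-allocation call that carries an excluded list. It is a toy model, not the actual NameNode code: the function name choose_targets, its parameters, and the in-order selection are all assumptions made for the example.

```python
def choose_targets(datanodes, num_replicas, excluded):
    """Toy stand-in for the NN side of addBlock (hypothetical helper).

    Returns up to num_replicas datanodes and never returns a node
    that the client listed in `excluded`.
    """
    excluded = set(excluded)
    candidates = [dn for dn in datanodes if dn not in excluded]
    return candidates[:num_replicas]


# 14 datanodes, dn1..dn14, as in the reported scenario.
datanodes = [f"dn{i}" for i in range(1, 15)]

# The client failed to reach dn1, so it sends dn1 as excluded.
print(choose_targets(datanodes, 3, excluded={"dn1"}))  # ['dn2', 'dn3', 'dn4']
```

Note that this sketch filters only by node. It does not by itself model any rack awareness.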
> NameNode should give client the first node in the pipeline from different
> rack other than that of excludedNodes list in the same rack.
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-1384
> URL: https://issues.apache.org/jira/browse/HDFS-1384
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.20-append, 0.20.1
> Reporter: Thanh Do
>
> We saw a case where the NN keeps giving the client nodes from the same rack,
> which causes an exception on the client when it tries to set up the pipeline.
> The client retries 5 times and then fails.
>
> Here are more details. Suppose we have 2 racks:
> - Rack 0: from dn1 to dn7
> - Rack 1: from dn8 to dn14
> The client asks for 3 dns and the NN replies with dn1, dn8 and dn9, for example.
> Because there is a network partition, the client cannot see any node in Rack 0.
> Hence, the client adds dn1 to its excludedNodes list and asks the NN again.
> Interestingly, the NN picks a different node in Rack 0 (one not in
> excludedNodes) and gives it back to the client, and so on. The client keeps
> retrying, and after 5 retries the write fails.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do ([email protected]) and
> Haryadi Gunawi ([email protected])
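The scenario quoted above can be reproduced with a small simulation. This is a sketch under stated assumptions, not HDFS code: the names pick_first_node and write_block are hypothetical, the NN is modelled as picking the first eligible node in order dn1..dn14 (the real placement policy differs), and the flag avoid_excluded_racks stands in for the behaviour the issue title asks for.

```python
MAX_RETRIES = 5  # the client gives up after 5 attempts, as in the report

# Rack layout from the report: dn1..dn7 on rack 0, dn8..dn14 on rack 1.
RACKS = {f"dn{i}": 0 for i in range(1, 8)}
RACKS.update({f"dn{i}": 1 for i in range(8, 15)})


def pick_first_node(excluded, avoid_excluded_racks):
    """Hypothetical NN choice of the first pipeline node.

    Always skips excluded nodes. If avoid_excluded_racks is set, it also
    skips every rack that already contains an excluded node.
    """
    bad_racks = {RACKS[dn] for dn in excluded} if avoid_excluded_racks else set()
    for dn in RACKS:  # dicts keep insertion order: dn1, dn2, ..., dn14
        if dn not in excluded and RACKS[dn] not in bad_racks:
            return dn
    return None


def write_block(reachable_racks, avoid_excluded_racks):
    """Simulate the client retry loop; returns (succeeded, attempts_used)."""
    excluded = set()
    for attempt in range(1, MAX_RETRIES + 1):
        first = pick_first_node(excluded, avoid_excluded_racks)
        if first is None:
            return False, attempt
        if RACKS[first] in reachable_racks:
            return True, attempt
        excluded.add(first)  # unreachable node: exclude it and ask again
    return False, MAX_RETRIES


# Network partition: the client can reach rack 1 only.
print(write_block({1}, avoid_excluded_racks=False))  # (False, 5)
print(write_block({1}, avoid_excluded_racks=True))   # (True, 2)
```

With node-only exclusion the simulated NN hands out dn1 through dn5 in turn, all on rack 0, and the write fails after 5 attempts. With rack avoidance the second attempt lands on dn8 and succeeds.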