Daryn Sharp created HDFS-5946:
---------------------------------
Summary: Webhdfs DN choosing code is flawed
Key: HDFS-5946
URL: https://issues.apache.org/jira/browse/HDFS-5946
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode, webhdfs
Affects Versions: 3.0.0, 2.4.0
Reporter: Daryn Sharp
Priority: Critical
HDFS-5891 improved the performance of redirecting webhdfs clients to a DN.
Instead of attempting a connection with a 1-minute timeout, the NN skips
decommissioned nodes.
The logic appears flawed. It finds the index of the first decommissioned node,
if any, then:
* Throws an exception if index = 0, even if nodes later in the list are
not decommissioned.
* Otherwise picks a random node from those before the index. For example, with
10 replicas where the 2nd location is decommissioned, all clients will be
redirected to the first location even though there are 8 other valid locations.
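To make the two cases above concrete, here is a minimal, self-contained sketch.
It is not the NameNode code: DnChooser, Node, chooseFlawed and chooseFromLive
are hypothetical names invented for illustration, and chooseFromLive is only
one possible alternative (filter out decommissioned nodes first, then pick
uniformly), not a proposed patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class DnChooser {
    /** Hypothetical stand-in for a replica location; not Hadoop's DatanodeInfo. */
    public static final class Node {
        public final String name;
        public final boolean decommissioned;

        public Node(String name, boolean decommissioned) {
            this.name = name;
            this.decommissioned = decommissioned;
        }
    }

    /**
     * Mirrors the behavior described in this issue: only nodes listed
     * before the first decommissioned node are eligible.
     */
    public static Node chooseFlawed(List<Node> locations, Random rng) {
        int firstDecom = locations.size();
        for (int i = 0; i < locations.size(); i++) {
            if (locations.get(i).decommissioned) {
                firstDecom = i;
                break;
            }
        }
        if (firstDecom == 0) {
            // Thrown even when later nodes in the list are usable.
            throw new IllegalStateException("no usable datanode");
        }
        return locations.get(rng.nextInt(firstDecom));
    }

    /**
     * Assumed alternative: collect every node that is not decommissioned,
     * then pick uniformly among them.
     */
    public static Node chooseFromLive(List<Node> locations, Random rng) {
        List<Node> live = new ArrayList<>();
        for (Node n : locations) {
            if (!n.decommissioned) {
                live.add(n);
            }
        }
        if (live.isEmpty()) {
            throw new IllegalStateException("no usable datanode");
        }
        return live.get(rng.nextInt(live.size()));
    }
}
```

With 10 locations and only the 2nd decommissioned, chooseFlawed can only ever
return the first location, while chooseFromLive spreads clients across all 9
usable ones.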