Daryn Sharp created HDFS-5946:
---------------------------------

             Summary: Webhdfs DN choosing code is flawed
                 Key: HDFS-5946
                 URL: https://issues.apache.org/jira/browse/HDFS-5946
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode, webhdfs
    Affects Versions: 3.0.0, 2.4.0
            Reporter: Daryn Sharp
            Priority: Critical

HDFS-5891 improved the performance of redirecting webhdfs clients to a DN. Instead of attempting a connection with a 1-minute timeout, the NN skips decommissioned nodes.

The logic appears flawed. It finds the index of the first decommissioned node, if any, then:
* Throws an exception if the index is 0, even if other nodes later in the list are not decommissioned.
* Otherwise picks a random node prior to the index.

Suppose there are 10 replicas and the 2nd location is decommissioned. All clients will be redirected to the first location even though there are 8 other valid locations.
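
The behavior described above can be illustrated with a small standalone sketch. This is not the actual NameNode code: `DatanodeChooser`, `Node`, `chooseAsDescribed`, and `chooseAmongLive` are hypothetical names, and the treatment of a list with no decommissioned node (all locations are candidates) is an assumption. The second method shows one possible way to pick among all live locations; it is not necessarily the fix that will be applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class DatanodeChooser {

    /** Hypothetical stand-in for a replica location; not a Hadoop class. */
    public static final class Node {
        public final String name;
        public final boolean decommissioned;

        public Node(String name, boolean decommissioned) {
            this.name = name;
            this.decommissioned = decommissioned;
        }
    }

    /**
     * Selection as described in this report: only the nodes that come
     * before the first decommissioned node are candidates.
     */
    public static Node chooseAsDescribed(List<Node> locations, Random rng) {
        // Assumption: with no decommissioned node, every location is a candidate.
        int firstDecom = locations.size();
        for (int i = 0; i < locations.size(); i++) {
            if (locations.get(i).decommissioned) {
                firstDecom = i;
                break;
            }
        }
        if (firstDecom == 0) {
            // Fails even when later locations are still live.
            throw new IllegalStateException("no candidate before index 0");
        }
        return locations.get(rng.nextInt(firstDecom));
    }

    /** One possible alternative: pick uniformly among all live locations. */
    public static Node chooseAmongLive(List<Node> locations, Random rng) {
        List<Node> live = new ArrayList<>();
        for (Node n : locations) {
            if (!n.decommissioned) {
                live.add(n);
            }
        }
        if (live.isEmpty()) {
            throw new IllegalStateException("no live location for this block");
        }
        return live.get(rng.nextInt(live.size()));
    }
}
```

With 10 locations where only the 2nd is decommissioned, `chooseAsDescribed` can only ever return the first location, while `chooseAmongLive` spreads clients over the 9 live ones. If the first location is the decommissioned one, `chooseAsDescribed` throws even though live replicas remain.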