[ https://issues.apache.org/jira/browse/HDFS-11507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chen Liang resolved HDFS-11507.
-------------------------------
    Resolution: Not A Problem

> NetworkTopology#chooseRandom may run into a dead loop due to race condition
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11507
>                 URL: https://issues.apache.org/jira/browse/HDFS-11507
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>
> {{NetworkTopology#chooseRandom()}} works as follows:
> 1. It counts the number of available nodes as {{availableNodes}}.
> 2. It checks how many nodes are excluded and deducts them from {{availableNodes}}.
> 3. If {{availableNodes}} is still > 0, then there are nodes available.
> 4. It keeps looping until it finds such a node.
> But now imagine that, in the meantime, the actually available nodes are removed
> during step 3 or step 4, and all remaining nodes are excluded nodes. Then,
> although no nodes are actually available any more, the code would still run as if
> {{availableNodes}} > 0, and it would keep picking excluded nodes and loop
> forever, because
> {{if (excludedNodes == null || !excludedNodes.contains(ret))}}
> will always be false.
> We may fix this by expanding the while loop to also include the
> {{availableNodes}} calculation, so that {{availableNodes}} is re-calculated
> every time the loop fails to find an available node.
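The fix proposed in the description (re-calculating {{availableNodes}} inside the loop) can be sketched as below. This is only an illustration of the scenario as reported, not the Hadoop code: the class ChooseRandomSketch, its flat node list, and the helpers countAvailable() and pickAny() are hypothetical stand-ins, and the real NetworkTopology may guard this path differently (the issue was resolved as Not A Problem).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical stand-in for a topology; NOT the Hadoop NetworkTopology class.
final class ChooseRandomSketch {
    private final List<String> nodes = new ArrayList<>();
    private final Random random = new Random();

    synchronized void add(String node) { nodes.add(node); }
    synchronized void remove(String node) { nodes.remove(node); }

    // Steps 1-2: count the nodes, leaving out the excluded ones.
    private synchronized int countAvailable(Set<String> excludedNodes) {
        int available = 0;
        for (String n : nodes) {
            if (excludedNodes == null || !excludedNodes.contains(n)) {
                available++;
            }
        }
        return available;
    }

    // Picks any node at random (possibly an excluded one), or null if empty.
    private synchronized String pickAny() {
        return nodes.isEmpty() ? null : nodes.get(random.nextInt(nodes.size()));
    }

    // Returns a random non-excluded node, or null when none is available.
    String chooseRandom(Set<String> excludedNodes) {
        while (true) {
            // Re-calculated on every iteration, so a concurrent removal of the
            // last available node makes the loop exit instead of spinning.
            int availableNodes = countAvailable(excludedNodes);
            if (availableNodes <= 0) {
                return null;                       // step 3 fails: give up
            }
            String ret = pickAny();                // step 4: try a random node
            if (ret != null
                    && (excludedNodes == null || !excludedNodes.contains(ret))) {
                return ret;
            }
        }
    }

    public static void main(String[] args) {
        ChooseRandomSketch topology = new ChooseRandomSketch();
        topology.add("a");
        topology.add("b");
        Set<String> excluded = new HashSet<>();
        excluded.add("a");
        System.out.println(topology.chooseRandom(excluded)); // prints "b"
        topology.remove("b");                                // only excluded nodes remain
        System.out.println(topology.chooseRandom(excluded)); // prints "null"
    }
}
```

If the count were taken once before the loop, the second call in main() would have no exit condition once "b" is removed between the count and the loop, which is the dead loop the description worries about.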