[ https://issues.apache.org/jira/browse/HADOOP-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated HADOOP-11668:
--------------------------------------
    Attachment: HADOOP-11668-03.patch

-03:
* just switch to a loop entirely. See below.

bq. IMO, best way to solve this is by making hostnames delimited by (,)

Nope, definitely not. The problems definitely start here:

{code}
argv=(${HADOOP_USER_PARAMS[@]/start})
{code}

This construction has two key issues. First, without quotes, the elements of HADOOP_USER_PARAMS will always have their metacharacters expanded. This means an array of 4 elements can become 4+n elements, depending upon what else is in there. So if a user passes:

{code}
--config "my cool dir"
{code}

elements 2, 3, and 4 just got expanded into "my", "cool", and "dir" rather than the single element "my cool dir". So if we change the construction to:

{code}
argv=("${HADOOP_USER_PARAMS[@]/start}")
{code}

that element expansion no longer happens. But now we've introduced a new problem: we're doing a substitution, which turns the matching element empty rather than removing it. This is where the empty parameter problem comes in, because if we had:

{code}
hadoop-daemons.sh --hostnames "1 2" start namenode
{code}

we'd end up with:

argv[0]=--hostnames
argv[1]="1 2"
argv[2]=""
argv[3]="namenode"

after the substitution. Then when we get to "${argv[@]}" on the exec line, it turns into:

hdfs --slaves --daemon "start" --hostnames "1 2" "" namenode

Thus we also need to filter out this empty array element.

So why don't we just switch to using commas here? Because, as evidenced by the above, it doesn't actually fix all the problems with metachar expansion: if any other parameter has them, it's going to blow up in our face. The other problem we've got is backward compatibility. A lot of people use hadoop/yarn-daemons.sh in scripts, and changing this to use commas would be a pretty hefty tax, especially when we know we can fix it another way. One of the goals I had in mind with this code was to avoid a loop.
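The pitfalls above, and the loop-based fix, can be sketched in a few lines of bash. This is a minimal illustration, not the actual patch; the HADOOP_USER_PARAMS contents here are made up for the demo:

{code}
#!/usr/bin/env bash
# Sketch only -- hypothetical array contents, not the real scripts.
HADOOP_USER_PARAMS=("--config" "my cool dir" "start" "namenode")

# (1) Unquoted: "my cool dir" word-splits into three elements, and the
# emptied "start" element silently disappears.
broken=(${HADOOP_USER_PARAMS[@]/start})
echo "${#broken[@]}"     # 5: --config my cool dir namenode

# (2) Quoted: splitting is fixed, but the substitution leaves an empty
# element ("") behind, which would later reach the exec line.
quoted=("${HADOOP_USER_PARAMS[@]/start}")
echo "${#quoted[@]}"     # 4, with quoted[2]=""

# (3) Loop: drop only an element that is exactly start/stop/status and
# keep everything else intact. A hostname that merely *contains* "start"
# survives, unlike with the ${parameter/start} substitution.
argv=()
for param in "${HADOOP_USER_PARAMS[@]}"; do
  case "${param}" in
    start|stop|status) ;;       # the daemon mode word itself
    *) argv+=("${param}") ;;
  esac
done
echo "${#argv[@]}"       # 3: --config "my cool dir" namenode
{code}

With the loop, the empty element is never created in the first place, so nothing needs filtering on the exec line.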
But there's still another problem here: if a hostname contains start, stop, or status, it's going to get removed. Since we already have the loop now to deal with the empty element, we might as well fix that bug too by using a loop rather than cheating. We still have a problem if some other parameter is literally start/stop/status (e.g., --config start), but there's not much we can do about that without building a pretty complex test for which mode we're in.

> start-dfs.sh and stop-dfs.sh no longer works in HA mode after --slaves shell
> option
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-11668
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11668
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts
>            Reporter: Vinayakumar B
>            Assignee: Allen Wittenauer
>         Attachments: HADOOP-11668-01.patch, HADOOP-11668-02.patch, HADOOP-11668-03.patch
>
> After introduction of the "--slaves" option for the scripts, start-dfs.sh and stop-dfs.sh will no longer work in HA mode.
> This is due to multiple hostnames passed for '--hostnames' delimited with spaces.
> These hostnames are treated as commands and the script fails.
> So, instead of delimiting with spaces, delimiting with a comma (,) before passing to hadoop-daemons.sh will solve the problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)