Philip Zeyliger created HADOOP-15019:
----------------------------------------

             Summary: Hadoop shell script classpath de-duping ignores HADOOP_USER_CLASSPATH_FIRST
                 Key: HADOOP-15019
                 URL: https://issues.apache.org/jira/browse/HADOOP-15019
             Project: Hadoop Common
          Issue Type: Bug
          Components: bin
            Reporter: Philip Zeyliger


If a user sets {{HADOOP_USER_CLASSPATH_FIRST=true}} and also lists, via {{HADOOP_CLASSPATH}}, a directory that is already in Hadoop's classpath, that directory will appear later than it should in the eventual {{CLASSPATH}}. I believe this is because the de-duping at https://github.com/apache/hadoop/blob/cbc632d9abf08c56a7fc02be51b2718af30bad28/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh#L1200 ignores the "before/after" parameter: an entry that is already present is left where it is, even when the caller asked for it to be added "before".

To reproduce, first build the following trivial Java program, which prints the classpath it was launched with:

{code}
$ cat Test.java
public class Test {
  public static void main(String[] args) {
    System.out.println(System.getenv().get("CLASSPATH"));
  }
}
$ javac Test.java
$ jar cf test.jar Test.class
{code}

With that, if you happen to have an entry in {{HADOOP_CLASSPATH}} that matches one Hadoop would add itself, you'll find the ordering is not honored. It's easiest to reproduce this with a match for {{HADOOP_CONF_DIR}}, as in the second case below:

{code}
# As you'd expect, /usr/share is first!
$ HADOOP_CONF_DIR=/etc HADOOP_USER_CLASSPATH_FIRST="true" HADOOP_CLASSPATH=/usr/share:/tmp:/bin bin/hadoop jar test.jar Test | tr ':' '\n' | grep -n . | grep '/usr/share'
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
1:/usr/share

# Surprise! /usr/share is now on the 3rd line, even though it was first in HADOOP_CLASSPATH.
$ HADOOP_CONF_DIR=/usr/share HADOOP_USER_CLASSPATH_FIRST="true" HADOOP_CLASSPATH=/usr/share:/tmp:/bin bin/hadoop jar test.jar Test | tr ':' '\n' | grep -n . | grep '/usr/share'
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
3:/usr/share
{code}

To reiterate, what's surprising is that an entry that is first in {{HADOOP_CLASSPATH}} can end up not first in the resulting classpath.

I ran into this while configuring {{bin/hive}} with a conf dir that was being used for both HDFS and Hive, and was flailing trying to work out why my {{log4j2.properties}} wasn't being read. The one in my conf dir was lower in my classpath than one bundled in some Hive jar.
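
For illustration, here is a minimal sketch of de-duping that does honor the "before" parameter. It is not the code in hadoop-functions.sh and not a proposed patch: {{add_classpath_entry}} is a hypothetical name, the helper does no validation of the entry, and the demo assumes that user entries are prepended in reverse order so that they end up in {{HADOOP_CLASSPATH}} order. The idea is simply that a duplicate being added "before" is moved to the front instead of being left in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not the real Hadoop function.
# Usage: add_classpath_entry <entry> [before|after]
add_classpath_entry() {
  local entry=$1
  local position=${2:-after}
  local -a parts kept
  local part

  # Nothing to de-dupe against yet.
  if [[ -z "${CLASSPATH:-}" ]]; then
    CLASSPATH=${entry}
    return 0
  fi

  if [[ "${position}" != "before" ]]; then
    # Appending: an existing copy already sits earlier, so leave it alone.
    if [[ ":${CLASSPATH}:" != *":${entry}:"* ]]; then
      CLASSPATH+=":${entry}"
    fi
    return 0
  fi

  # Prepending: drop any existing copy, then put the entry at the front.
  # read -a splits on ':' without glob-expanding entries such as /lib/*.
  IFS=: read -ra parts <<< "${CLASSPATH}"
  kept=("${entry}")
  for part in "${parts[@]}"; do
    [[ "${part}" == "${entry}" ]] || kept+=("${part}")
  done
  CLASSPATH=$(IFS=:; printf '%s' "${kept[*]}")
}

# Demo: /usr/share is already present (standing in for HADOOP_CONF_DIR),
# and the user entries /usr/share:/tmp:/bin are prepended in reverse order.
CLASSPATH="/usr/share:/opt/lib"
for e in /bin /tmp /usr/share; do
  add_classpath_entry "${e}" before
done
echo "${CLASSPATH}"   # prints /usr/share:/tmp:/bin:/opt/lib
```

With this behavior, the second reproduction case above would report /usr/share on line 1 rather than line 3. Whether moving the existing entry is the right fix, as opposed to, say, leaving a second copy at the front, is a design choice this sketch does not settle.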