Alexandre Fonseca created GIRAPH-814:
----------------------------------------
Summary: Incorrect MapReduce application classpath processing
Key: GIRAPH-814
URL: https://issues.apache.org/jira/browse/GIRAPH-814
Project: Giraph
Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Alexandre Fonseca
Attachments: GIRAPH-814.patch
*Symptom:*
The YARN ApplicationMaster is unable to find MapReduce classes if the user does
not override mapreduce.application.classpath in mapred-site.xml:
{code}(GiraphApplicationMaster.java:main(442)) - GiraphApplicationMaster caught
a t$
java.lang.NoClassDefFoundError:
org/apache/hadoop/mapreduce/lib/output/TextOutputFormat{code}
*Culprit:*
Processing of the default MapReduce application classpath in
addLocalClasspathToEnv(..), YarnUtils.java:180
*Reasoning:*
YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH is defined as an array of
strings as per YarnConfiguration.java:832:
{code} public static final String[] DEFAULT_YARN_APPLICATION_CLASSPATH = {
ApplicationConstants.Environment.HADOOP_CONF_DIR.$(),
ApplicationConstants.Environment.HADOOP_COMMON_HOME.$()
+ "/share/hadoop/common/*",
ApplicationConstants.Environment.HADOOP_COMMON_HOME.$()
+ "/share/hadoop/common/lib/*",
ApplicationConstants.Environment.HADOOP_HDFS_HOME.$()
+ "/share/hadoop/hdfs/*",
ApplicationConstants.Environment.HADOOP_HDFS_HOME.$()
+ "/share/hadoop/hdfs/lib/*",
ApplicationConstants.Environment.HADOOP_YARN_HOME.$()
+ "/share/hadoop/yarn/*",
ApplicationConstants.Environment.HADOOP_YARN_HOME.$()
+ "/share/hadoop/yarn/lib/*" };{code}
MRJobConfig.DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH, by contrast, is defined as
a single comma-separated string as per MRJobConfig.java:679:
{code} public final String
DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH = Shell.WINDOWS ?
"%HADOOP_MAPRED_HOME%\\share\\hadoop\\mapreduce\\*,"
+ "%HADOOP_MAPRED_HOME%\\share\\hadoop\\mapreduce\\lib\\*" :
"$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,"
+ "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*";{code}
However, at YarnUtils.java:190, DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH is
treated as if it were an array of strings, just like the default for
YARN_APPLICATION_CLASSPATH a few lines earlier. The comma-separated string is
not split into entries, so the classpath is incorrect whenever the user relies
on the default value of MAPREDUCE_APPLICATION_CLASSPATH (notice the comma
between the last 2 entries, which should be a colon):
{code}13/12/09 21:54:56 INFO yarn.GiraphYarnClient: Environment for AM :
{CLASSPATH=${CLASSPATH}:./*:$HADOOP_CONF_DIR:
$HADOOP_COMMON_HOME/share/hadoop/common/*:
$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:
$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:
$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:
$HADOOP_YARN_HOME/share/hadoop/yarn/*:
$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*}{code}
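The difference between the two treatments can be sketched in isolation. This is a minimal illustration, not the code from YarnUtils.java or the attached patch: the class ClasspathSketch, its methods, and the DEFAULT_MR_CLASSPATH constant are hypothetical stand-ins, and it assumes the bug amounts to the comma-separated default being used as a single array element.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from YarnUtils.java.
final class ClasspathSketch {
  // Stand-in for the non-Windows value of
  // MRJobConfig.DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH (one comma-separated String).
  static final String DEFAULT_MR_CLASSPATH =
      "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,"
      + "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*";

  // Assumed buggy treatment: the whole String becomes one array element,
  // so the embedded comma survives into the final classpath.
  static String[] wrapAsArray(String value) {
    return new String[] { value };
  }

  // Possible fix: split on commas first, dropping empty entries.
  static String[] splitEntries(String value) {
    List<String> entries = new ArrayList<>();
    for (String entry : value.split(",")) {
      String trimmed = entry.trim();
      if (!trimmed.isEmpty()) {
        entries.add(trimmed);
      }
    }
    return entries.toArray(new String[0]);
  }

  // Joins all entries with ':' after the same prefix seen in the AM log above.
  static String join(String[] yarnEntries, String[] mapredEntries) {
    StringBuilder sb = new StringBuilder("${CLASSPATH}:./*");
    for (String entry : yarnEntries) {
      sb.append(':').append(entry.trim());
    }
    for (String entry : mapredEntries) {
      sb.append(':').append(entry.trim());
    }
    return sb.toString();
  }
}
```

With wrapAsArray the joined result still contains the comma shown in the log; with splitEntries every entry is separated by a colon.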