[
https://issues.apache.org/jira/browse/SPARK-4461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221502#comment-14221502
]
Zhan Zhang edited comment on SPARK-4461 at 11/21/14 10:23 PM:
--------------------------------------------------------------
Thanks for the information, Marcelo. I changed the title to reflect the change.
This issue handles a different problem, but the PR you referred to should also
be fixed.
Currently, there is no way to pass YARN AM-specific Java options. This causes
potential issues when reading the classpath from the Hadoop configuration
files. Hadoop's configuration replaces variables in its property values with
the system properties passed in through the Java options. How those values are
specified depends on the Hadoop distribution.
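To make the substitution concrete, here is a minimal Python sketch of the idea.
It is not Hadoop's code: the function name expand_vars and the version string
2.6.0 are hypothetical, and leaving an unbound ${name} untouched is an
assumption about the behavior.

```python
import re

# Matches ${name} references such as ${hadoop.version}.
_VAR = re.compile(r"\$\{([^}$\s]+)\}")


def expand_vars(value, system_props):
    """Replace each ${name} in value with system_props[name] when it is set."""
    def repl(match):
        name = match.group(1)
        # Assumption: an unbound variable is left as the literal ${name}.
        return system_props.get(name, match.group(0))
    return _VAR.sub(repl, value)


classpath = "/etc/hadoop/${hadoop.version}/mapreduce/*"

# As if -Dhadoop.version=2.6.0 had been passed to the JVM (value is illustrative).
print(expand_vars(classpath, {"hadoop.version": "2.6.0"}))

# Without the Java option the reference cannot be resolved.
print(expand_vars(classpath, {}))
```

The second call shows the failure mode: if the AM never receives the system
property, the classpath entry keeps its placeholder and points at nothing.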
The new options are SPARK_YARN_JAVA_OPTS or spark.yarn.extraJavaOptions. I made
them global Spark settings because, once they are set up in
spark-defaults.conf, we typically don't want users to specify them on the
command line each time they submit a Spark job.
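As a rough sketch of how the two sources could feed the AM launch command (not
what the patch actually does): the precedence of the config key over the
environment variable, the function names, and the class name
example.ApplicationMaster are all assumptions made for illustration.

```python
import shlex


def resolve_am_java_opts(spark_conf, env):
    """Return the extra AM Java options as a list of arguments."""
    # Assumption: the config key takes precedence over the environment variable.
    opts = spark_conf.get("spark.yarn.extraJavaOptions") or env.get(
        "SPARK_YARN_JAVA_OPTS", ""
    )
    return shlex.split(opts)


def build_am_command(spark_conf, env, main_class):
    """Assemble a simplified, hypothetical AM launch command."""
    return ["java"] + resolve_am_java_opts(spark_conf, env) + [main_class]


# Settings as they might be loaded from spark-defaults.conf (value is illustrative).
conf = {"spark.yarn.extraJavaOptions": "-Dhadoop.version=2.6.0"}
print(build_am_command(conf, {}, "example.ApplicationMaster"))
```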
In addition, allowing these extra options to be passed to the AM provides more
flexibility.
For example, in the following valid mapred-site.xml file, the classpath
specifies its value using a system property. Hadoop can handle it correctly
because it has the Java options passed in.
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/etc/hadoop/${hadoop.version}/mapreduce/*</value>
  </property>
In the meantime, we cannot rely on mapreduce.admin.map.child.java.opts in
mapred-site.xml, because it specifies its own extra Java options, which do not
apply to Spark.
was (Author: zzhan):
Thanks for the information Marcelo. I changed the title to reflect the change.
> Pass java options to yarn master to handle system properties correctly.
> -----------------------------------------------------------------------
>
> Key: SPARK-4461
> URL: https://issues.apache.org/jira/browse/SPARK-4461
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Reporter: Zhan Zhang
>
> Currently Spark reads mapred-site.xml to get the classpath. From Hadoop 2.6,
> the library is shipped to the cluster through the distributed cache at run
> time and may not be available on every node manager.
> Instead of relying on mapred-site.xml, Spark should handle this on its own,
> for example through ADD_JARS, SPARK_CLASSPATH, etc.