Sergey Tryuber created MAHOUT-1762:
--------------------------------------

             Summary: Pick up $SPARK_HOME/conf/spark-defaults.conf on startup
                 Key: MAHOUT-1762
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1762
             Project: Mahout
          Issue Type: Wish
          Components: spark
            Reporter: Sergey Tryuber


[spark-defaults.conf|http://spark.apache.org/docs/latest/configuration.html#dynamically-loading-spark-properties]
 is intended to hold the global configuration for a Spark cluster. For example, in 
our HDP2.2 environment it contains:
{noformat}
spark.driver.extraJavaOptions      -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions     -Dhdp.version=2.2.0.0-2041
{noformat}
and there are many other useful settings. A user expects the Spark Shell to 
work out of the box once this file is in place. Unfortunately, this does not 
happen with the Mahout Spark Shell, because it ignores the Spark configuration 
and the user has to copy-paste many options into _MAHOUT_OPTS_.

This happens because 
[org.apache.mahout.sparkbindings.shell.Main|https://github.com/apache/mahout/blob/master/spark-shell/src/main/scala/org/apache/mahout/sparkbindings/shell/Main.scala]
 is executed directly in [initialization 
script|https://github.com/apache/mahout/blob/master/bin/mahout]:
{code}
"$JAVA" $JAVA_HEAP_MAX $MAHOUT_OPTS -classpath "$CLASSPATH" 
"org.apache.mahout.sparkbindings.shell.Main" $@
{code}
In contrast, the Spark shell is invoked indirectly through spark-submit in the 
[spark-shell|https://github.com/apache/spark/blob/master/bin/spark-shell] 
script:
{code}
"$FWDIR"/bin/spark-submit --class org.apache.spark.repl.Main "$@"
{code}
[SparkSubmit|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala]
 contains an additional initialization layer that loads the properties file (see 
the SparkSubmitArguments#mergeDefaultSparkProperties method).
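
As a rough illustration of what such a layer does, the shell sketch below turns the spark.* entries of a defaults file into -D options and skips any key that is already set explicitly, so defaults never override user choices. The function name spark_defaults_to_opts and its two arguments are hypothetical; it assumes the whitespace-separated "key value" form shown above, and that the launched JVM picks up spark.* system properties. It is not the code SparkSubmit itself uses.

```shell
# Hypothetical helper (not part of bin/mahout): print one -Dkey=value option
# per line for each spark.* property in a spark-defaults.conf style file.
# $1 = path to the defaults file
# $2 = options that are already set explicitly (optional), e.g. "$MAHOUT_OPTS"
spark_defaults_to_opts() {
  conf_file="$1"
  existing_opts="${2:-}"
  # A missing file simply means there are no defaults to add.
  [ -f "$conf_file" ] || return 0
  # read puts the first word in key and the rest of the line in value;
  # the || clause handles a last line without a trailing newline.
  while read -r key value || [ -n "$key" ]; do
    case "$key" in
      spark.*) ;;            # keep only Spark properties
      *) continue ;;         # skips blank lines, comments and other keys
    esac
    case " $existing_opts" in
      *" -D$key="*) continue ;;   # an explicitly set value wins over the default
    esac
    printf -- '-D%s=%s\n' "$key" "$value"
  done < "$conf_file"
}
```

The output could be appended to _MAHOUT_OPTS_ before the java invocation, for example from "$SPARK_HOME/conf/spark-defaults.conf". Values that contain spaces would need extra care, because the script expands $MAHOUT_OPTS unquoted.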

So there are two possible solutions:
* use proper Spark-like initialization logic
* use a thin wrapper script, as H2O Sparkling Water does 
([sparkling-shell|https://github.com/h2oai/sparkling-water/blob/master/bin/sparkling-shell])
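
A minimal sketch of the second option, modelled on the spark-shell line quoted above. MAHOUT_SHELL_JAR and the SPARK_SUBMIT override are hypothetical names introduced only for illustration, and whether the Mahout shell Main class runs unchanged under spark-submit is an assumption that would need to be verified.

```shell
# Hypothetical wrapper function; bin/mahout does not define it.
# MAHOUT_SHELL_JAR: assumed path to a jar containing the Mahout shell classes.
# SPARK_SUBMIT: optional override of the launcher, handy for dry runs.
mahout_spark_shell() {
  # Fail early with a clear message if SPARK_HOME is not set.
  : "${SPARK_HOME:?SPARK_HOME must be set}"
  # Delegate to spark-submit so that it loads spark-defaults.conf,
  # then pass all remaining arguments through to the shell.
  "${SPARK_SUBMIT:-$SPARK_HOME/bin/spark-submit}" \
    --class org.apache.mahout.sparkbindings.shell.Main \
    "$MAHOUT_SHELL_JAR" "$@"
}
```

A launcher script built this way would end with a call such as mahout_spark_shell "$@".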




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
