Hi all,

We have a livy.conf with the following:

livy.spark.master = spark://my.master.host:7077
livy.spark.deploy-mode = client

According to 
https://community.hortonworks.com/articles/151164/how-to-submit-spark-application-through-livy-rest.html
, we should be able to override them by setting "conf" within the JSON body, 
like:

"conf": {
  "spark.master": "spark://my.master.host:6066",
  "spark.submit.deployMode": "cluster"
}

However, it seems that the values from livy.conf have higher precedence 
than the values supplied in the JSON's "conf".  In Session.scala:

    val masterConfList = Map(LivyConf.SPARK_MASTER -> livyConf.sparkMaster()) ++
      livyConf.sparkDeployMode().map(LivyConf.SPARK_DEPLOY_MODE -> _).toMap

    conf ++ masterConfList ++ merged

masterConfList's "spark.master" gets its value from livy.conf's 
"livy.spark.master", or "local" if "livy.spark.master" is not set. 
masterConfList's "spark.submit.deployMode" gets its value from livy.conf's 
"livy.spark.deploy-mode", or None if "livy.spark.deploy-mode" is not set.

Ultimately, masterConfList's values override conf's. 
livyConf.sparkMaster() always returns a value, so it will always override 
whatever "spark.master" is in the JSON.  If "livy.spark.deploy-mode" is 
specified in livy.conf, it will always override "spark.submit.deployMode" 
in the JSON as well.  That code was introduced in LIVY-157, and the 
closest JIRA I found about this is 
https://issues.apache.org/jira/browse/LIVY-265.
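To make the precedence concrete, here is a minimal sketch (not Livy code; the 
object name and values are illustrative) of why livy.conf wins: Scala's Map ++ 
keeps the right-hand operand's entry when keys collide, and masterConfList sits 
to the right of conf in the expression above.

```scala
object PrecedenceSketch extends App {
  // hypothetical values, mirroring the request JSON and livy.conf above
  val conf           = Map("spark.master" -> "spark://my.master.host:6066") // from JSON "conf"
  val masterConfList = Map("spark.master" -> "spark://my.master.host:7077") // from livy.conf

  // same shape as `conf ++ masterConfList ++ merged` in Session.scala:
  val effective = conf ++ masterConfList

  println(effective("spark.master")) // prints the livy.conf value (port 7077)
}
```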

Another effect of this is that "livy.spark.master" practically dictates the 
deploy mode one should be using.  While it's possible to specify 
"spark.submit.deployMode" in the JSON body and leave 
"livy.spark.deploy-mode" out of livy.conf, it still has to match the 
deploy mode that the port number in "livy.spark.master" expects.
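For example, assuming the default Spark standalone ports (7077 for the legacy 
submission gateway and 6066 for the REST endpoint that cluster deploy mode 
uses), the consistent pairings would be:

```
# client deploy mode pairs with the legacy standalone port:
livy.spark.master = spark://my.master.host:7077

# cluster deploy mode pairs with the REST submission endpoint:
livy.spark.master = spark://my.master.host:6066
```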

I wonder if this behavior is intentional.  And is there another way to 
override "livy.spark.master" and "livy.spark.deploy-mode"? 

Thanks!



Regards,  
  
 TIN HANG TO  
 IBM Open Data Analytics for z/OS

