[ 
https://issues.apache.org/jira/browse/OOZIE-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nirav patel updated OOZIE-2526:
-------------------------------
    Description: 
Currently the Oozie spark action has a spark-opts element whose contents are 
passed on to `org.apache.spark.deploy.SparkSubmit` as Spark configuration. In 
yarn-client mode this is too late: the driver JVM has in fact already started 
by the time the `org.apache.oozie.action.hadoop.SparkMain` class is invoked, 
because Oozie bypasses the spark-submit.sh script and calls 
org.apache.spark.deploy.SparkSubmit directly. Hence even if the user specifies 
--driver-memory=3g, it has no effect on the running JVM; it is already too 
late. I think the oozie:launcher task, which is itself a parent map-reduce 
job, should launch its map task (the Spark driver) with user-specified JVM 
arguments.
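
For illustration, an action along the following lines shows the situation (a 
sketch only; the application class, jar path and job name are placeholders, 
not taken from this issue):

<action name="spark-job">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-client</master>
        <name>ExampleSparkJob</name>
        <class>com.example.ExampleApp</class>
        <jar>${nameNode}/apps/example/lib/example.jar</jar>
        <!-- In yarn-client mode the launcher's map task JVM is already the
             Spark driver, so this driver-memory setting arrives too late. -->
        <spark-opts>--driver-memory=3g</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>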

The Oozie spark action documentation says:

"The configuration element, if present, contains configuration properties that 
are passed to the Spark job." This should not be Spark configuration; it should 
be map-reduce configuration for the launcher job. I tried the following, but it 
does not get applied to the launcher map-reduce job, which indicates it is 
being passed to Spark.

<configuration>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>8192</value>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx7000m</value>
    </property>
</configuration>
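
A possible workaround, sketched here but not verified against this issue, is 
Oozie's oozie.launcher.* prefix, which applies a property to the launcher 
map-reduce job itself instead of passing it to the launched job:

<configuration>
    <!-- The oozie.launcher. prefix asks Oozie to set these properties on the
         launcher job (whose map task is the Spark driver in yarn-client mode). -->
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>8192</value>
    </property>
    <property>
        <name>oozie.launcher.mapreduce.map.java.opts</name>
        <value>-Xmx7000m</value>
    </property>
</configuration>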



  was:
Currently the Oozie spark action has a spark-opts element whose contents are 
passed on to `org.apache.spark.deploy.SparkSubmit` as Spark configuration. In 
yarn-client mode this is too late: the driver JVM has in fact already started 
by the time the `org.apache.oozie.action.hadoop.SparkMain` class is invoked, 
because Oozie bypasses the spark-submit.sh script and calls 
org.apache.spark.deploy.SparkSubmit directly. Hence even if the user specifies 
--driver-memory=3g, it has no effect on the running JVM; it is already too 
late. I think the oozie:launcher task, which is itself a parent map-reduce 
job, should launch its map task (the Spark driver) with user-specified JVM 
arguments.



> Spark action has no way to specify Spark driver JVM settings for yarn-client 
> mode
> ----------------------------------------------------------------------------------
>
>                 Key: OOZIE-2526
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2526
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: nirav patel
>
> Currently the Oozie spark action has a spark-opts element whose contents are 
> passed on to `org.apache.spark.deploy.SparkSubmit` as Spark configuration. 
> In yarn-client mode this is too late: the driver JVM has in fact already 
> started by the time the `org.apache.oozie.action.hadoop.SparkMain` class is 
> invoked, because Oozie bypasses the spark-submit.sh script and calls 
> org.apache.spark.deploy.SparkSubmit directly. Hence even if the user 
> specifies --driver-memory=3g, it has no effect on the running JVM; it is 
> already too late. I think the oozie:launcher task, which is itself a parent 
> map-reduce job, should launch its map task (the Spark driver) with 
> user-specified JVM arguments.
> The Oozie spark action documentation says:
> "The configuration element, if present, contains configuration properties 
> that are passed to the Spark job." This should not be Spark configuration; 
> it should be map-reduce configuration for the launcher job. I tried the 
> following, but it does not get applied to the launcher map-reduce job, which 
> indicates it is being passed to Spark.
> <configuration>
>     <property>
>         <name>mapreduce.map.memory.mb</name>
>         <value>8192</value>
>     </property>
>     <property>
>         <name>mapreduce.map.java.opts</name>
>         <value>-Xmx7000m</value>
>     </property>
> </configuration>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
