[jira] [Created] (OOZIE-3286) [spark-action] Call SparkLauncher from Oozie server JVM

Andras Piros (JIRA) Wed, 13 Jun 2018 05:52:21 -0700

Andras Piros created OOZIE-3286:
-----------------------------------

             Summary: [spark-action] Call SparkLauncher from Oozie server JVM
                 Key: OOZIE-3286
                 URL: https://issues.apache.org/jira/browse/OOZIE-3286
             Project: Oozie
          Issue Type: New Feature
          Components: action, core
    Affects Versions: 5.0.0
            Reporter: Andras Piros
            Assignee: Andras Piros



This is a major refactor / rework of the Spark action.

Today Oozie Spark actions run as follows:
# {{SparkActionExecutor extends JavaActionExecutor}} (running as part of Oozie 
server code) launches a {{SparkMain}} (running as YARN application) the usual 
way on YARN using {{LauncherAM}}
# {{SparkMain}} runs on a YARN NodeManager container and calls from the same 
JVM 
[*{{SparkSubmit}}*|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala]
# {{SparkSubmit}} fires up a Spark driver:
** in {{local}} or {{yarn-client}} mode in the same YARN NodeManager JVM where 
{{SparkMain}} runs. In {{yarn-client}} mode Spark's 
[*{{ApplicationMaster}}*|https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
 runs inside the same JVM where the driver runs
** [*in {{yarn-cluster}} 
mode*|https://spark.apache.org/docs/latest/running-on-yarn.html#launching-spark-on-yarn]
 Spark driver runs in a different JVM than Spark's ApplicationMaster, maybe on 
a different host. It can go away after the YARN application has been submitted

Problems with this approach are:
* too many levels of indirection cause lots of latency, and a whole new world 
of communication errors can happen
* since {{SparkSubmit}} is launched from the same JVM where {{SparkMain}} runs, 
the Spark application will share all the environment variables, classpath, 
sharelib dependencies etc. with Oozie's Spark launcher code, causing 
hard-to-nail-down environment and classpath issues

The future of Oozie Spark action launching looks like this:
# {{SparkActionExecutor}} (running as part of Oozie server code) launches an 
[*{{InProcessLauncher}}*|https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java]
 (available since Spark 2.3) always in {{yarn-cluster}} mode
# {{InProcessLauncher}} calls either {{InProcessSparkSubmit}} or 
{{SparkSubmit}}. We translate any Spark application modes to {{yarn-cluster}}, 
so that Spark driver will run in a JVM different than the Spark driver
# this allows for much better resource usage, since we have one YARN 
ApplicationMaster (Oozie's {{LauncherAM}}) less
# since the Spark driver and executor are always launched in different JVMs, we 
don't have any interference of environment variables, driver, or executor 
classpath



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (OOZIE-3286) [spark-action] Call SparkLauncher from Oozie server JVM

Reply via email to