[ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317034#comment-16317034
 ] 

Xuefu Zhang commented on HIVE-16484:
------------------------------------

I'd echo [~lirui] in wondering what benefits the proposal brings. I only 
took a brief look at the patch, but from the conversation it seems that 
SparkLauncher doesn't really offer all the advantages listed in the 
description. Rather, it brings uncertainty and possible stability issues to 
Hive.

We have been running HoS with spark-submit in production. While it has some 
imperfections (like launching a dummy process), it works for us. I'd feel 
nervous about switching to a completely different code path for something so 
critical. Moreover, the security-related parts will need more testing at the 
very least.

That said, I'd suggest we keep the existing implementation of Spark job 
submission. If we want to try out SparkLauncher, I think we can use it to 
replace the other code path, where class {{org.apache.spark.deploy.SparkSubmit}} 
is invoked directly (if that makes sense at all).

When SparkLauncher becomes mature and capable of replacing {{bin/spark-submit}} 
with the promised benefits, we can make the switch in a later release, 
hopefully with no impact on Hive on Spark users.

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> --------------------------------------------------------------------
>
>                 Key: HIVE-16484
>                 URL: https://issues.apache.org/jira/browse/HIVE-16484
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-16484.1.patch, HIVE-16484.10.patch, 
> HIVE-16484.2.patch, HIVE-16484.3.patch, HIVE-16484.4.patch, 
> HIVE-16484.5.patch, HIVE-16484.6.patch, HIVE-16484.7.patch, 
> HIVE-16484.8.patch, HIVE-16484.9.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners
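
For readers unfamiliar with the existing code path, the {{bin/spark-submit}} approach described above amounts to roughly the following. This is only a minimal sketch under stated assumptions, not Hive's actual {{SparkClientImpl}} code: the class name, method names, and parameters are hypothetical, and of the many spark-submit options only {{--master}} and {{--class}} are shown.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hive or Spark.
final class SparkSubmitCommandSketch {

    // Builds the argument vector for <sparkHome>/bin/spark-submit.
    static List<String> buildCommand(String sparkHome, String master,
                                     String mainClass, String appJar) {
        if (sparkHome == null || sparkHome.isEmpty()) {
            throw new IllegalArgumentException("SPARK_HOME is not set");
        }
        List<String> argv = new ArrayList<>();
        // Resolve the script relative to the Spark installation directory.
        argv.add(new File(new File(sparkHome, "bin"), "spark-submit").getPath());
        argv.add("--master");
        argv.add(master);
        argv.add("--class");
        argv.add(mainClass);
        argv.add(appJar);
        return argv;
    }

    // Spawns the separate child process that the description refers to.
    static Process launch(List<String> argv) throws IOException {
        return new ProcessBuilder(argv)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
    }
}
```

A caller would typically pass {{System.getenv("SPARK_HOME")}} as the first argument and then read the child's output stream to follow the driver's progress; the extra process and the output handling are the overhead the description hopes to avoid.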



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
