[ https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-10434:
-------------------------------
Fix Version/s: (was: spark-branch)
1.3.0
> Cancel connection when remote Spark driver process has failed [Spark Branch]
> -----------------------------------------------------------------------------
>
> Key: HIVE-10434
> URL: https://issues.apache.org/jira/browse/HIVE-10434
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 1.2.0
> Reporter: Chao Sun
> Assignee: Chao Sun
> Fix For: 1.3.0
>
> Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch,
> HIVE-10434.4-spark.patch
>
>
> Currently in HoS, SparkClientImpl first launches a remote Driver process and
> then waits for it to connect back to HS2. However, in certain situations (for
> instance, a permission issue), the remote process may fail and exit with an
> error code. In that case, HS2 still waits for the process to connect, and only
> throws an exception after the full timeout period has elapsed.
>
> What makes it worse, the user may have to wait through two timeout periods:
> one for SparkSetReducerParallelism, and another for the actual Spark job. This
> can be very annoying.
>
> We should cancel the timeout task as soon as we detect that the process has
> failed, and mark the promise as failed.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)