[ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:
----------------------------
    Attachment: HIVE-10434.4-spark.patch

Addressing RB comments #2.

> Cancel connection when remote Spark driver process has failed [Spark Branch] 
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-10434
>                 URL: https://issues.apache.org/jira/browse/HIVE-10434
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 1.2.0
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>         Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
> HIVE-10434.4-spark.patch
>
>
> Currently in HoS, SparkClientImpl first launches a remote Driver process and 
> then waits for it to connect back to HS2. However, in certain situations 
> (for instance, a permission issue), the remote process may fail and exit 
> with an error code. In this situation, the HS2 process will still wait for 
> the process to connect, and only throws an exception after the full timeout 
> period has elapsed.
> What makes it worse, the user may need to wait through two timeout periods: 
> one for SparkSetReducerParallelism, and another for the actual Spark job. 
> This can be very annoying.
> We should cancel the timeout task as soon as we detect that the process has 
> failed, and mark the promise as failed (a sketch of this idea follows below).


