[jira] [Commented] (HIVE-15860) RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally

Rui Li (JIRA) Tue, 10 Oct 2017 02:50:01 -0700

    [ https://issues.apache.org/jira/browse/HIVE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198434#comment-16198434 ]


Rui Li commented on HIVE-15860:
-------------------------------

Hi [~stakiar], I agree it's good to make QUEUED/SENT fail faster. But I still 
want to avoid the check in "normal" cases because, as you said, each RPC call 
already does the check. Anyway, please feel free to open the JIRA.

> RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally
> -----------------------------------------------------------------
>
>                 Key: HIVE-15860
>                 URL: https://issues.apache.org/jira/browse/HIVE-15860
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rui Li
>            Assignee: Rui Li
>             Fix For: 2.3.0
>
>         Attachments: HIVE-15860.1.patch, HIVE-15860.2.patch, 
> HIVE-15860.2.patch
>
>
> It happens when RemoteDriver crashes between {{JobStarted}} and 
> {{JobSubmitted}}, e.g. when it is killed by {{kill -9}}. RemoteSparkJobMonitor 
> considers the job started, but it can't get the job info because it hasn't 
> received the JobId, so the monitor loops forever.
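
The hang described above can be illustrated with a small, self-contained sketch. This is not the actual Hive patch: `MonitorLoopSketch`, `JobHandle`, `Outcome`, `monitor`, and the `driverAlive` callback are all hypothetical names invented for illustration, and the bounded `maxPolls` loop stands in for a real monitor that would sleep between polls. The sketch only assumes what the issue states: a job can be marked started before its JobId arrives, and if the driver dies in that window the JobId never comes.

```java
import java.util.OptionalInt;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not Hive code: all names here are assumptions.
class MonitorLoopSketch {
    enum Outcome { JOB_ID_RECEIVED, DRIVER_LOST, TIMED_OUT }

    /** Hypothetical client-side view of a submitted job. */
    interface JobHandle {
        boolean isStarted();       // true once a JobStarted message was seen
        OptionalInt sparkJobId();  // empty until a JobSubmitted message was seen
    }

    static Outcome monitor(JobHandle job, BooleanSupplier driverAlive, int maxPolls) {
        for (int poll = 0; poll < maxPolls; poll++) {
            if (job.sparkJobId().isPresent()) {
                // Job info can now be queried; normal monitoring would continue here.
                return Outcome.JOB_ID_RECEIVED;
            }
            if (job.isStarted() && !driverAlive.getAsBoolean()) {
                // Started, no JobId, and the driver is gone: the JobId will never
                // arrive, so fail instead of polling forever.
                return Outcome.DRIVER_LOST;
            }
            // A real monitor would sleep for its polling interval here.
        }
        return Outcome.TIMED_OUT;
    }
}
```

In this sketch the liveness check runs only while the job is started but has no JobId, so the common path, where the JobId has already arrived, never pays for it.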



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
