[ https://issues.apache.org/jira/browse/HIVE-16984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shohei Okumiya updated HIVE-16984:
----------------------------------
    Fix Version/s: NA
       Resolution: Won't Fix
           Status: Resolved  (was: Patch Available)

We have discontinued Hive on Spark and EOLed Hive 3. HIVE-26134

> HoS: avoid waiting for RemoteSparkJobStatus::getAppID() when remote driver died
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-16984
>                 URL: https://issues.apache.org/jira/browse/HIVE-16984
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>             Fix For: NA
>
>      Attachments: HIVE-16984.1.patch
>
>
> In HoS, after a RemoteDriver is launched, it may fail to initialize a Spark
> context, in which case the ApplicationMaster will eventually die. In this
> situation there are two issues related to RemoteSparkJobStatus::getAppID():
> 1. Currently we call {{getAppID()}} before starting the monitoring job. The
> {{getAppID()}} call waits up to {{hive.spark.client.future.timeout}}, and the
> monitoring job waits up to {{hive.spark.job.monitor.timeout}}. The error
> message for the latter presents {{hive.spark.job.monitor.timeout}} as the
> total time spent waiting for job submission. This is inaccurate, because it
> does not include {{hive.spark.client.future.timeout}}.
> 2. If the RemoteDriver dies suddenly, we may still wait hopelessly for both
> timeouts to expire. This could be avoided if we detect that the channel
> between the client and the remote driver has closed.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
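The fail-fast behavior proposed in point 2 can be sketched as follows. This is a minimal illustration, not Hive's actual code: the class `FailFastAppId`, the `onChannelClosed()` hook, and the method names are hypothetical stand-ins for the client-side future behind `getAppID()` and the RPC layer's channel-closed notification.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: the app-ID future is failed as soon as the driver channel closes,
// so a caller blocked in getAppID() errors out immediately instead of
// sleeping through hive.spark.client.future.timeout.
public class FailFastAppId {
    private final CompletableFuture<String> appIdFuture = new CompletableFuture<>();

    // Called when the RemoteDriver reports its application ID.
    public void onAppIdReceived(String appId) {
        appIdFuture.complete(appId);
    }

    // Hypothetical hook, invoked by the RPC layer's channel-closed listener.
    public void onChannelClosed() {
        appIdFuture.completeExceptionally(new IllegalStateException(
            "remote driver channel closed before app ID arrived"));
    }

    // Waits up to the given timeout, but fails immediately if the channel
    // has already closed (the future is already completed exceptionally).
    public String getAppID(long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        return appIdFuture.get(timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        FailFastAppId status = new FailFastAppId();
        status.onChannelClosed(); // driver died before sending the app ID
        try {
            status.getAppID(30, TimeUnit.SECONDS);
            System.out.println("unexpected success");
        } catch (ExecutionException e) {
            // Returns right away; no 30-second wait.
            System.out.println("failed fast: " + e.getCause().getMessage());
        }
    }
}
```

The key design point is that a dead channel completes the future exceptionally rather than leaving it pending, so every blocked waiter is released at once instead of each independently running out its own timeout.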