[ https://issues.apache.org/jira/browse/HIVE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16605054#comment-16605054 ]
Brock Noland commented on HIVE-20506:
-------------------------------------
[~stakiar] - perfect, yes, you understand correctly. Hive on MR will just wait
forever for the job to be submitted. The reason is that Hive on MR just runs a
{{hadoop}} command and waits for it to return to decide whether the job
failed or succeeded. One MR job equates to one stage. HOS, however, starts one
Spark application per user session, so one Spark application can run N stages
or even N queries.
Thus to fix this, I think we need to make HOS wait for the Spark App to
actually start before the handshake timeout starts counting down.
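
To make the proposal concrete, here is a minimal sketch of the two-phase wait,
not the actual HOS code. Every name in it ({{DeferredHandshakeTimeout}},
{{appStarted}}, {{handshake}}, {{Outcome}}) is hypothetical and chosen for
illustration. It assumes HS2 can obtain some signal that the Spark application
has actually started, and how that signal would be produced is left open.

```java
import java.time.Duration;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; none of these names come from the Hive code base. */
final class DeferredHandshakeTimeout {

    enum Outcome { CONNECTED, APP_FAILED, HANDSHAKE_TIMEOUT, INTERRUPTED }

    /**
     * @param appStarted       completes once the Spark application is running
     *                         (assumed signal; completes exceptionally if the
     *                         submission fails)
     * @param handshake        completes once the remote side connects back
     * @param handshakeTimeout budget that starts only after appStarted completes
     */
    static Outcome await(CompletableFuture<Void> appStarted,
                         CompletableFuture<Void> handshake,
                         Duration handshakeTimeout) {
        try {
            // Phase 1: no deadline. A full cluster only delays this step,
            // mirroring how Hive on MR blocks on the hadoop command.
            appStarted.get();
        } catch (ExecutionException | CancellationException e) {
            return Outcome.APP_FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Outcome.INTERRUPTED;
        }
        try {
            // Phase 2: the handshake timeout starts counting down only now.
            handshake.get(handshakeTimeout.toMillis(), TimeUnit.MILLISECONDS);
            return Outcome.CONNECTED;
        } catch (TimeoutException e) {
            return Outcome.HANDSHAKE_TIMEOUT;
        } catch (ExecutionException | CancellationException e) {
            return Outcome.APP_FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Outcome.INTERRUPTED;
        }
    }
}
```

With this shape, time spent queued for cluster resources is not charged
against the handshake budget. A real change would probably still want some
upper bound or a cancellation path for phase 1 so an abandoned session does
not wait forever.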
> HOS times out when cluster is full while Hive-on-MR waits
> ---------------------------------------------------------
>
> Key: HIVE-20506
> URL: https://issues.apache.org/jira/browse/HIVE-20506
> Project: Hive
> Issue Type: Improvement
> Reporter: Brock Noland
> Priority: Major
>
> My understanding is as follows:
> When the cluster is full, Hive-on-MR will wait for resources to become
> available before submitting a job. This is because the {{hadoop jar}} command
> is the primary mechanism Hive uses to know whether a job completed or failed.
>
> Hive-on-Spark will time out after {{SPARK_RPC_CLIENT_CONNECT_TIMEOUT}} because
> the RPC client in the AppMaster doesn't connect back to the RPC Server in
> HS2.
> This is a behavior difference it'd be great to close.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)