[ https://issues.apache.org/jira/browse/HIVE-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545673#comment-16545673 ]

Sahil Takiar commented on HIVE-14162:
-------------------------------------

The high-level idea is to introduce a new config called
{{hive.spark.session.timeout}} with a default value of 30 minutes. If no
Spark job has been run in the past 30 minutes, the session gets closed. The
timeout logic is implemented inside {{SparkSessionImpl}}. I've added a basic
test called {{TestSparkSessionTimeout}}, which I plan to improve, along with
a few other enhancements. As Beluga Behr pointed out, the benefit is that we
reclaim resources from the HoS Driver. This is of particular concern for
users who don't actively close their sessions (e.g. Hue users).

> Allow disabling of long running job on Hive On Spark On YARN
> ------------------------------------------------------------
>
>                 Key: HIVE-14162
>                 URL: https://issues.apache.org/jira/browse/HIVE-14162
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Thomas Scott
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-14162.1.patch, HIVE-14162.2.patch
>
>
> Hive on Spark launches a long-running process on the first query to handle
> all queries for that user session. In some use cases this is not desired,
> for instance when using Hue with large intervals between query executions.
> Could we have a property that would cause long-running Spark jobs to be
> terminated after each query execution and started again for the next one?


