[ 
https://issues.apache.org/jira/browse/HIVE-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558366#comment-16558366
 ] 

Sahil Takiar commented on HIVE-14162:
-------------------------------------

[~naven084k] thanks for taking a look. A new Spark session is created for each 
Hive session, so yes, each Hive user has their own Spark session. Spark sessions 
are integrated with impersonation similar to HoMR: if {{hive.server2.enable.doAs}} 
is true, the Spark session is submitted to YARN as the end user.

I agree that having an extra thread per Spark session would add overhead. 
Instead, we can have a single thread that iterates over all current Spark 
sessions and checks whether each one needs to be closed, similar to how the 
regular Hive session timeout logic is implemented. I will post an updated patch 
soon.
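As a rough illustration of that single-thread approach (this is a hypothetical sketch, not the actual Hive patch: the {{SparkSession}} class, the registry map, and the timeout values below are all placeholders):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one shared scheduler thread periodically sweeps all
// registered Spark sessions and closes those idle longer than the timeout,
// instead of dedicating a watchdog thread to each session.
public class SparkSessionTimeoutChecker {

  // Placeholder stand-in for a Hive-on-Spark session, not the real class.
  static class SparkSession {
    final String id;
    volatile long lastAccessMs;
    volatile boolean open = true;

    SparkSession(String id, long lastAccessMs) {
      this.id = id;
      this.lastAccessMs = lastAccessMs;
    }

    void close() { open = false; }
  }

  private final Map<String, SparkSession> sessions = new ConcurrentHashMap<>();
  private final long timeoutMs;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  SparkSessionTimeoutChecker(long timeoutMs) { this.timeoutMs = timeoutMs; }

  void register(SparkSession s) { sessions.put(s.id, s); }

  int activeCount() { return sessions.size(); }

  // Start the single sweep thread; all sessions share it.
  void start(long checkIntervalMs) {
    scheduler.scheduleAtFixedRate(this::closeExpired,
        checkIntervalMs, checkIntervalMs, TimeUnit.MILLISECONDS);
  }

  // Close and drop every session whose idle time exceeds the timeout.
  void closeExpired() {
    long now = System.currentTimeMillis();
    sessions.values().removeIf(s -> {
      if (now - s.lastAccessMs > timeoutMs) {
        s.close();
        return true;
      }
      return false;
    });
  }
}
```

The sweep runs in O(number of sessions) per tick, so the cost stays bounded by one thread regardless of how many Hive sessions are open.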

> Allow disabling of long running job on Hive On Spark On YARN
> ------------------------------------------------------------
>
>                 Key: HIVE-14162
>                 URL: https://issues.apache.org/jira/browse/HIVE-14162
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Thomas Scott
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-14162.1.patch, HIVE-14162.2.patch, 
> HIVE-14162.3.patch
>
>
> Hive On Spark launches a long running process on the first query to handle 
> all queries for that user session. In some use cases this is not desired, for 
> instance when using Hue with large intervals between query executions.
> Could we have a property that would cause long running spark jobs to be 
> terminated after each query execution and started again for the next one?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
