[
https://issues.apache.org/jira/browse/HIVE-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558366#comment-16558366
]
Sahil Takiar commented on HIVE-14162:
-------------------------------------
[~naven084k] thanks for taking a look. A new Spark session is created for each
Hive session, so yes, each Hive user has their own Spark session. Spark sessions
are integrated with impersonation similarly to Hive-on-MR. So if
{{hive.server2.enable.doAs}} is true, the Spark application is submitted to YARN
as the end user.
I agree that having an extra thread per Spark session would introduce increased
overhead. Instead, we can have a single thread that iterates over all the
current Spark sessions and checks whether each one needs to be closed, similar
to how the regular Hive session timeout logic is implemented. I will post an
updated patch soon.
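The single-reaper-thread idea could be sketched roughly as below. This is an illustrative mock, not Hive's actual code: the {{SparkSession}} class, its {{lastAccessTimeMs}} field, and the {{SparkSessionReaper}} name are all hypothetical stand-ins for whatever the patch ends up using.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SparkSessionReaper {

    // Hypothetical stand-in for a Spark session tracked by HiveServer2.
    static class SparkSession {
        final String sessionId;
        volatile long lastAccessTimeMs;
        volatile boolean open = true;

        SparkSession(String sessionId, long lastAccessTimeMs) {
            this.sessionId = sessionId;
            this.lastAccessTimeMs = lastAccessTimeMs;
        }

        void close() { open = false; }
    }

    private final Map<String, SparkSession> sessions = new ConcurrentHashMap<>();
    private final long timeoutMs;

    SparkSessionReaper(long timeoutMs) { this.timeoutMs = timeoutMs; }

    void register(SparkSession session) {
        sessions.put(session.sessionId, session);
    }

    // One sweep over all live sessions: close and drop any session that
    // has been idle for at least the configured timeout.
    void sweep(long nowMs) {
        sessions.values().removeIf(s -> {
            if (nowMs - s.lastAccessTimeMs >= timeoutMs) {
                s.close();
                return true;
            }
            return false;
        });
    }

    int liveCount() { return sessions.size(); }

    // A single shared scheduled thread runs the sweep periodically,
    // instead of one timeout thread per Spark session.
    ScheduledExecutorService start(long periodMs) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        exec.scheduleAtFixedRate(() -> sweep(System.currentTimeMillis()),
                periodMs, periodMs, TimeUnit.MILLISECONDS);
        return exec;
    }
}
```

The point of the design is that the cost is one thread total, with each sweep doing O(number of live sessions) work, which mirrors how the regular HiveServer2 session timeout check is structured.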
> Allow disabling of long running job on Hive On Spark On YARN
> ------------------------------------------------------------
>
> Key: HIVE-14162
> URL: https://issues.apache.org/jira/browse/HIVE-14162
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Thomas Scott
> Assignee: Sahil Takiar
> Priority: Major
> Attachments: HIVE-14162.1.patch, HIVE-14162.2.patch,
> HIVE-14162.3.patch
>
>
> Hive On Spark launches a long-running process on the first query to handle
> all queries for that user session. In some use cases this is not desired, for
> instance when using Hue with large intervals between query executions.
> Could we have a property that would cause long-running Spark jobs to be
> terminated after each query execution and started again for the next one?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)