[
https://issues.apache.org/jira/browse/HIVE-21096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adam Szita reassigned HIVE-21096:
---------------------------------
> Remove unnecessary Spark dependency from HS2 process
> ----------------------------------------------------
>
> Key: HIVE-21096
> URL: https://issues.apache.org/jira/browse/HIVE-21096
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Spark
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
>
> When a HiveOnSpark job is kicked off most of the work is done by the
> RemoteDriver, which is a separate process. There a couple of smaller parts of
> code, where HS2 process depends on Spark jars, these for example include
> receiving stats from the driver or putting together a Spark conf object -
> used mostly during communication with RemoteDriver.
> We can limit the data types used for such communication so that we don't use
> (and serialize) types that are in Spark codebase, and hence we can refactor
> our code to only use Spark jars in the Remote Driver process.
> I think this way would be cleaner from dependencies point of view, and also
> less erroneous when users have to compile the classpath for their HS2
> processes.
> (E.g. due to a change between Spark 2.2 and 2.4 we had to also include
> spark-unsafe*.jar - though it's an internal change to Spark..)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)