[ 
https://issues.apache.org/jira/browse/SPARK-6906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-6906:
-------------------------------
    Issue Type: Epic  (was: Story)

> Improve Hive integration support
> --------------------------------
>
>                 Key: SPARK-6906
>                 URL: https://issues.apache.org/jira/browse/SPARK-6906
>             Project: Spark
>          Issue Type: Epic
>          Components: SQL
>            Reporter: Michael Armbrust
>            Assignee: Michael Armbrust
>            Priority: Blocker
>
> Right now Spark SQL is tightly coupled to a specific version of Hive for two 
> primary reasons:
>  - Metadata: we use the Hive Metastore client to retrieve information about 
> tables in a metastore.
>  - Execution: UDFs, UDAFs, SerDes, HiveConf and various helper functions for 
> configuration.
> Since Hive is generally not compatible across versions, we currently 
> maintain fairly expensive shim layers to let us talk to both Hive 12 and Hive 
> 13 metastores.  Ideally we would be able to talk to more versions of Hive 
> with less maintenance burden.
> This task proposes that we separate the Hive version that is used for 
> communicating with the metastore from the version that is used for execution. 
> In doing so we can significantly reduce the size of the shim by only 
> providing compatibility for metadata operations.  All execution will be done 
> with a single version of Hive (the newest version that is supported by Spark 
> SQL).
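
The separation proposed above could be sketched roughly as follows. This is an illustration only, not Spark's implementation: every name in it (TableInfo, MetastoreClient, Hive12Client, Hive13Client, METASTORE_SHIMS, EXECUTION_HIVE_VERSION, connect) is hypothetical, and the shims return canned data instead of calling a real Hive metastore. The point it tries to show is that only a narrow, metadata-only interface needs one implementation per Hive version, while execution is pinned to a single version.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Version-neutral table description handed to the execution side."""
    database: str
    name: str
    columns: tuple  # tuple of (column name, column type) pairs


class MetastoreClient(ABC):
    """Hypothetical metadata-only interface: the only surface that needs a shim."""

    @abstractmethod
    def get_table(self, database: str, name: str) -> TableInfo:
        ...


class Hive12Client(MetastoreClient):
    def get_table(self, database, name):
        # A real shim would call the Hive 12 metastore client here;
        # this stand-in returns fixed data so the sketch stays runnable.
        return TableInfo(database, name, (("key", "int"), ("value", "string")))


class Hive13Client(MetastoreClient):
    def get_table(self, database, name):
        # Same contract, different (assumed) client library underneath.
        return TableInfo(database, name, (("key", "int"), ("value", "string")))


# One small shim per supported metastore version (assumed registry).
METASTORE_SHIMS = {"12": Hive12Client, "13": Hive13Client}

# UDFs, SerDes and the rest of execution use one version only,
# regardless of which metastore version is being talked to.
EXECUTION_HIVE_VERSION = "13"


def connect(metastore_version: str) -> MetastoreClient:
    """Pick the metadata shim for the requested metastore version."""
    try:
        return METASTORE_SHIMS[metastore_version]()
    except KeyError:
        raise ValueError(
            f"unsupported metastore version: {metastore_version}"
        ) from None
```

Under these assumptions, supporting another metastore version means adding one more small class to the registry, and the execution side only ever sees the version-neutral TableInfo.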



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
