Michael Armbrust created SPARK-6906:
---------------------------------------

             Summary: Refactor Connection to Hive Metastore
                 Key: SPARK-6906
                 URL: https://issues.apache.org/jira/browse/SPARK-6906
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Michael Armbrust
            Assignee: Michael Armbrust
            Priority: Blocker


Right now Spark SQL is tightly coupled to a specific version of Hive for two 
primary reasons:
 - Metadata: we use the Hive Metastore client to retrieve information about 
tables in a metastore.
 - Execution: we rely on Hive's UDFs, UDAFs, SerDes, HiveConf and various 
helper functions for configuration.

Since Hive is generally not compatible across versions, we currently 
maintain fairly expensive shim layers to let us talk to both Hive 12 and Hive 
13 metastores.  Ideally we would be able to talk to more versions of Hive with 
less maintenance burden.

This task proposes that we separate the Hive version that is used for 
communicating with the metastore from the version that is used for execution.  
In doing so we can significantly reduce the size of the shim by only providing 
compatibility for metadata operations.  All execution will be done with a 
single version of Hive (the newest version that is supported by Spark SQL).
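
As a rough illustration of the proposed split, here is a minimal, 
language-neutral sketch written in Python.  It is not Spark code: every name 
in it (TableInfo, MetastoreClient, Hive12Client, Hive13Client, describe) is 
hypothetical, and the two "raw" metadata layouts are invented purely to stand 
in for cross-version differences.  The point is only that per-version code 
sits behind a small metadata interface, while execution-side code sees one 
version-independent type.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Version-independent table description (hypothetical type)."""
    database: str
    name: str
    columns: tuple  # tuple of (column name, type name) pairs


class MetastoreClient(ABC):
    """The only surface that needs per-version compatibility code."""

    @abstractmethod
    def get_table(self, database: str, name: str) -> TableInfo:
        ...


class Hive12Client(MetastoreClient):
    """Stand-in for a Hive 12 client; the raw layout is invented."""

    def __init__(self, raw_tables):
        self._raw = raw_tables

    def get_table(self, database, name):
        raw = self._raw[(database, name)]
        # Assumed layout: columns encoded as "name:type" strings.
        cols = tuple(tuple(c.split(":", 1)) for c in raw["cols"])
        return TableInfo(database, name, cols)


class Hive13Client(MetastoreClient):
    """Stand-in for a Hive 13 client; the raw layout is invented."""

    def __init__(self, raw_tables):
        self._raw = raw_tables

    def get_table(self, database, name):
        raw = self._raw[(database, name)]
        # Assumed layout: columns encoded as dicts.
        cols = tuple((c["name"], c["type"]) for c in raw["schema"])
        return TableInfo(database, name, cols)


def describe(client: MetastoreClient, database: str, name: str) -> str:
    """Execution-side code: depends on TableInfo only, never on the
    metastore version behind the client."""
    table = client.get_table(database, name)
    return ", ".join(f"{col} {typ}" for col, typ in table.columns)


if __name__ == "__main__":
    old = Hive12Client({("default", "t"): {"cols": ["id:int", "name:string"]}})
    new = Hive13Client({("default", "t"): {"schema": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
    ]}})
    # Both clients yield the same description for the same logical table.
    print(describe(old, "default", "t"))
    print(describe(new, "default", "t"))
```

Under this arrangement, supporting another metastore version would mean 
adding one more client class, with no change to the execution-side code.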



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

