GitHub user marmbrus opened a pull request:

    https://github.com/apache/spark/pull/5851

    [SPARK-6907][SQL] Isolated client for HiveMetastore

    This PR adds initial support for loading multiple versions of Hive in a
    single JVM and provides a common interface for extracting metadata from
    the `HiveMetastoreClient` for a given version. This is accomplished by
    creating an isolated `ClassLoader` that operates according to the
    following rules:
    
     - __Shared Classes__: Java, Scala, logging, and Spark classes are
       delegated to `baseClassLoader`, allowing the results of calls to the
       `ClientInterface` to be visible externally.
     - __Hive Classes__: New instances are loaded from `execJars`. These
       classes are not accessible externally due to their custom loading.
     - __Barrier Classes__: Classes such as `ClientWrapper` are defined in
       Spark but must link to a specific version of Hive. As a result, the
       bytecode is acquired from the Spark `ClassLoader`, but a new copy is
       created for each instance of `IsolatedClientLoader`. This new copy
       can see a specific version of Hive without using reflection wherever
       Hive is consistent across versions. Since it is a unique instance,
       it is not visible externally other than as a generic
       `ClientInterface`, unless `isolationOn` is set to `false`.
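
The three rules above can be sketched as a small class loader. This is only an illustration written in Java against the standard `ClassLoader` API, not the code in this PR: the class name `IsolatedLoaderSketch`, the `Kind` enum, the prefix lists, and the assumed package of `ClientWrapper` are hypothetical, and the `isolationOn` switch is not modeled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of the shared / Hive / barrier rules; not the PR's implementation.
class IsolatedLoaderSketch extends URLClassLoader {
    enum Kind { SHARED, BARRIER, HIVE }

    // Assumed location of the barrier class; the PR only names `ClientWrapper`.
    static final String BARRIER_PREFIX = "org.apache.spark.sql.hive.client.ClientWrapper";
    // Simplified stand-in for "Java, Scala, logging, and Spark classes".
    static final String[] SHARED_PREFIXES = {
        "java.", "javax.", "scala.", "org.slf4j.", "org.apache.log4j.", "org.apache.spark."
    };

    private final ClassLoader baseClassLoader;

    IsolatedLoaderSketch(URL[] execJars, ClassLoader baseClassLoader) {
        super(execJars, null); // no parent: nothing leaks in except through the rules below
        this.baseClassLoader = baseClassLoader;
    }

    // Decide which rule applies. The barrier check comes first because barrier
    // classes also live under the Spark package prefix.
    static Kind classify(String name) {
        if (name.startsWith(BARRIER_PREFIX)) return Kind.BARRIER;
        for (String prefix : SHARED_PREFIXES) {
            if (name.startsWith(prefix)) return Kind.SHARED;
        }
        return Kind.HIVE;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                switch (classify(name)) {
                    case SHARED:  c = baseClassLoader.loadClass(name); break; // same Class object as outside
                    case BARRIER: c = defineFromBase(name); break;            // fresh copy per loader
                    default:      c = findClass(name);                        // only from execJars
                }
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }

    // Read the bytecode through the base loader, but define the class in this
    // loader so that it links against the Hive classes this loader sees.
    private Class<?> defineFromBase(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = baseClassLoader.getResourceAsStream(resource)) {
            if (in == null) throw new ClassNotFoundException(name);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n = in.read(buf); n != -1; n = in.read(buf)) out.write(buf, 0, n);
            byte[] bytes = out.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Under these assumptions, two loaders built over different `execJars` would each define their own copy of the barrier class, while a shared class such as `java.lang.String` resolves to the same `Class` object in both.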
    
    In addition to the unit tests, I have also tested this locally against
    MySQL instances of the Hive Metastore. I've also successfully ported
    Spark SQL to run with this client, but due to the size of the changes,
    that will come in a follow-up PR.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marmbrus/spark isolatedClient

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5851.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5851
    
----
commit fc31864be5e232d4cd7c9551fe22a9b0e3c36a00
Author: Michael Armbrust <[email protected]>
Date:   2015-05-01T05:47:15Z

    [SPARK-6907][SQL] Isolated client for HiveMetastore

----

