GitHub user marmbrus opened a pull request:
https://github.com/apache/spark/pull/5851
[SPARK-6907][SQL] Isolated client for HiveMetastore
This PR adds initial support for loading multiple versions of Hive in a
single JVM and provides a common interface for extracting metadata from the
`HiveMetastoreClient` for a given version. This is accomplished by creating an
isolated `ClassLoader` that operates according to the following rules:
- __Shared Classes__: Java, Scala, logging, and Spark classes are delegated
to `baseClassLoader`, allowing the results of calls to the `ClientInterface`
to be visible externally.
- __Hive Classes__: new instances are loaded from `execJars`. These classes
are not accessible externally due to their custom loading.
- __Barrier Classes__: Classes such as `ClientWrapper` are defined in
Spark but must link to a specific version of Hive. As a result, the bytecode
is acquired from the Spark `ClassLoader` but a new copy is created for each
instance of `IsolatedClientLoader`.
This new instance is able to see a specific version of Hive without using
reflection wherever Hive is consistent across versions. Since this is a
unique instance, it is not visible externally other than as a generic
`ClientInterface`, unless `isolationOn` is set to `false`.
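For readers unfamiliar with this kind of class loading, the three rules above can be sketched roughly as follows. This is a minimal illustration in plain Java, not the code from this pull request: the class name `IsolatedLoaderSketch`, the `classify` helper, the package prefixes, and the fully qualified name assumed for `ClientWrapper` are all hypothetical choices made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch of the three loading rules; not the PR's implementation. */
final class IsolatedLoaderSketch extends URLClassLoader {
    enum Kind { SHARED, BARRIER, HIVE }

    // Assumption: fully qualified name of the barrier class named in the description.
    static final String BARRIER_PREFIX = "org.apache.spark.sql.hive.client.ClientWrapper";

    // Assumption: illustrative prefixes for "Java, Scala, logging, and Spark classes".
    static final String[] SHARED_PREFIXES = {
        "java.", "javax.", "scala.", "org.slf4j.", "org.apache.log4j.", "org.apache.spark."
    };

    private final ClassLoader baseClassLoader;

    IsolatedLoaderSketch(URL[] execJars, ClassLoader baseClassLoader) {
        // A null parent disables the usual parent-first lookup, so the
        // rules in loadClass decide where every class comes from.
        super(execJars, null);
        this.baseClassLoader = baseClassLoader;
    }

    /** Decides which rule applies. Barrier is checked first because it also matches a Spark prefix. */
    static Kind classify(String name) {
        if (name.startsWith(BARRIER_PREFIX)) {
            return Kind.BARRIER;
        }
        for (String prefix : SHARED_PREFIXES) {
            if (name.startsWith(prefix)) {
                return Kind.SHARED;
            }
        }
        return Kind.HIVE;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                switch (classify(name)) {
                    case SHARED:
                        // Same Class object as the outside world, so results are visible externally.
                        loaded = baseClassLoader.loadClass(name);
                        break;
                    case BARRIER:
                        // Bytecode comes from the base loader, but is defined again in this loader.
                        loaded = defineFromBase(name);
                        break;
                    default:
                        // Everything else must come from the supplied jars.
                        loaded = findClass(name);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Reads the class file through the base loader and defines a fresh copy here. */
    private Class<?> defineFromBase(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = baseClassLoader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            byte[] bytes = out.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Because each loader instance defines its own copy of the barrier class, two loaders built over different sets of jars would each link that class against their own Hive version, while callers outside only ever hold the shared interface type.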
In addition to the unit tests, I have tested this locally against MySQL
instances of the Hive Metastore. I've also successfully ported Spark SQL to
run with this client, but due to the size of the changes, that will come in
a follow-up PR.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/marmbrus/spark isolatedClient
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/5851.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5851
----
commit fc31864be5e232d4cd7c9551fe22a9b0e3c36a00
Author: Michael Armbrust <[email protected]>
Date: 2015-05-01T05:47:15Z
[SPARK-6907][SQL] Isolated client for HiveMetastore
----