GitHub user liancheng opened a pull request:
https://github.com/apache/spark/pull/7929
[SPARK-9593] [SQL] Fixes Hadoop shims loading
This PR works around CDH Hadoop versions like 2.0.0-mr1-cdh4.1.1.
Internally, Hive `ShimLoader` tries to load different versions of Hadoop
shims by checking version information gathered from Hadoop jar files. If the
major version number is 1, `Hadoop20SShims` will be loaded. Otherwise, if the
major version number is 2, `Hadoop23Shims` will be chosen. However, CDH Hadoop
versions like 2.0.0-mr1-cdh4.1.1 have 2 as the major version number but contain
Hadoop 1 code. This confuses Hive `ShimLoader`, which then loads the wrong
version of the shims.
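The mismatch described above can be illustrated with a small sketch. This is a simplified stand-in written for this description, not Hive's actual `ShimLoader` code; the class name `VersionBasedShimChoice` and the method `shimForVersion` are hypothetical, and the version string `1.2.1` is only an example of a Hadoop 1 release.

```java
final class VersionBasedShimChoice {
    // Hypothetical helper: picks a shim class name from the major version only,
    // mirroring the behaviour described above (1 -> Hadoop20SShims, 2 -> Hadoop23Shims).
    static String shimForVersion(String hadoopVersion) {
        // "2.0.0-mr1-cdh4.1.1" splits into ["2", "0", "0-mr1-cdh4", "1", "1"].
        String[] parts = hadoopVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unrecognized Hadoop version: " + hadoopVersion);
        }
        switch (Integer.parseInt(parts[0])) {
            case 1:
                return "Hadoop20SShims";
            case 2:
                return "Hadoop23Shims";
            default:
                throw new IllegalArgumentException("Unsupported major version: " + hadoopVersion);
        }
    }

    public static void main(String[] args) {
        System.out.println(shimForVersion("1.2.1"));
        // The CDH build carries Hadoop 1 code, yet the major version alone selects Hadoop23Shims.
        System.out.println(shimForVersion("2.0.0-mr1-cdh4.1.1"));
    }
}
```

Because only the leading `2` is inspected, the `mr1-cdh4` suffix that signals Hadoop 1 code never influences the choice.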
In this PR we check for the existence of the
`Path.getPathWithoutSchemeAndAuthority` method, which doesn't exist in Hadoop 1
(it's also the method that reveals this shims loading issue), and load
`Hadoop20SShims` when the method is missing.
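A minimal sketch of such a method-existence check, using plain Java reflection, might look like the following. It is an illustration of the idea rather than the code in this PR: `MethodProbeShimChoice`, `hasPublicMethod`, and `chooseShims` are hypothetical names, and treating an unloadable class as "method absent" is an assumption made here so the example runs without Hadoop on the classpath.

```java
import java.lang.reflect.Method;

final class MethodProbeShimChoice {
    // Returns true if the class can be loaded and exposes a public method with this name.
    // Assumption: a class that cannot be loaded is treated as not having the method.
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Chooses the shim class name by probing for the method instead of trusting the version string.
    static String chooseShims(String pathClassName) {
        return hasPublicMethod(pathClassName, "getPathWithoutSchemeAndAuthority")
                ? "Hadoop23Shims"
                : "Hadoop20SShims";
    }

    public static void main(String[] args) {
        // With a Hadoop jar on the classpath, the result depends on which Path class is present.
        System.out.println(chooseShims("org.apache.hadoop.fs.Path"));
    }
}
```

Probing for the method sidesteps the version string entirely, so a CDH build that reports major version 2 but ships Hadoop 1 classes still ends up with `Hadoop20SShims`.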
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/liancheng/spark spark-9593/fix-hadoop-shims-loading
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7929.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7929
----
commit 077f5f2f825194a2eb01f72f5a9b3136bedcea17
Author: Cheng Lian <[email protected]>
Date: 2015-08-04T10:07:20Z
Fixes Hadoop shims loading
----