[
https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063380#comment-14063380
]
Sean Owen commented on SPARK-2420:
----------------------------------
I think Jetty is the only real issue here. Guava won't be an issue
if it's pinned to 11 in Spark, or downstream. (Spark does not use anything from
Guava 12+; otherwise it wouldn't work in some Hadoop contexts.)
Servlet 3.0 really truly does work for all of this. The trick is actually
removing all the other copies of Servlet 2.5!
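One way to hunt down the stray Servlet copies is to scan the jars on the classpath for the class itself. This is only a rough sketch: the function name `find_class_copies` and the directory layout are made up for illustration, and it inspects jar contents only, so it cannot tell you which version each copy is.

```python
import zipfile
from pathlib import Path


def find_class_copies(jar_dir, class_entry="javax/servlet/Servlet.class"):
    """Return the sorted file names of jars under jar_dir that contain class_entry.

    Hypothetical helper, not part of Spark or Hive. class_entry is the path of
    the class file inside the jar, using forward slashes.
    """
    hits = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that have a .jar suffix but are not valid archives.
            continue
    return hits
```

If more than one jar comes back, every jar but the one providing Servlet 3.0 is a candidate for exclusion.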
However, Jetty versions could be a real stumbling block. I'd like to focus on
that. There are basically two namespaces that Jetty uses, from its older
incarnation and newer versions. You have to harmonize both. What does Hive need
vs what's in Spark?
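To answer that question concretely, it may help to survey which Jetty namespace each jar ships. The sketch below assumes the two namespaces are `org.mortbay.jetty` (Jetty 6 and earlier) and `org.eclipse.jetty` (Jetty 7 and later); the function names and labels are hypothetical, and relocated (shaded) copies under other prefixes would not be detected.

```python
import zipfile
from pathlib import Path

# Assumed package prefixes for the two Jetty lineages (as paths inside a jar).
JETTY_NAMESPACES = {
    "old": "org/mortbay/jetty/",
    "new": "org/eclipse/jetty/",
}


def jetty_namespaces_in(jar_path):
    """Return the set of namespace labels ("old", "new") found in one jar."""
    found = set()
    with zipfile.ZipFile(jar_path) as zf:
        for name in zf.namelist():
            for label, prefix in JETTY_NAMESPACES.items():
                if name.startswith(prefix):
                    found.add(label)
    return found


def survey(jar_dir):
    """Map each jar file name under jar_dir to the Jetty namespaces it contains.

    Jars that contain no Jetty classes are left out of the report.
    """
    report = {}
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        labels = jetty_namespaces_in(jar)
        if labels:
            report[jar.name] = labels
    return report
```

Running `survey` over the Hive lib directory and over the Spark assembly would show where the two sides disagree.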
The reason I have some hope, Xuefu, is that Spark already works in an
assembly with Hive classes. However, it may not be quite the same version, and
I know that in one case hive-exec had to be shaded because it is not available as a
non-assembly jar. There are some devils in the details.
Having seen a lot of this first-hand here, I can try to help. Could you
elaborate on the specific problems you ran into?
> Change Spark build to minimize library conflicts
> ------------------------------------------------
>
> Key: SPARK-2420
> URL: https://issues.apache.org/jira/browse/SPARK-2420
> Project: Spark
> Issue Type: Wish
> Components: Build
> Affects Versions: 1.0.0
> Reporter: Xuefu Zhang
> Attachments: spark_1.0.0.patch
>
>
> During the prototyping of HIVE-7292, many library conflicts showed up because
> the Spark build contains versions of libraries that are vastly different from
> those in the current major Hadoop version. It would be nice if we could choose
> versions that are in line with Hadoop, or shade them in the assembly. Here is
> the wish list:
> 1. Upgrade protobuf version to 2.5.0 from current 2.4.1
> 2. Shade Spark's jetty and servlet dependencies in the assembly.
> 3. Resolve the guava version difference. Spark is using a higher version. I'm
> not sure what the best solution for this is.
> The list may grow as HIVE-7292 proceeds.
> For information only, attached is the patch that we applied to Spark in
> order to make Spark work with Hive. It gives an idea of the scope of changes.