[
https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074596#comment-14074596
]
Marcelo Vanzin commented on SPARK-2420:
---------------------------------------
After some brainstorming, the path of least resistance seems to be downgrading
Guava to match Hadoop's version. Guava 14 and 11 are reasonably compatible, and
as Sean's patches show, not a lot of changes are needed in Spark.
This does mean that people who rely on Spark's Guava dependency could run into
problems. They would have to modify their builds to explicitly depend on the
Guava version they need, and either:
* use {{spark.files.userClassPathFirst}} when submitting their apps (making
sure the Guava they need is packaged with the app or provided as a separate
jar)
* shade their version of Guava in their app
The latter means they won't override Spark's version of Guava at runtime;
overriding it could cause weird bugs to show up. Alternatively, if neither
option is deemed acceptable, we could build something like MAPREDUCE-1700 in
Spark to isolate the application's classpath from Spark's, so that some of
these conflicts are avoided.
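To make the classpath isolation idea concrete, here is a minimal sketch of a
child-first class loader: it looks in the application's own jars before
delegating to the parent, which is the general behavior that a "user classpath
first" setting implies. The class name {{ChildFirstClassLoader}} is
hypothetical, and this is not the code used by Spark or by MAPREDUCE-1700; it
only illustrates the lookup order.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first loader (illustration only, not Spark's code).
 * It searches its own URLs first and falls back to the parent on a miss.
 */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core JDK classes always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: look in the application's jars.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not in the application's jars: ask the parent.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // With no URLs of its own, every lookup falls through to the parent.
        try (ChildFirstClassLoader loader = new ChildFirstClassLoader(
                new URL[0], ClassLoader.getSystemClassLoader())) {
            Class<?> c = loader.loadClass("java.util.ArrayList");
            System.out.println(c.getName() + " loaded by " + c.getClassLoader());
        }
    }
}
```

If such a loader were given the URL of an application jar that bundles its own
Guava, the {{com.google.common}} classes would resolve from that jar rather
than from the parent's copy. The trade-off is that the same class can then
exist twice in one JVM, so objects passed across the boundary may fail casts.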
How do people feel about this approach?
> Change Spark build to minimize library conflicts
> ------------------------------------------------
>
> Key: SPARK-2420
> URL: https://issues.apache.org/jira/browse/SPARK-2420
> Project: Spark
> Issue Type: Wish
> Components: Build
> Affects Versions: 1.0.0
> Reporter: Xuefu Zhang
> Attachments: spark_1.0.0.patch
>
>
> During the prototyping of HIVE-7292, many library conflicts showed up because
> the Spark build contains versions of libraries that are vastly different from
> those in the current major Hadoop version. It would be nice if we could choose
> versions that are in line with Hadoop, or shade them in the assembly. Here is
> the wish list:
> 1. Upgrade protobuf to 2.5.0 from the current 2.4.1.
> 2. Shade Spark's jetty and servlet dependencies in the assembly.
> 3. Guava version difference: Spark is using a higher version. I'm not sure
> what the best solution is for this.
> The list may grow as HIVE-7292 proceeds.
> For information only, attached is the patch that we applied to Spark in
> order to make Spark work with Hive. It gives an idea of the scope of changes.