[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

Marcelo Vanzin (JIRA) Fri, 25 Jul 2014 10:14:35 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074596#comment-14074596
 ]


Marcelo Vanzin commented on SPARK-2420:
---------------------------------------

After some brainstorming, the path of least resistance seems to be downgrading 
Guava to match Hadoop's version. Guava 14 and 11 are reasonably compatible, and 
as Sean's patches show, not a lot of changes are needed in Spark.

This does mean that it's possible for people who are depending on Spark's Guava 
dependency to run into problems. These people would have to modify their builds 
to explicitly depend on the Guava they need, and either:

* use {{spark.files.userClassPathFirst}} when submitting their apps (and make 
sure the Guava they need is packaged with their app or provided as a separate 
jar)
* shade their version of Guava in their app

The latter means their app won't override Spark's version of Guava at runtime; 
overriding it could cause weird bugs to show up. Alternatively, if neither 
option is deemed acceptable, we could build something like MAPREDUCE-1700 in 
Spark to isolate the application's classpath from Spark's, so that some of 
these conflicts are avoided.
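
For illustration only: one common way to isolate an application's classpath is 
a child-first class loader, which looks in the application's own jars before 
delegating to the framework's loader. The sketch below is a minimal version 
under that assumption. The class name {{ChildFirstClassLoader}} is 
hypothetical, and this is not the actual MAPREDUCE-1700 or Spark 
implementation.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first class loader (illustrative sketch only).
 * Classes are looked up in the application's jars first; the parent
 * (framework) loader is used only as a fallback.
 */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] appJars, ClassLoader parent) {
        super(appJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core JDK classes must always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: try the application's own jars.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not in the app jars: fall back to normal delegation.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With a loader like this, an application that packages its own Guava would get 
its copy of {{com.google.common}} classes, while classes it does not package 
would still resolve through the parent loader.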

How do people feel about this approach?

> Change Spark build to minimize library conflicts
> ------------------------------------------------
>
>                 Key: SPARK-2420
>                 URL: https://issues.apache.org/jira/browse/SPARK-2420
>             Project: Spark
>          Issue Type: Wish
>          Components: Build
>    Affects Versions: 1.0.0
>            Reporter: Xuefu Zhang
>         Attachments: spark_1.0.0.patch
>
>
> During the prototyping of HIVE-7292, many library conflicts showed up because 
> the Spark build contains versions of libraries that are vastly different from 
> those in the current major Hadoop version. It would be nice if we could choose 
> versions that are in line with Hadoop, or shade them in the assembly. Here is 
> the wish list:
> 1. Upgrade the protobuf version to 2.5.0 from the current 2.4.1.
> 2. Shade Spark's jetty and servlet dependencies in the assembly.
> 3. Resolve the guava version difference. Spark is using a higher version. I'm 
> not sure what the best solution for this is.
> The list may grow as HIVE-7292 proceeds.
> For information only, attached is a patch that we applied to Spark in 
> order to make Spark work with Hive. It gives an idea of the scope of changes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
