[
https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072259#comment-14072259
]
Marcelo Vanzin commented on SPARK-2420:
---------------------------------------
Hi Sean,
I agree in part about the brokenness of such apps. Only in part, because I
think they're broken because Spark makes it easy for them to be. Also, I think
you slightly misread what I wrote, so let me explain.
For applications that, let's say, depend on Guava 17, nothing will change with
your patch. Such applications already need an explicit dependency on that
particular version to build, and need runtime options like
{{spark.files.userClassPathFirst}} to be set for things to work.
But for applications that depend on the same version of Guava that Spark
bundles, none of that is true. They get the dependency transitively, the
class files are available at runtime in the Spark jar, and it's the right
version (since Spark needs to make sure its Guava classes come before Hadoop's
version in the classpath, otherwise Spark itself might not work). So if you
downgrade the Guava version bundled with Spark, you might break those
applications. Yes, technically they're already "broken", because Guava is not
part of Spark, but it's a very easy mistake to make.
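To make the point about where the classes come from concrete, here is a minimal sketch (not part of any patch) that an application could use to check which jar a class was actually loaded from at runtime. The class name {{WhereIsClass}} and the helper {{locationOf}} are made up for illustration, and {{com.google.common.base.Optional}} is just one Guava class picked as a default; it uses only standard JDK calls.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic: reports the location (typically a jar URL)
// from which a given class was loaded.
class WhereIsClass {

    static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            if (src == null || src.getLocation() == null) {
                return "<bootstrap or unknown>";
            }
            URL location = src.getLocation();
            return location.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing or could not be linked.
            return "<not on classpath>";
        }
    }

    public static void main(String[] args) {
        // Default to a Guava class; pass another class name as the first argument.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Optional";
        System.out.println(name + " -> " + locationOf(name));
    }
}
```

If an application that never declared Guava explicitly sees the Spark assembly jar reported here, it is in the second category described above: it is relying on the version Spark happens to bundle.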
> Change Spark build to minimize library conflicts
> ------------------------------------------------
>
> Key: SPARK-2420
> URL: https://issues.apache.org/jira/browse/SPARK-2420
> Project: Spark
> Issue Type: Wish
> Components: Build
> Affects Versions: 1.0.0
> Reporter: Xuefu Zhang
> Attachments: spark_1.0.0.patch
>
>
> During the prototyping of HIVE-7292, many library conflicts showed up because
> the Spark build contains versions of libraries that are vastly different from
> those in the current major Hadoop version. It would be nice if we could choose
> versions that are in line with Hadoop or shade them in the assembly. Here is
> the wish list:
> 1. Upgrade protobuf from the current 2.4.1 to 2.5.0.
> 2. Shade Spark's jetty and servlet dependencies in the assembly.
> 3. Resolve the guava version difference. Spark is using a higher version. I'm
> not sure what the best solution for this is.
> The list may grow as HIVE-7292 proceeds.
> For information only, attached is a patch that we applied to Spark in
> order to make Spark work with Hive. It gives an idea of the scope of changes.
--
This message was sent by Atlassian JIRA
(v6.2#6252)