GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/12171
[SPARK-14399] Remove unnecessary excludes from POMs and simplify Hive POM
This patch aims to simplify our build by removing a number of unnecessary
excludes from the build. The individual commit messages describe the changes
here in more detail, but there are a few key themes:
- **Remove many excludes in the root POM:** According to the [Maven
documentation](https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management),
"dependency management takes precedence over dependency mediation". In many
cases, we were excluding transitive dependencies even though we also used those
dependencies in `spark-core` and pinned their versions in the
`dependencyManagement` section of our POM. Because the pinned version wins
regardless of which transitive path supplies the dependency, those excludes
were redundant.
As a rough guideline, I think we should only exclude dependencies
under the following scenarios:
- We don't want the dependency at all, in which case we should define a
Maven Enforcer rule to ban it.
- We want to include _some_ version of the dependency but don't want to
explicitly specify a fixed version. For instance, imagine that libraries A and
B depend on some dependency C. We might exclude C from library B so that
library A's version of C becomes the final effective version.
- We want to include the dependency only when certain profiles are
enabled, so we ban it from all of our transitive dependencies and explicitly
re-add it in a profile.
- **Remove explicitly-promoted transitive dependencies from Hive POM:** in
an early revision of #7191, it looks like we tried to use the `core`-classified
version of `hive-exec`, which is published as a thin JAR that does not bundle
third-party dependencies' classes. However, both the regular and classified
versions of the dependency share the same effective POM, which, in this case,
happens to be the dependency-reduced-POM produced by the Maven shade plugin. As
a result, users of the `core`-classified JAR must add direct dependencies on
what are logically transitive dependencies of `hive-exec`, so #7191 added
several such dependencies in `hive/pom.xml`.
However, by the end of #7191 we abandoned the idea of using an existing
Hive JAR and chose to republish our own JAR, so this promotion of transitive
dependencies is no longer necessary. As a result, this patch significantly
cuts down the size of the `hive` POM by removing those direct dependencies.
Removing them also exposed a problem: the unnecessary dependency promotion had
been pulling one of our Janino JARs down to a lower version.
- **Use Maven's new wildcard exclude support:** Maven now supports wildcard
exclusion patterns, so I used that to simplify the exclusion of `org.ow2.asm`
artifacts.
- **Improve `dev/test-dependencies`:** a bad regex caused the
`org.spark-project:hive` JARs to not be listed under `dev/deps`; this is now
fixed.
- **Remove `joda-time` dependency:** Spark does not directly depend on
`joda-time`, so I don't think that it makes sense to keep it in our POM.
- **Remove explicit `io.netty` dependency**: we don't seem to need this
anymore. I did leave the `dependencyManagement` entry for now in order to
prevent an unexpected downgrade.
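
To make the guidelines above more concrete, here is a minimal, illustrative POM fragment. It is a sketch and is not taken from Spark's actual build: the `com.example` coordinates, the version numbers, and the execution id are hypothetical placeholders. It shows a version pinned through `dependencyManagement`, a wildcard exclusion of `org.ow2.asm` artifacts (wildcards in exclusions should be available in Maven 3.2.1 and later), and a Maven Enforcer `bannedDependencies` rule for a dependency that is not wanted at all.

```xml
<!-- Illustrative fragment only; required elements such as modelVersion are omitted. -->
<project>
  <dependencyManagement>
    <dependencies>
      <!-- Pinning a version here takes precedence over dependency mediation,
           so transitive requests for library-c resolve to this version and
           no per-dependency exclude is needed just to control the version. -->
      <dependency>
        <groupId>com.example</groupId> <!-- hypothetical coordinates -->
        <artifactId>library-c</artifactId>
        <version>1.2.3</version>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <dependency>
      <groupId>com.example</groupId> <!-- hypothetical coordinates -->
      <artifactId>library-b</artifactId>
      <version>4.5.6</version>
      <exclusions>
        <!-- Wildcard exclusion: drop every org.ow2.asm artifact that
             library-b would otherwise pull in transitively. -->
        <exclusion>
          <groupId>org.ow2.asm</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Plugin version omitted for brevity; a real build should pin it. -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-enforcer-plugin</artifactId>
        <executions>
          <execution>
            <id>ban-unwanted-dependencies</id> <!-- hypothetical id -->
            <goals>
              <goal>enforce</goal>
            </goals>
            <configuration>
              <rules>
                <!-- Fail the build if a banned artifact reappears. -->
                <bannedDependencies>
                  <excludes>
                    <exclude>org.hsqldb:hsqldb</exclude>
                  </excludes>
                </bannedDependencies>
              </rules>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

Pairing an exclusion with an enforcer rule documents the intent of the exclusion and turns an accidental reintroduction into a build failure instead of a silent classpath change.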
/cc @srowen @steveloughran @pwendell @rxin for review.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark build-cleanup
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12171.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12171
----
commit ca2aef540954d2a2b29c00fd792e1f73c8600e6d
Author: Josh Rosen <[email protected]>
Date: 2016-04-04T23:42:25Z
Update test-dependencies script to not accidentally filter out Spark hive
deps.
commit 87dd382710af1b2d9e89025590f66d9192154a1e
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T00:21:34Z
Remove explicit jodd dependency
commit ed40fa3d11a896ef08dcfb484488b1ad3628117a
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T00:25:22Z
Remove explicit Datanucleus dependency.
commit 38ed734c787692c59ff67e363e0d57e9b7348972
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T00:46:29Z
Add Maven Enforcer rule to document org.hsqldb:hsqldb exclusion
commit aec2f209abc6aa358564bf6669b6c86a2c16b3dc
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:07:02Z
Remove seemingly-unnecessary Calcite exclude.
commit 02dec1be34e7bfbe7079d6bb35ea55ea6d9a5467
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:18:13Z
Remove explicit Calcite dependency.
It seems that the explicit dep. was pulling our version of Janino's
commons-compiler a bit lower than it should have been.
commit 1cf73d4894be4ab6c7f75e417761307b1693c46c
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:25:27Z
Remove unused Joda dependency.
commit e8409d02b67523806b81b5298cb74f53f982c4a1
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:34:50Z
Remove unnecessary Jackson excludes:
See
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management
> "dependency management takes precedence over dependency mediation"
commit bcd486792e33c6a2703de5d94d52db94f0484c4d
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:40:47Z
Remove unnecessary Jackson excludes.
commit 44ae3e07420a5164676d893ec96f083e01b9ae7a
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:42:55Z
Remove unnecessary jackson-mapper-asl direct dependency in Hive
commit e5c221e6e1626c15ef53b9a3918569796adac465
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T01:49:20Z
Remove unnecessary httpclient exclusions.
commit 3deb2c2af9cc6530d1c970480e725d710c02f782
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:15:06Z
Remove unnecessary jsr305 excludes.
commit 7a24ae86d08f420b5a17a3ab52e2a2e6ae37008f
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:19:08Z
Clean up commons-codec excludes.
commit 55e7cc8e89b1ce3a8faa3e41f4b0affae5e8c8af
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:20:00Z
Remove commented-out portions of the Hive build.
commit 482840a31d1675bd9caf9a0c7aa14404db91fdbc
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:23:39Z
Remove unnecessary Janino exclude.
commit 9eb49f2afbb5460b2ddc0d71a23a425f1aa9f03c
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:28:18Z
Remove unnecessary libfb303 direct dependency.
commit 35d539d6c58292739c256fba8aa089a1097a7b2e
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:35:00Z
Remove a bunch of unnecessary logging-related excludes.
commit 805f55be217f78df4288f4c91568487e4758c516
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T02:35:37Z
Remove a stray httpclient exclude.
commit 277e56a5e7a43dd73e49df7eb2e3efbd291d8257
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T03:06:57Z
Document org.ow2.asm exclusions.
commit 4f7c0c55e074d51822690bfdeafef1a9a37c8282
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T03:21:07Z
Clarify comments surrounding Hive's Avro dep. classifier.
commit 1abd1f0b6e118eeabcc6e20ef0d423fe06d4beda
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T05:01:31Z
Remove unnecessary io.netty exclusions; add clarifying comment.
commit 8eb56cd322b929c6cde271b6516879b7096bafc0
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T05:06:44Z
Add enforcer rule for excluding javax.servlet:servlet-api
commit 6b64539c164e3e3f9cf03e8ac05c205e7bd44ecd
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T05:09:42Z
Add enforcer rule for mockito-all exclusion.
commit b1809b7e17e8d02a42be4866b9351ad3ac1a22b8
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T06:11:39Z
Remove another stray slf4j exclude.
commit e5ca86e93508187b3c93d681ba907e43703535d1
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T06:21:55Z
More unnecessary exclude removal; reduce scope of Kryo exclude.
commit 393696cf4c9580782cb9753cebf8b74faeaabf30
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T06:30:04Z
Remove unnecessary Zookeeper excludes
commit 17afbe23b92520228e5d004648bd2d9f0a56e12c
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T06:36:28Z
Remove unnecessary curator excludes
commit d736e7e26863af4cc382fe644f9e60c6193f7687
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T06:39:31Z
Remove inherited exclusion.
commit 51853c7d18c6a85a669a2d62639dfba30ed29dcd
Author: Josh Rosen <[email protected]>
Date: 2016-04-05T06:53:27Z
Fix typo.
----