Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/1813#issuecomment-51423938
hey @vanzin - so yeah, in the past we've done this by creating separate
dependencies and publishing them to Maven Central. This is a bit more work on
our side, but it plays nicer for downstream users because the correct
functioning of Spark does not depend on our distribution artifacts. For
instance, users who write applications that embed Spark don't have to
recreate or mimic our distribution artifacts. For this reason I have a
slight bias for the existing approach, if only because we've used it for a
long time without any issues.
The way we do this type of shading has been ad hoc: we've typically cloned
the relevant projects, rewritten their class names, and then published the
results to Maven Central under the org.spark-project namespace.
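To make that concrete, here is a rough, hypothetical sketch of that flow for
protobuf - the source location, target package, build steps, and repository
URL are all illustrative assumptions, not the exact commands behind any
published org.spark-project artifact:

    # Assumes an unpacked protobuf source tree and GNU sed; a real build
    # also needs protoc and signing/credentials, omitted here.
    SRC=protobuf-2.4.1

    # Rewrite the Java package so the shaded classes cannot collide with a
    # different protobuf version on a user's classpath. Java packages can't
    # contain '-', hence org.spark_project rather than org.spark-project.
    find "$SRC/java/src" -name '*.java' -exec \
      sed -i 's/com\.google\.protobuf/org.spark_project.protobuf/g' {} +

    # Build the rewritten sources, then publish the jar under the
    # org.spark-project groupId with a version that marks it as shaded.
    (cd "$SRC/java" && mvn -DskipTests package)
    mvn deploy:deploy-file \
      -Dfile="$SRC/java/target/protobuf-java-2.4.1.jar" \
      -DgroupId=org.spark-project -DartifactId=protobuf-java \
      -Dversion=2.4.1-shaded -Dpackaging=jar \
      -Durl=https://repository.example.org/releases  # placeholder URL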
On the dev list I recently explained to @avati that it would be nice to
include scripts in the Spark repo that do this. He's actually started working
on it, and you can see an example for protobuf - I think what he's done so
far looks great:
https://github.com/avati/spark-shaded/blob/master/0001-protobuf.sh
I downloaded guava the other day and played around with it; I think it will
be pretty straightforward to do this for guava as well.
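For a binary-only dependency like guava, a bytecode relocation tool such as
jarjar can do the same rename without rebuilding from source. Again, this is
purely illustrative - the rule, version, and coordinates below are
assumptions, not what we actually publish:

    # Relocate guava's com.google.common package inside the published jar,
    # then redeploy it under a hypothetical shaded coordinate.
    echo 'rule com.google.common.** org.spark_project.guava.@1' > guava-rules.txt

    java -jar jarjar-1.4.jar process guava-rules.txt \
      guava-14.0.1.jar guava-14.0.1-shaded.jar

    mvn deploy:deploy-file \
      -Dfile=guava-14.0.1-shaded.jar \
      -DgroupId=org.spark-project -DartifactId=guava \
      -Dversion=14.0.1-shaded -Dpackaging=jar \
      -Durl=https://repository.example.org/releases  # placeholder URL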
I actually know several other projects that have started to use our shaded
artifacts (we tend to shade dependencies that are a common pain point) - so I
think there is some value in continuing to do this.