Kos - thanks for chiming in. Could you be more specific about what is
available in Maven but not in sbt for these issues? I took a look at
the Bigtop code relating to Spark. As far as I could tell, [1] is the
main point of integration with the build system - are there other
integration points I'm missing?

>   - in order to integrate Spark well into the existing Hadoop stack it was
>     necessary to have a way to avoid duplicated transitive dependencies and
>     possible conflicts.
>
>     E.g. Maven assembly allows us to avoid adding _all_ Hadoop libs and later
>     merely declare the Spark package's dependency on standard Bigtop Hadoop
>     packages. And yes - Bigtop packaging means the naming and layout would be
>     standard across all commercial Hadoop distributions that are worth
>     mentioning: ASF Bigtop convenience binary packages, and Cloudera or
>     Hortonworks packages. Hence, the downstream user doesn't need to spend any
>     effort to make sure that Spark "clicks-in" properly.

The sbt build also lets you plug in a Hadoop version, similarly to
the Maven build.
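
For instance, the sbt build can pick the Hadoop version up from an
environment variable, much as -Dhadoop.version does on the Maven side.
A minimal sketch (illustrative only - the env-var name, default
version, and project layout below are simplified, not the actual
SparkBuild.scala):

    // project/SparkBuild.scala (sketch - names and versions here are
    // illustrative, not the real Spark build definition)
    import sbt._
    import Keys._

    object SparkBuild extends Build {
      val hadoopVersion =
        sys.env.getOrElse("SPARK_HADOOP_VERSION", "1.0.4")

      lazy val core = Project("core", file("core")).settings(
        libraryDependencies +=
          "org.apache.hadoop" % "hadoop-client" % hadoopVersion
      )
    }

So something like SPARK_HADOOP_VERSION=2.0.5-alpha sbt/sbt assembly
plays the same role as mvn -Dhadoop.version=2.0.5-alpha package.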

>
>   - Maven provides a relatively easy way to deal with the jar-hell problem,
>     although the original Maven build was just Shade-ing everything into a
>     huge lump of class files, oftentimes ending up with classes from
>     different transitive dependencies slamming on top of each other.

AFAIK we are only using the Shade plug-in for conflict resolution in
the assembly jar. Those conflicts are handled in sbt via the
sbt-assembly plug-in in an identical way. Is there a difference?
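
To make the comparison concrete, here is roughly what that conflict
resolution looks like on the sbt side with sbt-assembly (0.x plug-in
API; the specific merge rules below are illustrative, not the exact
list from our build):

    // Sketch of sbt-assembly merge settings (illustrative rules)
    import sbtassembly.Plugin._
    import AssemblyKeys._

    lazy val extraAssemblySettings = Seq(
      mergeStrategy in assembly := {
        // drop duplicate manifests/signatures under META-INF
        case PathList("META-INF", xs @ _*) => MergeStrategy.discard
        // concatenate Typesafe config files instead of overwriting
        case "reference.conf" => MergeStrategy.concat
        // otherwise keep the copy from the first jar on the classpath
        case _ => MergeStrategy.first
      }
    )

This is the same job the Shade plug-in's filters and resource
transformers do in the Maven build.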

[1] 
https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
