I'd like to chime in as one of the downstream integrators (Bigtop, etc.). As
the original author of the Maven packaging assemblies, I might be able to shed
some light on the history behind them:

  - in order to integrate Spark well into an existing Hadoop stack, it was
    necessary to have a way to avoid duplicated transitive dependencies and
    possible conflicts.

    E.g. the Maven assembly allows us to avoid bundling _all_ Hadoop libs and
    instead merely declare a Spark package dependency on the standard Bigtop
    Hadoop packages. And yes - Bigtop packaging means the naming and layout
    are standard across all commercial Hadoop distributions that are worth
    mentioning: ASF Bigtop convenience binary packages, and Cloudera or
    Hortonworks packages. Hence, the downstream user doesn't need to spend any
    effort to make sure that Spark "clicks in" properly.

  - Maven provides a relatively easy way to deal with the jar-hell problem,
    although the original Maven build was just shading everything into a
    huge lump of class files, oftentimes ending up with classes from
    different transitive dependencies clobbering each other.
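
To make the clobbering problem concrete, here is a minimal Python sketch (not
part of the Spark or Bigtop builds; the function names are made up for
illustration). It assumes jars are plain zip archives and simply reports which
.class entries appear in more than one of them, i.e. the entries that would
overwrite each other if everything were merged into a single jar:

```python
import io
import zipfile
from collections import defaultdict


def find_class_collisions(jars):
    """Return {class entry: [jar names]} for entries found in 2+ jars.

    `jars` maps a jar name to a path or file-like object (hypothetical API).
    """
    owners = defaultdict(list)
    for name, source in jars.items():
        with zipfile.ZipFile(source) as jar:
            # set() guards against the same entry listed twice in one jar
            for entry in sorted(set(jar.namelist())):
                if entry.endswith(".class"):
                    owners[entry].append(name)
    return {entry: names for entry, names in owners.items() if len(names) > 1}


def make_jar(entries):
    """Build an in-memory jar with empty entries (illustration helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for entry in entries:
            jar.writestr(entry, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    jars = {
        "a.jar": make_jar(["org/x/A.class", "org/x/B.class"]),
        "b.jar": make_jar(["org/x/A.class", "META-INF/MANIFEST.MF"]),
    }
    for entry, names in find_class_collisions(jars).items():
        print(entry, "is provided by", ", ".join(names))
```

Running the same check over the real dependency jars of a build would show
which classes end up being decided by merge order in a single lumped jar.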

Artifact publishing isn't a deciding concern when it comes to sbt vs Maven: it
seems to be a no-brainer in both cases. I don't know sbt well enough to say
that its assemblies do not or cannot provide the same level of segregation as
Maven's, but it seems that way. And that alone is a huge blocker to dropping
support for the Maven build.

Now, what are the great benefits that sbt brings in return?

Regards,
    Cos

On Thu, Feb 20, 2014 at 08:03PM, Patrick Wendell wrote:
> Hey All,
> 
> It's very high overhead having two build systems in Spark. Before
> getting into a long discussion about the merits of sbt vs maven, I
> wanted to pose a simple question to the dev list:
> 
> Is there anyone who feels that dropping either sbt or maven would have
> a major consequence for them?
> 
> And I say "major consequence" meaning something becomes completely
> impossible now and can't be worked around. This is different from an
> "inconvenience", i.e., something which can be worked around but will
> require some investment.
> 
> I'm posing the question in this way because, if there are features in
> either build system that are absolutely-un-available in the other,
> then we'll have to maintain both for the time being. I'm merely trying
> to see whether this is the case...
> 
> - Patrick
