Hello Cos,

I have a few questions inline!


On Tue, Jul 16, 2013 at 6:53 AM, Konstantin Boudnik <[email protected]> wrote:

> Hi Matei.
>
> The reason I am using Maven for Bigtop packaging and not SBT is that the
> former's dependency management is clean and lets me build a proper assembly
> with only the relevant dependencies, e.g. no Hadoop if I don't need it.
>

Isn't this achievable with SBT? I think it should be possible to define
custom tasks for that. Then we could do in SBT what we do with Maven
(mvn -Pwithout-Hadoop), e.g. sbt package-wo-hadoop. I am not an SBT ninja,
but I have seen somewhere that it is possible to extend tasks. I think it
was here:
https://github.com/harrah/xsbt/wiki/Getting-Started-Custom-Settings#extending-but-not-replacing-a-task
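
For what it's worth, here is a minimal sketch of how a without-Hadoop build
might look in SBT. It assumes a recent SBT with build.sbt-style settings; the
withHadoop key, the with.hadoop system property, and the hadoop-client
version are all made up for illustration and are not part of Spark's build.
The idea is to mark Hadoop as a "provided" dependency when it should stay out
of the packaged result:

```scala
// build.sbt -- illustrative sketch only, not Spark's actual build definition.

// Hypothetical setting: whether Hadoop should be bundled. Defaults to true
// and can be switched off with: sbt -Dwith.hadoop=false package
val withHadoop = settingKey[Boolean]("Bundle Hadoop with the build (hypothetical key)")

withHadoop := sys.props.get("with.hadoop").forall(_.toBoolean)

// The coordinates below are an assumption; substitute whatever the build uses.
libraryDependencies += {
  val hadoop = "org.apache.hadoop" % "hadoop-client" % "1.0.4"
  // "provided" keeps Hadoop on the compile classpath but marks it as
  // supplied by the runtime environment rather than by this build.
  if (withHadoop.value) hadoop else hadoop % "provided"
}
```

Whether provided dependencies are actually left out of the final assembly
depends on the packaging plugin in use, so that part would need checking.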


> I am not attached to the packaging as it is done in the current Maven
> build, because of its use of the Shade plugin: I believe flattening project
> dependencies is a suboptimal way to go.
>
> I am glad that you're calling to cease the use of classifiers. Big +1 on
> that! Using alternative names or versions to reflect dependency differences
> is certainly a great idea!
>
> I perhaps don't know much about SBT, but I think it is trying to solve
> Maven's rigidity the way Gradle did. However, the latter introduces a
> well-defined DSL and integrates with Maven/Ant more transparently than SBT
> does.
>
> That said, I would love to stick with a more mature build system that is
> also more widely accepted in the Java community. But if the people involved
> in the project want to go with SBT as the build platform, that will work
> from the Bigtop point of view as long as we are able to get a sensible set
> of libraries for further packaging
> (a la https://github.com/mesos/spark/pull/675).
>
> Hope it helps,
>   Cos
>
Do you have any other concerns apart from dependency management? IMHO two
build systems are difficult to maintain with sophisticated build
configurations.



> On Mon, Jul 15, 2013 at 05:41PM, Matei Zaharia wrote:
> > Hi all,
> >
> > I wanted to bring up a topic that there isn't a 100% perfect solution
> > for, but that's been bothering the team at Berkeley for a while:
> > consolidating Spark's build system. Right now we have two build systems,
> > Maven and SBT, that need to be maintained together on each change. We
> > added Maven a while back to try it as an alternative to SBT and to get
> > some better publishing options, like Debian packages and classifiers, but
> > we've found that 1) SBT has actually been fairly stable since then (unlike
> > the rapid release cycle before) and 2) classifiers don't actually seem to
> > work for publishing versions of Spark with different dependencies (you
> > need to give them different artifact names). More importantly though,
> > because maintaining two systems is confusing, it would be good to converge
> > to just one soon, or to find a better way of maintaining the builds.
> >
> > In terms of which system to go for, neither is perfect, but I think many
> > of us are leaning toward SBT, because it's noticeably faster and it has
> > less code to maintain. If we do this, however, I'd really like to
> > understand the use cases for Maven, and make sure that either we can
> > support them in SBT or we can do them externally. Can people say a bit
> > about that? The ones I've thought of are the following:
> >
> > - Debian packaging -- this is certainly nice, but there are some plugins
> >   for SBT too, so it may be possible to migrate.
> > - BigTop integration; I'm not sure how much this relies on Maven, but Cos
> >   has been using it.
> > - Classifiers for hadoop1 and hadoop2 -- as far as I can tell, these don't
> >   really work if you want to publish to Maven Central; you still need two
> >   artifact names because the artifacts have different dependencies.
> >   However, more importantly, we'd like to make Spark work with all Hadoop
> >   versions by using hadoop-client and a bit of reflection, similar to how
> >   projects like Parquet handle this.
> >
> > Are there other things I'm missing here, or other ways to handle this
> > problem that I'm missing? For example, one possibility would be to keep
> > the Maven build scripts in a separate repo managed by the people who want
> > to use them, or to have some dedicated maintainers for them. But because
> > this is often an issue, I do think it would be simpler for the project to
> > have one build system in the long term. In either case though, we will
> > keep the project structure compatible with Maven, so people who want to
> > use it internally should be fine; I think that we've done this well and,
> > if anything, we've simplified the Maven build process lately by removing
> > Twirl.
> >
> > Anyway, as I said, I don't think any solution is perfect here, but I'm
> > curious to hear your input.
> >
> > Matei
>



-- 
Prashant
