Hello Cos, I have a few questions inline!
On Tue, Jul 16, 2013 at 6:53 AM, Konstantin Boudnik <[email protected]> wrote:

> Hi Matei.
>
> The reason I am using Maven for Bigtop packaging and not SBT is because the
> former's dependency management is clean and lets me build a proper assembly
> with only relevant dependencies: e.g. no Hadoop if I don't need it, etc.
>

Isn't this achievable using SBT? I think it should be possible to define
task sets for that. Then we should be able to do with SBT what we do with
Maven (mvn -Pwithout-Hadoop), e.g. something like sbt package-wo-hadoop.
I am not an SBT ninja, but I have seen somewhere that it is possible to
extend tasks. I think it was here:
https://github.com/harrah/xsbt/wiki/Getting-Started-Custom-Settings#extending-but-not-replacing-a-task

> I don't hold onto the packaging the way it is done in the current maven
> build, because of the use of the Shader plugin: I believe flattening project
> dependencies is a suboptimal way to go.
>
> I am glad that you're calling to cease the use of classifiers. Big +1 on
> that! Using alternative names or versions to reflect dependency differences
> is certainly a great idea!
>
> I, perhaps, don't know much about SBT, but I think it is trying to solve
> Maven rigidity the way Gradle did. However, the latter introduces a
> well-defined DSL and integrates with Maven/Ant more transparently than SBT
> does.
>
> That said, I would love to stick with the more mature build system, which is
> also more widely accepted in the Java community. But if the people involved
> in the project want to go with SBT as a build platform - that will work from
> the Bigtop standpoint as long as we'd be able to get a sensible set of
> libraries for further packaging (a-la https://github.com/mesos/spark/pull/675).
>
> Hope it helps,
>   Cos
>

Do you have any other concerns apart from dependency management? IMHO,
two build systems are difficult to maintain with sophisticated build
configurations.

> On Mon, Jul 15, 2013 at 05:41PM, Matei Zaharia wrote:
> > Hi all,
> >
> > I wanted to bring up a topic that there isn't a 100% perfect solution for,
> > but that's been bothering the team at Berkeley for a while: consolidating
> > Spark's build system. Right now we have two build systems, Maven and SBT,
> > that need to be maintained together on each change. We added Maven a while
> > back to try it as an alternative to SBT and to get some better publishing
> > options, like Debian packages and classifiers, but we've found that 1) SBT
> > has actually been fairly stable since then (unlike the rapid release cycle
> > before) and 2) classifiers don't actually seem to work for publishing
> > versions of Spark with different dependencies (you need to give them
> > different artifact names). More importantly though, because maintaining two
> > systems is confusing, it would be good to converge to just one soon, or to
> > find a better way of maintaining the builds.
> >
> > In terms of which system to go for, neither is perfect, but I think many of
> > us are leaning toward SBT, because it's noticeably faster and it has less
> > code to maintain. If we do this, however, I'd really like to understand the
> > use cases for Maven, and make sure that either we can support them in SBT or
> > we can do them externally. Can people say a bit about that? The ones I've
> > thought of are the following:
> >
> > - Debian packaging -- this is certainly nice, but there are some plugins for
> >   SBT too, so it may be possible to migrate.
> > - BigTop integration; I'm not sure how much this relies on Maven but Cos has
> >   been using it.
> > - Classifiers for hadoop1 and hadoop2 -- as far as I can tell, these don't
> >   really work if you want to publish to Maven Central; you still need two
> >   artifact names because the artifacts have different dependencies. However,
> >   more importantly, we'd like to make Spark work with all Hadoop versions by
> >   using hadoop-client and a bit of reflection, similar to how projects like
> >   Parquet handle this.
> >
> > Are there other things I'm missing here, or other ways to handle this
> > problem that I'm missing? For example, one possibility would be to keep the
> > Maven build scripts in a separate repo managed by the people who want to use
> > them, or to have some dedicated maintainers for them. But because this is
> > often an issue, I do think it would be simpler for the project to have one
> > build system in the long term. In either case though, we will keep the
> > project structure compatible with Maven, so people who want to use it
> > internally should be fine; I think that we've done this well and, if
> > anything, we've simplified the Maven build process lately by removing Twirl.
> >
> > Anyway, as I said, I don't think any solution is perfect here, but I'm
> > curious to hear your input.
> >
> > Matei
>

--
Prashant
