> The console mode of sbt (just run sbt/sbt and then a long-running console
> session is started that will accept further commands) is great for building
> individual subprojects or running single test suites. In addition to being
> faster since it's a long-running JVM, it has a lot of nice features like
> tab-completion for test case names.
We include the scala-maven-plugin in spark/pom.xml, so equivalent functionality
is available using Maven. You can start a console session with `mvn scala:console`.

On Sun, Nov 16, 2014 at 1:23 PM, Michael Armbrust <mich...@databricks.com> wrote:

> I'm going to have to disagree here. If you are building a release
> distribution or integrating with legacy systems, then Maven is probably the
> correct choice. However, most of the core developers that I know use sbt,
> and I think it's a better choice for exploration and development overall.
> That said, this probably falls into the category of a religious argument, so
> you might want to look at both options and decide for yourself.
>
> In my experience the sbt build is significantly faster with less effort
> (and I think sbt is still faster even if you go through the extra effort of
> installing zinc) and easier to read. The console mode of sbt (just run
> sbt/sbt and then a long-running console session is started that will accept
> further commands) is great for building individual subprojects or running
> single test suites. In addition to being faster since it's a long-running
> JVM, it has a lot of nice features like tab-completion for test case
> names.
>
> For example, if I wanted to see what test cases are available in the SQL
> subproject, you can do the following:
>
> [marmbrus@michaels-mbp spark (tpcds)]$ sbt/sbt
> [info] Loading project definition from
> /Users/marmbrus/workspace/spark/project/project
> [info] Loading project definition from
> /Users/marmbrus/.sbt/0.13/staging/ad8e8574a5bcb2d22d23/sbt-pom-reader/project
> [info] Set current project to spark-parent (in build
> file:/Users/marmbrus/workspace/spark/)
>
> sql/test-only *<tab>*
> --
> org.apache.spark.sql.CachedTableSuite
> org.apache.spark.sql.DataTypeSuite
> org.apache.spark.sql.DslQuerySuite
> org.apache.spark.sql.InsertIntoSuite
> ...
> Another very useful feature is the development console, which starts an
> interactive REPL including the most recent version of the code and a lot of
> useful imports for some subprojects. For example, in the hive subproject it
> automatically sets up a temporary database with a bunch of test data
> pre-loaded:
>
> $ sbt/sbt hive/console
> ...
> import org.apache.spark.sql.hive._
> import org.apache.spark.sql.hive.test.TestHive._
> import org.apache.spark.sql.parquet.ParquetTestData
> Welcome to Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45).
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala> sql("SELECT * FROM src").take(2)
> res0: Array[org.apache.spark.sql.Row] = Array([238,val_238], [86,val_86])
>
> Michael
>
> On Sun, Nov 16, 2014 at 3:27 AM, Dinesh J. Weerakkody
> <dineshjweerakk...@gmail.com> wrote:
>
> > Hi Stephen and Sean,
> >
> > Thanks for the correction.
> >
> > On Sun, Nov 16, 2014 at 12:28 PM, Sean Owen <so...@cloudera.com> wrote:
> >
> > > No, the Maven build is the main one. I would use it unless you have a
> > > need to use the SBT build in particular.
> > > On Nov 16, 2014 2:58 AM, "Dinesh J. Weerakkody"
> > > <dineshjweerakk...@gmail.com> wrote:
> > >
> > >> Hi Yiming,
> > >>
> > >> I believe that both SBT and MVN are supported in Spark, but SBT is
> > >> preferred (I'm not 100% sure about this :) ). When I used MVN I got
> > >> some build failures. After that I used SBT and it works fine.
> > >>
> > >> You can go through these discussions regarding SBT vs. MVN and learn
> > >> the pros and cons of both [1] [2].
> > >>
> > >> [1]
> > >> http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Necessity-of-Maven-and-SBT-Build-in-Spark-td2315.html
> > >>
> > >> [2]
> > >> https://groups.google.com/forum/#!msg/spark-developers/OxL268v0-Qs/fBeBY8zmh3oJ
> > >>
> > >> Thanks,
> > >>
> > >> On Sun, Nov 16, 2014 at 7:11 AM, Yiming (John) Zhang <sdi...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I am new to developing Spark and my current focus is co-scheduling of
> > >> > Spark tasks. However, I am confused by the build tools: sometimes the
> > >> > documentation uses mvn but sometimes uses sbt.
> > >> >
> > >> > So, my question is: which one is the preferred tool of the Spark
> > >> > community? And what's the technical difference between them? Thank you!
> > >> >
> > >> > Cheers,
> > >> >
> > >> > Yiming
> > >>
> > >> --
> > >> Thanks & Best Regards,
> > >>
> > >> *Dinesh J. Weerakkody*
> >
> > --
> > Thanks & Best Regards,
> >
> > *Dinesh J. Weerakkody*
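For readers following the thread: a rough sketch of the Maven-side equivalents of the sbt workflow discussed above. The `scala:console` goal comes from the scala-maven-plugin and `-DwildcardSuites` from the scalatest-maven-plugin; the `-pl` module path shown here is illustrative, so check your checkout's module layout before relying on it.

```shell
# Start an interactive Scala console through the scala-maven-plugin
# (run from the Spark source root; assumes the plugin is configured in pom.xml)
mvn scala:console

# Run a single ScalaTest suite in one module via the scalatest-maven-plugin.
# -DwildcardSuites selects suites by (wildcard) class name; the suite name
# below is taken from the tab-completion listing earlier in the thread.
mvn test -pl sql/core -DwildcardSuites=org.apache.spark.sql.CachedTableSuite
```

Unlike the sbt console, each `mvn` invocation starts a fresh JVM, which is part of the speed difference Michael describes; zinc narrows that gap for compilation but not for test selection.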