What version of sbt are you using? There is a bug in early versions of 0.13 that causes assembly to be extremely slow - make sure you're using the latest release.
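
In case it helps, here is a minimal sketch of pinning the sbt version, assuming a standard project layout where the sbt launcher reads project/build.properties. The 0.13.5 value is only an example of a later 0.13.x release, not a statement about which release contains the fix; substitute the newest one available to you.

```properties
# project/build.properties
# Example value only (an assumption): use the newest 0.13.x release available.
sbt.version=0.13.5
```

Running `sbt sbtVersion` from the project root should then print the version the build actually uses.
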
On Fri, Aug 29, 2014 at 1:30 PM, Aris <arisofala...@gmail.com> wrote:
> Hi folks,
>
> I am trying to use Kafka with Spark Streaming, and it appears I cannot do
> the normal 'sbt package' as I do with other Spark applications, such as
> Spark alone or Spark with MLlib. I learned I have to build with the
> sbt-assembly plugin.
>
> OK, so here is my build.sbt file for my extremely simple test Kafka/Spark
> Streaming project. It takes almost 30 minutes to build! This is a CentOS
> Linux machine on SSDs with 4GB of RAM, and it's never been slow for me. To
> compare, sbt assembly for the entire Spark project itself takes less than
> 10 minutes.
>
> At the bottom of this file I am trying to play with 'cacheOutput' options,
> because I read online that maybe I am calculating SHA-1 for all the *.class
> files in this super JAR.
>
> I also copied the mergeStrategy from Spark contributor TD's Spark Streaming
> tutorial from Spark Summit 2014.
>
> Again, is there some better way to build this JAR file, just using sbt
> package? This process is working, but very slow.
>
> Any help with speeding up this compilation is really appreciated!!
>
> Aris
>
> -----------------------------------------
>
> import AssemblyKeys._ // put this at the top of the file
>
> name := "streamingKafka"
>
> version := "1.0"
>
> scalaVersion := "2.10.4"
>
> libraryDependencies ++= Seq(
>   "org.apache.spark" %% "spark-core" % "1.0.1" % "provided",
>   "org.apache.spark" %% "spark-streaming" % "1.0.1" % "provided",
>   "org.apache.spark" %% "spark-streaming-kafka" % "1.0.1"
> )
>
> assemblySettings
>
> jarName in assembly := "streamingkafka-assembly.jar"
>
> mergeStrategy in assembly := {
>   case m if m.toLowerCase.endsWith("manifest.mf") =>
>     MergeStrategy.discard
>   case m if m.toLowerCase.matches("meta-inf.*\\.sf$") =>
>     MergeStrategy.discard
>   case "log4j.properties" =>
>     MergeStrategy.discard
>   case m if m.toLowerCase.startsWith("meta-inf/services/") =>
>     MergeStrategy.filterDistinctLines
>   case "reference.conf" =>
>     MergeStrategy.concat
>   case _ =>
>     MergeStrategy.first
> }
>
> assemblyOption in assembly ~= { _.copy(cacheOutput = false) }