Hey Ram,

One other important thing. I'm not sure exactly what your use case is, but if you are just planning to make a job that you submit to a Spark cluster, you can avoid bundling Spark in your assembly jar in the first place.
You should list the dependency as "provided". See here:

http://spark.incubator.apache.org/docs/latest/quick-start.html#including-your-dependencies

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating" % "provided"

The caveat here is that you'll need to launch your job itself with "sbt/sbt run"... or otherwise include Spark in the classpath when launching it.

- Patrick
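Spelled out, a minimal build.sbt for such a job might look like the sketch below. The name, organization, and version are placeholders, and Scala 2.9.3 is assumed since that is the version the spark-core 0.8.0-incubating artifact was published for:

  // build.sbt: minimal sketch of a job that compiles against Spark without
  // bundling it. Marking spark-core "provided" keeps it on the compile
  // classpath but out of the assembly jar; the cluster supplies it at runtime.

  name := "my-spark-job"        // placeholder

  organization := "com.example" // placeholder

  version := "0.1.0"            // placeholder

  scalaVersion := "2.9.3"       // matches spark-core_2.9.3 0.8.0-incubating

  libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating" % "provided"

Since the jar then no longer contains Spark, running it outside the cluster needs Spark on the classpath some other way, which is the "sbt/sbt run" caveat above.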
On Sun, Oct 27, 2013 at 1:32 PM, Patrick Wendell <[email protected]> wrote:
> Forgot to include user list...
>
> On Sun, Oct 27, 2013 at 1:16 PM, Patrick Wendell <[email protected]> wrote:
>> When you are creating an assembly jar you need to deal with all merge
>> conflicts, including those that arise as a result of transitive
>> dependencies. Unfortunately there is no way for us to "publish" our
>> merge strategy. Though this might be something we should mention in
>> the docs... when people are making an assembly jar including Spark,
>> they need to merge things correctly themselves.
>>
>> - Patrick
>>
>> On Sun, Oct 27, 2013 at 10:04 AM, Ramkumar Chokkalingam
>> <[email protected]> wrote:
>>> Hey Patrick,
>>>
>>> Thanks for the mail. So, I did solve the issue by using the
>>> MergeStrategy. I was more concerned because Spark was the only major
>>> dependency - since it was a prebuilt version and failed, I wanted to
>>> check with you.
>>>
>>> Regards,
>>> Ramkumar Chokkalingam,
>>> University of Washington.
>>> LinkedIn
>>>
>>> On Sun, Oct 27, 2013 at 9:33 AM, Patrick Wendell <[email protected]> wrote:
>>>>
>>>> Hey Ram,
>>>>
>>>> When you create the assembly jar for your own project, you'll need to
>>>> deal with all possible conflicts. And this includes various conflicts
>>>> inside of Spark's dependencies.
>>>>
>>>> I've noticed the Apache commons libraries often have conflicts. One
>>>> thing you could do is set MergeStrategy.first for all class files with
>>>> the path org/apache/commons/*.
>>>>
>>>> You can also add a catch-all default strategy, like the one in the
>>>> Spark build. In fact, it might make sense to just copy over the policy
>>>> from the Spark build as a starting point:
>>>>
>>>> https://github.com/apache/incubator-spark/blob/master/project/SparkBuild.scala#L332
>>>>
>>>> - Patrick
>>>>
>>>> On Sun, Oct 27, 2013 at 1:04 AM, Ramkumar Chokkalingam
>>>> <[email protected]> wrote:
>>>> >
>>>> > Hello Spark Community,
>>>> >
>>>> > I'm trying to convert my project into a single JAR, using the sbt
>>>> > assembly utility to do so.
>>>> >
>>>> > This is the error I got:
>>>> >
>>>> > [error] (*:assembly) deduplicate: different file contents found in
>>>> > the following:
>>>> > [error]
>>>> > /usr/local/spark/ram_examples/rm/lib_managed/jars/org.mortbay.jetty/servlet-api/servlet-api-2.5-20081211.jar:javax/servlet/SingleThreadModel.class
>>>> > [error]
>>>> > /usr/local/spark/ram_examples/rm/lib_managed/orbits/org.eclipse.jetty.orbit/javax.servlet/javax.servlet-2.5.0.v201103041518.jar:javax/servlet/SingleThreadModel.class
>>>> >
>>>> > So when I fix that with
>>>> > https://github.com/sbt/sbt-assembly#merge-strategy, I face another
>>>> > conflict:
>>>> >
>>>> > java.lang.RuntimeException: deduplicate: different file contents
>>>> > found in the following:
>>>> > /usr/local/spark/ram_examples/rm/lib_managed/jars/commons-beanutils/commons-beanutils-core/commons-beanutils-core-1.8.0.jar:org/apache/commons/beanutils/converters/FloatArrayConverter.class
>>>> > /usr/local/spark/ram_examples/rm/lib_managed/jars/commons-beanutils/commons-beanutils/commons-beanutils-1.7.0.jar:org/apache/commons/beanutils/converters/FloatArrayConverter.class
>>>> >
>>>> > This is my build file: http://pastebin.com/5W9f1g1e
>>>> >
>>>> > While this seems like an error that has already been discussed and
>>>> > solved here, I get it with the latest Spark [version 0.8.0]. I'm
>>>> > just curious because this is a clean build of Spark that I'm using,
>>>> > and it seems to work fine (with sbt run / sbt package), but when I
>>>> > use sbt assembly I get this error. Am I missing something while
>>>> > creating the JAR? Any help would be appreciated. Thanks!
>>>> >
>>>> > Regards,
>>>> > Ram
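Concretely, the approach Patrick describes (MergeStrategy.first for the conflicting Apache commons classes, plus a catch-all default) translates to roughly the following in build.sbt, using the sbt-assembly syntax of that era. This is a sketch rather than a tested build, and newer plugin versions spell the key assembly / assemblyMergeStrategy instead:

  import AssemblyKeys._  // sbt-assembly 0.9.x-era API

  assemblySettings

  mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
    {
      // Duplicate commons-beanutils classes: the copies are effectively
      // interchangeable, so taking the first one found is safe
      case PathList("org", "apache", "commons", xs @ _*) => MergeStrategy.first
      // Same treatment for the javax.servlet / SingleThreadModel clash
      case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
      // Everything else falls through to the plugin's default strategy
      case x => old(x)
    }
  }

MergeStrategy.first only fits files where any copy will do; files that genuinely need combining (Akka's reference.conf is the usual example) want MergeStrategy.concat, which is among the cases the Spark build linked above handles.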
