If you are creating an assembly, make sure spark-streaming is marked as
provided. spark-streaming is already part of the Spark installation, so it
will be present at run time. That might solve some of these conflicts!
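A minimal build.sbt sketch of the idea, assuming sbt-assembly 0.12+ (which
defines assemblyMergeStrategy); the UnusedStubClass rule is just an
illustration for one of the duplicates reported below, not a complete fix:

```scala
// Spark itself ships on the cluster, so mark it "provided" to keep it
// out of the assembly; spark-streaming-kinesis-asl is NOT part of the
// Spark distribution, so it (and its transitive deps) must stay in.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % "1.3.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.3.0"
)

// Any remaining duplicate files can be handled with a merge strategy,
// e.g. picking the first copy of the UnusedStubClass stub:
assemblyMergeStrategy in assembly := {
  case PathList("org", "apache", "spark", "unused", "UnusedStubClass.class") =>
    MergeStrategy.first
  case x =>
    // fall back to the default strategy for everything else
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```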

TD

On Mon, Mar 16, 2015 at 11:30 AM, Kelly, Jonathan <jonat...@amazon.com>
wrote:

>  I'm attempting to use the Spark Kinesis Connector, so I've added the
> following dependency in my build.sbt:
>
>  libraryDependencies += "org.apache.spark" %%
> "spark-streaming-kinesis-asl" % "1.3.0"
>
>  My app works fine with "sbt run", but I can't seem to get "sbt assembly"
> to work without failing with "different file contents found" errors due
> to different versions of various packages getting pulled in to the
> assembly.  This only occurs when I've added spark-streaming-kinesis-asl as
> a dependency. "sbt assembly" works fine otherwise.
>
>  Here are the conflicts that I see:
>
>  com.esotericsoftware.kryo:kryo:2.21
>  com.esotericsoftware.minlog:minlog:1.2
>
>  com.google.guava:guava:15.0
>  org.apache.spark:spark-network-common_2.10:1.3.0
>
>  (Note: The conflict is with javac.sh; why is this even getting included?)
> org.apache.spark:spark-streaming-kinesis-asl_2.10:1.3.0
> org.apache.spark:spark-streaming_2.10:1.3.0
> org.apache.spark:spark-core_2.10:1.3.0
> org.apache.spark:spark-network-common_2.10:1.3.0
> org.apache.spark:spark-network-shuffle_2.10:1.3.0
>
>  (Note: I'm actually using my own custom-built version of Spark-1.3.0
> where I've upgraded to v1.9.24 of the AWS Java SDK, but that has nothing to
> do with all of these conflicts, as I upgraded the dependency *because* I
> was getting all of these conflicts with the Spark 1.3.0 artifacts from the
> central repo.)
>  com.amazonaws:aws-java-sdk-s3:1.9.24
>  net.java.dev.jets3t:jets3t:0.9.3
>
>  commons-collections:commons-collections:3.2.1
>  commons-beanutils:commons-beanutils:1.7.0
>  commons-beanutils:commons-beanutils-core:1.8.0
>
>  commons-logging:commons-logging:1.1.3
> org.slf4j:jcl-over-slf4j:1.7.10
>
>  (Note: The conflict is with a few package-info.class files, which seems
> really silly.)
> org.apache.hadoop:hadoop-yarn-common:2.4.0
> org.apache.hadoop:hadoop-yarn-api:2.4.0
>
>  (Note: The conflict is with org/apache/spark/unused/UnusedStubClass.class,
> which seems even more silly.)
>  org.apache.spark:spark-streaming-kinesis-asl_2.10:1.3.0
>  org.apache.spark:spark-streaming_2.10:1.3.0
>  org.apache.spark:spark-core_2.10:1.3.0
>  org.apache.spark:spark-network-common_2.10:1.3.0
>  org.spark-project.spark:unused:1.0.0 (?!?!?!)
>  org.apache.spark:spark-network-shuffle_2.10:1.3.0
>
>  I can get rid of some of the conflicts by using excludeAll() to exclude
> artifacts with organization = "org.apache.hadoop" or organization =
> "org.apache.spark" and name = "spark-streaming", and I might be able to
> resolve a few other conflicts this way, but the bottom line is that this is
> way more complicated than it should be, so either something is really
> broken or I'm just doing something wrong.
>
>  Many of these don't even make sense to me.  For example, the very first
> conflict is between classes in com.esotericsoftware.kryo:kryo:2.21 and in
> com.esotericsoftware.minlog:minlog:1.2, but the former *depends* upon the
> latter, so how can they conflict?  It seems wrong to me that one package
> would contain different versions of the same classes that are included in
> one of its dependencies.  I guess it doesn't make too much difference though if I
> could only get my assembly to include/exclude the right packages.  I of
> course don't want any of the spark or hadoop dependencies included (other
> than spark-streaming-kinesis-asl itself), but I want all of
> spark-streaming-kinesis-asl's dependencies included (such as the AWS Java
> SDK and its dependencies).  That doesn't seem to be possible without what I
> imagine will become an unruly and fragile exclusion list though.
>
>
>  Thanks,
>
> Jonathan
>
