Hi, I've gotten an application working with sbt-assembly and Spark, so I thought I'd present an option. In my experience, trying to bundle any of the Spark libraries in your uber jar is going to be a major pain. There will be a lot of deduplicate errors to work through, and even once you resolve them all it is easy to have resolved some of them incorrectly. I considered it an intractable problem. The alternative is to not include those jars in your uber jar. For this to work, the same libraries must be on the classpath of your Spark cluster and of your driver program (if you are running the driver as a standalone application rather than through spark-submit).
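A minimal sketch of that setup, assuming an sbt build like the one quoted further down and reusing its Spark 1.2.0 coordinates: the Spark artifacts are marked "provided" so that sbt-assembly leaves them out of the uber jar. Which artifacts your cluster actually supplies at runtime is an assumption you would need to check against its classpath.

```scala
// build.sbt fragment (illustrative sketch, not a complete build)
libraryDependencies ++= Seq(
  // "provided": available at compile time, but left out of the uber jar by
  // sbt-assembly. This assumes the Spark installation supplies them at runtime.
  "org.apache.spark" %% "spark-core"      % "1.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.2.0" % "provided"
)
```

Anything the cluster does not supply still has to reach the runtime classpath some other way, either inside the uber jar or through spark-submit --jars.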
As for your NoClassDefFoundError, you are either missing Joda Time on your runtime classpath or have conflicting versions of it. Judging by the stack trace, the AWS SDK (com.amazonaws.auth.AWS4Signer) wants to use it. Check whether your uber jar includes org/joda/time, and check the classpath of your Spark cluster as well. For example, I use Spark 1.3.0 on Hadoop 1.x, which ships an uber jar, spark-assembly-1.3.0-hadoop1.0.4.jar, in its 'lib' directory. At one point with Spark 1.2 I found a conflict between the httpclient version that my uber jar pulled in for the AWS libraries and the one bundled in the Spark uber jar. I hand-patched the Spark uber jar to remove the offending httpclient bytecode, which resolved the issue. You may be facing a similar situation.

I hope that gives you some ideas for resolving your issue.

Regards,
Rich

On Tue, Apr 14, 2015 at 1:14 PM, Mike Trienis <mike.trie...@orcsol.com> wrote:

> Hi Vadim,
>
> After removing "provided" from "org.apache.spark" %%
> "spark-streaming-kinesis-asl" I ended up with huge number of deduplicate
> errors:
>
> https://gist.github.com/trienism/3d6f8d6b7ff5b7cead6a
>
> It would be nice if you could share some pieces of your mergeStrategy code
> for reference.
>
> Also, after adding "provided" back to "spark-streaming-kinesis-asl" and I
> submit the spark job with the spark-streaming-kinesis-asl jar file
>
> sh /usr/lib/spark/bin/spark-submit --verbose --jars
> lib/spark-streaming-kinesis-asl_2.10-1.2.0.jar --class com.xxx.DataConsumer
> target/scala-2.10/xxx-assembly-0.1-SNAPSHOT.jar
>
> I still end up with the following error...
>
> Exception in thread "main" java.lang.NoClassDefFoundError: org/joda/time/format/DateTimeFormat
>   at com.amazonaws.auth.AWS4Signer.<clinit>(AWS4Signer.java:44)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at java.lang.Class.newInstance(Class.java:379)
>
> Has anyone else run into this issue?
>
> On Mon, Apr 13, 2015 at 6:46 PM, Vadim Bichutskiy <vadim.bichuts...@gmail.com> wrote:
>
>> I don't believe the Kinesis asl should be provided. I used mergeStrategy
>> successfully to produce an "uber jar."
>>
>> Fyi, I've been having trouble consuming data out of Kinesis with Spark
>> with no success :(
>> Would be curious to know if you got it working.
>>
>> Vadim
>>
>> On Apr 13, 2015, at 9:36 PM, Mike Trienis <mike.trie...@orcsol.com> wrote:
>>
>> Hi All,
>>
>> I have having trouble building a fat jar file through sbt-assembly.
>>
>> [warn] Merging 'META-INF/NOTICE.txt' with strategy 'rename'
>> [warn] Merging 'META-INF/NOTICE' with strategy 'rename'
>> [warn] Merging 'META-INF/LICENSE.txt' with strategy 'rename'
>> [warn] Merging 'META-INF/LICENSE' with strategy 'rename'
>> [warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/com.thoughtworks.paranamer/paranamer/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/com.thoughtworks.paranamer/paranamer/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-dbcp/commons-dbcp/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-dbcp/commons-dbcp/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-pool/commons-pool/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-pool/commons-pool/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/joda-time/joda-time/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/joda-time/joda-time/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/log4j/log4j/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/log4j/log4j/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.joda/joda-convert/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.joda/joda-convert/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-api/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-api/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-log4j12/pom.properties' with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-log4j12/pom.xml' with strategy 'discard'
>> [warn] Merging 'META-INF/services/java.sql.Driver' with strategy 'filterDistinctLines'
>> [warn] Merging 'rootdoc.txt' with strategy 'concat'
>> [warn] Strategy 'concat' was applied to a file
>> [warn] Strategy 'discard' was applied to 17 files
>> [warn] Strategy 'filterDistinctLines' was applied to a file
>> [warn] Strategy 'rename' was applied to 4 files
>>
>> When submitting the spark application through the command
>>
>> sh /usr/lib/spark/bin/spark-submit -class com.xxx.ExampleClassName
>> target/scala-2.10/xxxx-snapshot.jar
>>
>> I end up the the following error,
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/joda/time/format/DateTimeFormat
>>   at com.amazonaws.auth.AWS4Signer.<clinit>(AWS4Signer.java:44)
>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>   at java.lang.Class.newInstance(Class.java:379)
>>   at com.amazonaws.auth.SignerFactory.createSigner(SignerFactory.java:119)
>>   at com.amazonaws.auth.SignerFactory.lookupAndCreateSigner(SignerFactory.java:105)
>>   at com.amazonaws.auth.SignerFactory.getSigner(SignerFactory.java:78)
>>   at com.amazonaws.AmazonWebServiceClient.computeSignerByServiceRegion(AmazonWebServiceClient.java:307)
>>   at com.amazonaws.AmazonWebServiceClient.computeSignerByURI(AmazonWebServiceClient.java:280)
>>   at com.amazonaws.AmazonWebServiceClient.setEndpoint(AmazonWebServiceClient.java:160)
>>   at com.amazonaws.services.kinesis.AmazonKinesisClient.setEndpoint(AmazonKinesisClient.java:2102)
>>   at com.amazonaws.services.kinesis.AmazonKinesisClient.init(AmazonKinesisClient.java:216)
>>   at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:202)
>>   at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:175)
>>   at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:155)
>>   at com.quickstatsengine.aws.AwsProvider$.<init>(AwsProvider.scala:20)
>>   at com.quickstatsengine.aws.AwsProvider$.<clinit>(AwsProvider.scala)
>>
>> The snippet from my build.sbt file is:
>>
>>   "org.apache.spark" %% "spark-core" % "1.2.0" % "provided",
>>   "org.apache.spark" %% "spark-streaming" % "1.2.0" % "provided",
>>   "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.0-alpha1" % "provided",
>>   "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.2.0" % "provided",
>>
>> And the error is originating from:
>>
>>   val kinesisClient = new AmazonKinesisClient(new DefaultAWSCredentialsProviderChain())
>>
>> Am I correct to set spark-streaming-kinesis-asl as a *provided* dependency?
>> Also, is there a merge strategy I need apply?
>>
>> Any help would be appreciated, Mike.