Hi,

I've gotten an application working with sbt-assembly and Spark, so I thought
I'd present an option. In my experience, trying to bundle any of the Spark
libraries in your uber jar is a major pain. There are a lot of deduplicate
errors to work through, and even if you resolve them all it is easy to
resolve them incorrectly. I considered it an intractable problem. The
alternative is to leave those jars out of your uber jar. For this to work,
the same libraries must be on the classpath of your Spark cluster and of
your driver program (if you are running that as an application and not just
using spark-submit).
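
As a rough sketch of that setup (not a complete build file), the Spark
artifacts can be marked "provided" so that sbt-assembly leaves them out of
the uber jar. The artifact names and the 1.2.0 version are taken from the
build.sbt snippet quoted further down; match them to whatever your cluster
actually runs.

```scala
// build.sbt fragment (sketch). Versions are copied from the quoted thread
// and are an assumption about your cluster; adjust them to match it.
libraryDependencies ++= Seq(
  // "provided": needed to compile, but left out of the sbt-assembly uber jar.
  // These classes must then come from the cluster's own Spark installation.
  "org.apache.spark" %% "spark-core"      % "1.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.2.0" % "provided"
)
```

Anything marked "provided" that the cluster does not already have on its
classpath still has to be supplied at runtime, for example with the --jars
option of spark-submit as in the command quoted below.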

As for your NoClassDefFoundError, you are either missing Joda Time on your
runtime classpath or have conflicting versions of it. It looks like
something related to AWS wants to use it (AWS4Signer in your stack trace).
Check whether your uber jar includes org/joda/time, and check the classpath
of your Spark cluster as well. For example, I use Spark 1.3.0 on Hadoop 1.x,
which has an uber jar, spark-assembly-1.3.0-hadoop1.0.4.jar, in its 'lib'
directory. At one point with Spark 1.2 I found a conflict between the
httpclient version that my uber jar pulled in for the AWS libraries and the
one bundled in the Spark uber jar. I hand-patched the Spark uber jar to
remove the offending httpclient bytecode, which resolved the issue. You may
be facing a similar situation.
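
One way to do the jar check described above is a small Scala program that
lists the entries under a given package path. This is only a sketch: the
object name JarCheck and the helper entriesUnder are made up for
illustration, and it uses nothing beyond java.util.jar from the JDK. Run it
once against your own uber jar and once against the Spark assembly jar on
the cluster, then compare the results.

```scala
import java.util.jar.JarFile

// Hypothetical helper (not part of Spark or sbt-assembly).
object JarCheck {

  /** Names of all entries in the jar at `jarPath` whose path starts with `prefix`. */
  def entriesUnder(jarPath: String, prefix: String): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      val names = List.newBuilder[String]
      val entries = jar.entries()
      while (entries.hasMoreElements) {
        val name = entries.nextElement().getName
        if (name.startsWith(prefix)) names += name
      }
      names.result()
    } finally {
      jar.close()
    }
  }

  def main(args: Array[String]): Unit = {
    args.headOption match {
      case None =>
        println("usage: JarCheck <path-to-jar> [package-prefix]")
      case Some(jarPath) =>
        // Default to the package named in the NoClassDefFoundError.
        val prefix = if (args.length > 1) args(1) else "org/joda/time/"
        val hits = entriesUnder(jarPath, prefix)
        if (hits.isEmpty) println(s"No entries under $prefix in $jarPath")
        else println(s"${hits.size} entries under $prefix in $jarPath")
    }
  }
}
```

Passing "org/apache/http/" as the second argument does the same check for
the httpclient classes mentioned above.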

I hope that gives some ideas for resolving your issue.

Regards,
Rich

On Tue, Apr 14, 2015 at 1:14 PM, Mike Trienis <mike.trie...@orcsol.com>
wrote:

> Hi Vadim,
>
> After removing "provided" from "org.apache.spark" %%
> "spark-streaming-kinesis-asl" I ended up with a huge number of deduplicate
> errors:
>
> https://gist.github.com/trienism/3d6f8d6b7ff5b7cead6a
>
> It would be nice if you could share some pieces of your mergeStrategy code
> for reference.
>
> Also, after adding "provided" back to "spark-streaming-kinesis-asl" and
> submitting the Spark job with the spark-streaming-kinesis-asl jar file
>
> sh /usr/lib/spark/bin/spark-submit --verbose --jars
> lib/spark-streaming-kinesis-asl_2.10-1.2.0.jar --class com.xxx.DataConsumer
> target/scala-2.10/xxx-assembly-0.1-SNAPSHOT.jar
>
> I still end up with the following error...
>
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/joda/time/format/DateTimeFormat
> at com.amazonaws.auth.AWS4Signer.<clinit>(AWS4Signer.java:44)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:379)
>
> Has anyone else run into this issue?
>
>
>
> On Mon, Apr 13, 2015 at 6:46 PM, Vadim Bichutskiy <
> vadim.bichuts...@gmail.com> wrote:
>
>> I don't believe the Kinesis asl should be provided. I used mergeStrategy
>> successfully to produce an "uber jar."
>>
>> Fyi, I've been having trouble consuming data out of Kinesis with Spark
>> with no success :(
>> Would be curious to know if you got it working.
>>
>> Vadim
>>
>> On Apr 13, 2015, at 9:36 PM, Mike Trienis <mike.trie...@orcsol.com>
>> wrote:
>>
>> Hi All,
>>
>> I am having trouble building a fat jar file through sbt-assembly.
>>
>> [warn] Merging 'META-INF/NOTICE.txt' with strategy 'rename'
>> [warn] Merging 'META-INF/NOTICE' with strategy 'rename'
>> [warn] Merging 'META-INF/LICENSE.txt' with strategy 'rename'
>> [warn] Merging 'META-INF/LICENSE' with strategy 'rename'
>> [warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
>> [warn] Merging
>> 'META-INF/maven/com.thoughtworks.paranamer/paranamer/pom.properties' with
>> strategy 'discard'
>> [warn] Merging
>> 'META-INF/maven/com.thoughtworks.paranamer/paranamer/pom.xml' with strategy
>> 'discard'
>> [warn] Merging 'META-INF/maven/commons-dbcp/commons-dbcp/pom.properties'
>> with strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-dbcp/commons-dbcp/pom.xml' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-pool/commons-pool/pom.properties'
>> with strategy 'discard'
>> [warn] Merging 'META-INF/maven/commons-pool/commons-pool/pom.xml' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/maven/joda-time/joda-time/pom.properties' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/maven/joda-time/joda-time/pom.xml' with strategy
>> 'discard'
>> [warn] Merging 'META-INF/maven/log4j/log4j/pom.properties' with strategy
>> 'discard'
>> [warn] Merging 'META-INF/maven/log4j/log4j/pom.xml' with strategy
>> 'discard'
>> [warn] Merging 'META-INF/maven/org.joda/joda-convert/pom.properties' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.joda/joda-convert/pom.xml' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-api/pom.properties' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-api/pom.xml' with strategy
>> 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-log4j12/pom.properties'
>> with strategy 'discard'
>> [warn] Merging 'META-INF/maven/org.slf4j/slf4j-log4j12/pom.xml' with
>> strategy 'discard'
>> [warn] Merging 'META-INF/services/java.sql.Driver' with strategy
>> 'filterDistinctLines'
>> [warn] Merging 'rootdoc.txt' with strategy 'concat'
>> [warn] Strategy 'concat' was applied to a file
>> [warn] Strategy 'discard' was applied to 17 files
>> [warn] Strategy 'filterDistinctLines' was applied to a file
>> [warn] Strategy 'rename' was applied to 4 files
>>
>> When submitting the spark application through the command
>>
>> sh /usr/lib/spark/bin/spark-submit -class com.xxx.ExampleClassName
>> target/scala-2.10/xxxx-snapshot.jar
>>
>> I end up with the following error,
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/joda/time/format/DateTimeFormat
>> at com.amazonaws.auth.AWS4Signer.<clinit>(AWS4Signer.java:44)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> at java.lang.Class.newInstance(Class.java:379)
>> at com.amazonaws.auth.SignerFactory.createSigner(SignerFactory.java:119)
>> at
>> com.amazonaws.auth.SignerFactory.lookupAndCreateSigner(SignerFactory.java:105)
>> at com.amazonaws.auth.SignerFactory.getSigner(SignerFactory.java:78)
>> at
>> com.amazonaws.AmazonWebServiceClient.computeSignerByServiceRegion(AmazonWebServiceClient.java:307)
>> at
>> com.amazonaws.AmazonWebServiceClient.computeSignerByURI(AmazonWebServiceClient.java:280)
>> at
>> com.amazonaws.AmazonWebServiceClient.setEndpoint(AmazonWebServiceClient.java:160)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.setEndpoint(AmazonKinesisClient.java:2102)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.init(AmazonKinesisClient.java:216)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:202)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:175)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:155)
>> at com.quickstatsengine.aws.AwsProvider$.<init>(AwsProvider.scala:20)
>> at com.quickstatsengine.aws.AwsProvider$.<clinit>(AwsProvider.scala)
>>
>> The snippet from my build.sbt file is:
>>
>>         "org.apache.spark" %% "spark-core" % "1.2.0" % "provided",
>>         "org.apache.spark" %% "spark-streaming" % "1.2.0" % "provided",
>>         "com.datastax.spark" %% "spark-cassandra-connector" %
>> "1.2.0-alpha1" % "provided",
>>         "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.2.0" %
>> "provided",
>>
>> And the error is originating from:
>>
>> val kinesisClient = new AmazonKinesisClient(new
>> DefaultAWSCredentialsProviderChain())
>>
>> Am I correct to set spark-streaming-kinesis-asl as a *provided* dependency?
>> Also, is there a merge strategy I need to apply?
>>
>> Any help would be appreciated, Mike.
>>
>>
>>
>
