Hey - I’m struggling with some dependency issues with org.apache.httpcomponents 
httpcore and httpclient when using spark-submit with YARN running Spark 1.0.2 
on a Hadoop 2.2 cluster.  I’ve seen several posts about this issue, but no 
resolution.

The error message is this:


Caused by: java.lang.NoSuchMethodError: org.apache.http.impl.conn.DefaultClientConnectionOperator.<init>(Lorg/apache/http/conn/scheme/SchemeRegistry;Lorg/apache/http/conn/DnsResolver;)V
        at org.apache.http.impl.conn.PoolingClientConnectionManager.createConnectionOperator(PoolingClientConnectionManager.java:140)
        at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:114)
        at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:99)
        at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:85)
        at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:93)
        at com.amazonaws.http.ConnectionManagerFactory.createPoolingClientConnManager(ConnectionManagerFactory.java:26)
        at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:96)
        at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:155)
        at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:118)
        at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:102)
        at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:332)
        at com.oncue.rna.realtime.streaming.config.package$.transferManager(package.scala:76)
        at com.oncue.rna.realtime.streaming.models.S3SchemaRegistry.<init>(SchemaRegistry.scala:27)
        at com.oncue.rna.realtime.streaming.models.S3SchemaRegistry$.schemaRegistry$lzycompute(SchemaRegistry.scala:46)
        at com.oncue.rna.realtime.streaming.models.S3SchemaRegistry$.schemaRegistry(SchemaRegistry.scala:44)
        at com.oncue.rna.realtime.streaming.coders.KafkaAvroDecoder.<init>(KafkaAvroDecoder.scala:20)
        ... 17 more

The Apache HttpComponents libraries include the method above as of version 4.2, 
but the Spark 1.0.2 binaries seem to bundle version 4.1.
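For reference, the exclusions I'm talking about look roughly like this in an sbt build. The AWS SDK artifact name and the pinned httpcomponents versions here are assumptions about my build, not something prescribed by Spark; this is just a sketch of the pin-one-version approach:

```scala
// build.sbt sketch: exclude the httpcomponents jars the AWS SDK pulls in
// transitively, then pin explicit versions that have the two-argument
// DefaultClientConnectionOperator constructor (>= 4.2). Artifact name and
// version numbers are illustrative assumptions.
libraryDependencies ++= Seq(
  ("com.amazonaws" % "aws-java-sdk" % "1.7.4")
    .exclude("org.apache.httpcomponents", "httpclient")
    .exclude("org.apache.httpcomponents", "httpcore"),
  "org.apache.httpcomponents" % "httpclient" % "4.2.6",
  "org.apache.httpcomponents" % "httpcore"   % "4.2.5"
)
```

That keeps the driver's classpath consistent, but it doesn't help the executors, which is where the problem below comes in.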

I can get this to work in my driver program by adding exclusions to force use 
of 4.1, but the tasks still fail with this error even when I pass the jars 
with the --jars option of spark-submit.  How can I get both the driver program 
and the individual tasks in my Spark Streaming job to use the same version of 
this library so my job will run all the way through?
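For context, the submit command I'm using looks roughly like the following. The jar names and main class are placeholders, and the experimental spark.files.userClassPathFirst setting is something I've seen suggested for making executors prefer user jars over Spark's bundled ones; I'm not certain it fully works in 1.0.2:

```shell
# Sketch of the spark-submit invocation (names are hypothetical).
# spark.files.userClassPathFirst is an experimental flag that, if honored,
# makes executor class loading prefer the user-supplied jars.
spark-submit \
  --master yarn-cluster \
  --class com.example.StreamingJob \
  --conf spark.files.userClassPathFirst=true \
  --jars httpclient-4.2.6.jar,httpcore-4.2.5.jar \
  my-streaming-assembly.jar
```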

thanks
p
