[
https://issues.apache.org/jira/browse/HADOOP-18670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kiran N updated HADOOP-18670:
-----------------------------
Description:
The issue I'm going to describe happens with this distribution: Spark 3.3.2 (git
revision 5103e00c4c) built for Hadoop 3.3.2.
As per my understanding, from Hadoop 3 onwards there shouldn't be any conflict
between Hadoop's dependencies and a Spark application's dependencies. However,
I see a runtime failure in my Spark app because of exactly such a conflict.
Stack trace below:
{code:java}
Caused by: java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
	at org.apache.cassandra.config.Config.<init>(Config.java:102)
	at org.apache.cassandra.config.DatabaseDescriptor.clientInitialization(DatabaseDescriptor.java:288)
	at org.apache.cassandra.io.sstable.CQLSSTableWriter.<clinit>(CQLSSTableWriter.java:109)
	at com.<redacted>.spark.cassandra.bulkload.GameRecommendationsSSTWriter.init(GameRecommendationsSSTWriter.java:60)
	at com.<redacted>.spark.cassandra.bulkload.GameRecommendationsSSTWriter.<init>(GameRecommendationsSSTWriter.java:23)
	at com.<redacted>.spark.cassandra.bulkload.CassandraBulkLoad.execute(CassandraBulkLoad.java:93)
	at com.<redacted>.spark.cassandra.bulkload.CassandraBulkLoad.main(CassandraBulkLoad.java:60)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:740)
{code}
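For anyone reproducing this, a small reflection probe can confirm what the Guava actually loaded at runtime exposes. This is only a sketch: the class and method names below are placeholders demonstrating the idea against a plain JDK class, since {{com.google.common.collect.Sets}} isn't on a bare JDK classpath; on a Spark driver you would probe the real Guava class as shown in the comment.

```java
import java.lang.reflect.Method;
import java.util.Collections;

public class MethodProbe {
    // Returns true if the class has a public zero-arg method with this name
    // (declared or inherited, including static methods).
    static boolean hasZeroArgMethod(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // On a Spark driver/executor you would probe the real class, e.g.:
        //   Class<?> sets = Class.forName("com.google.common.collect.Sets");
        //   System.out.println(hasZeroArgMethod(sets, "newConcurrentHashSet"));
        //   System.out.println(sets.getProtectionDomain().getCodeSource().getLocation());
        // Here we demonstrate against a JDK class instead.
        System.out.println(hasZeroArgMethod(Collections.class, "emptyList"));   // prints true
        System.out.println(hasZeroArgMethod(Collections.class, "noSuchThing")); // prints false
    }
}
```

Printing {{getCodeSource().getLocation()}} in addition tells you which jar the class was resolved from, which pins the blame on a specific file under the Spark jars directory.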
My Spark app has a transitive dependency on the Guava library: it depends on
cassandra-all, which in turn depends on Guava. The guava-14.0.1 jar that ships
in the "spark-3.3.2-bin-hadoop3/jars" directory is a decade old and doesn't
have the {{Sets.newConcurrentHashSet()}} method (added in Guava 15.0). I'm able
to run the Spark app successfully by deleting that old Guava jar from the jars
directory and including a recent Guava version in my project's pom.xml.
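A workaround that avoids deleting jars from the Spark distribution is to relocate Guava inside the application jar, so the app's (and cassandra-all's) Guava calls can never resolve against the guava-14.0.1 on the Spark classpath. A sketch with maven-shade-plugin follows; the plugin version and the {{myapp.shaded}} package prefix are illustrative, not taken from this issue.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.4.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Rewrite the app's Guava references (including those made by
                 cassandra-all) to a private package, bundling the newer Guava
                 under that name inside the fat jar. -->
            <pattern>com.google.common</pattern>
            <shadedPattern>myapp.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```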
> Spark application's dependency conflicts with Hadoop's dependency
> -----------------------------------------------------------------
>
> Key: HADOOP-18670
> URL: https://issues.apache.org/jira/browse/HADOOP-18670
> Project: Hadoop Common
> Issue Type: Bug
> Components: common
> Affects Versions: 3.3.2
> Reporter: Kiran N
> Priority: Blocker
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]