[ https://issues.apache.org/jira/browse/HADOOP-18670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kiran N updated HADOOP-18670:
-----------------------------
    Description: 
The issue I'm going to describe happens with the distribution: Spark 3.3.2 (git 
revision 5103e00c4c) built for Hadoop 3.3.2

Based on [this|https://issues.apache.org/jira/browse/HADOOP-11804], my understanding is that from Hadoop 3 onward there should be no conflict between Hadoop's dependencies and a Spark application's dependencies. However, my Spark app fails at runtime because of exactly such a conflict. The stack trace is below:

{code}
Caused by: java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
    at org.apache.cassandra.config.Config.<init>(Config.java:102)
    at org.apache.cassandra.config.DatabaseDescriptor.clientInitialization(DatabaseDescriptor.java:288)
    at org.apache.cassandra.io.sstable.CQLSSTableWriter.<clinit>(CQLSSTableWriter.java:109)
    at com.<redacted>.spark.cassandra.bulkload.GameRecommendationsSSTWriter.init(GameRecommendationsSSTWriter.java:60)
    at com.<redacted>.spark.cassandra.bulkload.GameRecommendationsSSTWriter.<init>(GameRecommendationsSSTWriter.java:23)
    at com.<redacted>.spark.cassandra.bulkload.CassandraBulkLoad.execute(CassandraBulkLoad.java:93)
    at com.<redacted>.spark.cassandra.bulkload.CassandraBulkLoad.main(CassandraBulkLoad.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:740)
{code}
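A {{NoSuchMethodError}} like the one above means the JVM resolved {{com.google.common.collect.Sets}} from an older Guava than the one cassandra-all was compiled against. A small, hypothetical diagnostic (the class name is mine, not from the issue) can confirm which Guava actually wins on a given classpath by probing for the missing method via reflection:

```java
import java.lang.reflect.Method;

public class GuavaMethodCheck {
    public static void main(String[] args) {
        try {
            // Look up Guava's Sets class on whatever classpath we were launched with.
            Class<?> sets = Class.forName("com.google.common.collect.Sets");
            // Probe for the method the stack trace says is missing.
            Method m = sets.getMethod("newConcurrentHashSet");
            System.out.println("OK: " + m + " loaded from "
                    + sets.getProtectionDomain().getCodeSource().getLocation());
        } catch (ClassNotFoundException e) {
            System.out.println("Guava is not on the classpath");
        } catch (NoSuchMethodException e) {
            // This is the situation the issue describes: an old Guava
            // (pre-15.0) shadows the version the application needs.
            System.out.println("Old Guava on classpath: newConcurrentHashSet is missing");
        }
    }
}
```

Running this with the same classpath as the failing Spark job (e.g. via {{spark-submit}}) should print the location of the offending guava-14.0.1 jar.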

My Spark app depends on Guava transitively (it depends on the cassandra-all library, which in turn depends on Guava). The guava-14.0.1 jar that ships in spark-3.3.2-bin-hadoop3/jars is a decade old and doesn't have {{{}Sets.newConcurrentHashSet(){}}}. I was able to run the Spark app successfully by deleting that old Guava jar from the distribution's jars directory and including a recent version in my project's pom.xml.
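For reference, the application-side half of that workaround might look like the following pom.xml fragment. This is only a sketch: the version shown is one example of a release that includes {{{}Sets.newConcurrentHashSet(){}}} (the method first appeared in Guava 15.0), not a recommendation.

```xml
<!-- Hypothetical pom.xml fragment: declare a Guava new enough to
     provide Sets.newConcurrentHashSet(), which guava-14.0.1 lacks. -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>31.1-jre</version>
</dependency>
```

An alternative that avoids editing the Spark distribution is to shade and relocate Guava into the application's fat jar (e.g. with the maven-shade-plugin), so the app's copy can never collide with the one bundled under jars/.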



> Spark application's dependency conflicts with Hadoop's dependency
> -----------------------------------------------------------------
>
>                 Key: HADOOP-18670
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18670
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>    Affects Versions: 3.3.2
>            Reporter: Kiran N
>            Priority: Blocker
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
