[ 
https://issues.apache.org/jira/browse/SPARK-27781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Heuer updated SPARK-27781:
----------------------------------
    Description: 
It appears that there is a conflict in avro dependency versions at runtime when 
using Spark 2.4.3 and Scala 2.12 (spark-2.4.3-bin-without-hadoop-scala-2.12 
binary distribution) and Hadoop 2.7.7.

 

Specifically, the Spark 2.4.3 binary distribution for Hadoop 2.7.x includes 
avro-1.8.2.jar:

{{$ find spark-2.4.3-bin-hadoop2.7 -name "*.jar" | grep avro}}

{{jars/avro-1.8.2.jar}}

{{jars/avro-mapred-1.8.2-hadoop2.jar}}

{{jars/avro-ipc-1.8.2.jar}}

 

Whereas the Spark 2.4.3 binary distribution for Scala 2.12 without Hadoop does 
not; it ships only avro-mapred-1.8.2-hadoop2.jar:

{{$ find spark-2.4.3-bin-without-hadoop-scala-2.12 -name "*.jar" | grep avro}}

{{jars/avro-mapred-1.8.2-hadoop2.jar}}

 

Adding Hadoop 2.7.7 to the classpath brings in avro-1.7.4.jar, which 
conflicts at runtime with the Avro 1.8.2 that Spark otherwise bundles:

{{$ find hadoop-2.7.7 -name "*.jar" | grep avro}}

{{share/hadoop/mapreduce/lib/avro-1.7.4.jar}}

{{share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/avro-1.7.4.jar}}

{{share/hadoop/tools/lib/avro-1.7.4.jar}}

{{share/hadoop/common/lib/avro-1.7.4.jar}}

{{hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/avro-1.7.4.jar}}
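
The manual find/grep checks above could also be scripted. The following is only a minimal sketch in Python, not part of Spark or Hadoop: the function names and the jar filename pattern are assumptions made for illustration. It groups core avro jars by version across several distribution directories and reports whether more than one version is present.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention: core Avro jars look like avro-1.8.2.jar.
# Jars such as avro-mapred-* or avro-ipc-* are deliberately not matched.
AVRO_CORE_JAR = re.compile(r"^avro-(\d+(?:\.\d+)*)\.jar$")


def find_avro_versions(roots):
    """Map each core Avro version to the jar paths that provide it."""
    versions = defaultdict(list)
    for root in roots:
        # Walk each distribution directory recursively, like `find -name "*.jar"`.
        for jar in sorted(Path(root).rglob("*.jar")):
            match = AVRO_CORE_JAR.match(jar.name)
            if match:
                versions[match.group(1)].append(str(jar))
    return dict(versions)


def has_conflict(versions):
    """Return True when more than one core Avro version was found."""
    return len(versions) > 1
```

Calling find_avro_versions with the unpacked Spark and Hadoop directories and passing the result to has_conflict would flag a mix such as 1.8.2 and 1.7.4. It only inspects file names, so it says nothing about which jar wins on the actual classpath.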

 

Issue filed downstream in

[https://github.com/bigdatagenomics/adam/issues/2151]

 

Attached a smaller reproducing test case.

  was:
It appears that there is a conflict in avro dependency versions at runtime when 
using Spark 2.4.3 and Scala 2.12 (spark-2.4.3-bin-without-hadoop-scala-2.12 
binary distribution) and Hadoop 2.7.7.

 

Specifically, the Spark 2.4.3 binary distribution for Hadoop 2.7.x includes 
avro-1.8.2.jar 

{{$ find spark-2.4.3-bin-hadoop2.7 *.jar | grep avro}}

{{jars/avro-1.8.2.jar}}

{{jars/avro-mapred-1.8.2-hadoop2.jar}}

{{jars/avro-ipc-1.8.2.jar}}

 

Whereas the Spark 2.4.3 binary distribution for Scala 2.12 without Hadoop does 
not

{{$ find spark-2.4.3-bin-without-hadoop-scala-2.12 *.jar | grep avro}}

{{jars/avro-mapred-1.8.2-hadoop2.jar}}

 

Including Hadoop 2.7.7 onto the classpath brings in avro-1.7.4.jar, which 
conflicts at runtime

{{$ find hadoop-2.7.7 -name *.jar | grep avro}}

{{share/hadoop/mapreduce/lib/avro-1.7.4.jar}}

{{share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/avro-1.7.4.jar}}

{{share/hadoop/tools/lib/avro-1.7.4.jar}}

{{share/hadoop/common/lib/avro-1.7.4.jar}}

{{hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/avro-1.7.4.jar}}

 

Issue filed downstream in

[https://github.com/bigdatagenomics/adam/issues/2151]

 

Will try to create a smaller reproducing test case.


> Tried to access method org.apache.avro.specific.SpecificData.<init>()V
> ----------------------------------------------------------------------
>
>                 Key: SPARK-27781
>                 URL: https://issues.apache.org/jira/browse/SPARK-27781
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.3
>            Reporter: Michael Heuer
>            Priority: Major
>         Attachments: reproduce.sh
>
>
> It appears that there is a conflict in avro dependency versions at runtime 
> when using Spark 2.4.3 and Scala 2.12 
> (spark-2.4.3-bin-without-hadoop-scala-2.12 binary distribution) and Hadoop 
> 2.7.7.
>  
> Specifically, the Spark 2.4.3 binary distribution for Hadoop 2.7.x includes 
> avro-1.8.2.jar:
> {{$ find spark-2.4.3-bin-hadoop2.7 -name "*.jar" | grep avro}}
> {{jars/avro-1.8.2.jar}}
> {{jars/avro-mapred-1.8.2-hadoop2.jar}}
> {{jars/avro-ipc-1.8.2.jar}}
>  
> Whereas the Spark 2.4.3 binary distribution for Scala 2.12 without Hadoop 
> does not; it ships only avro-mapred-1.8.2-hadoop2.jar:
> {{$ find spark-2.4.3-bin-without-hadoop-scala-2.12 -name "*.jar" | grep avro}}
> {{jars/avro-mapred-1.8.2-hadoop2.jar}}
>  
> Adding Hadoop 2.7.7 to the classpath brings in avro-1.7.4.jar, which 
> conflicts at runtime with the Avro 1.8.2 that Spark otherwise bundles:
> {{$ find hadoop-2.7.7 -name "*.jar" | grep avro}}
> {{share/hadoop/mapreduce/lib/avro-1.7.4.jar}}
> {{share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/avro-1.7.4.jar}}
> {{share/hadoop/tools/lib/avro-1.7.4.jar}}
> {{share/hadoop/common/lib/avro-1.7.4.jar}}
> {{hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/avro-1.7.4.jar}}
>  
> Issue filed downstream in
> [https://github.com/bigdatagenomics/adam/issues/2151]
>  
> Attached a smaller reproducing test case.



