[
https://issues.apache.org/jira/browse/SPARK-27781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886237#comment-16886237
]
Michael Heuer commented on SPARK-27781:
---------------------------------------
I believe I saw a fix for this specific issue, where the avro jars are now
added to the Spark binary distribution without Hadoop. Will look for the pull
request.

I cannot let Spark off the hook as easily as you suggest, though: Spark is the
project that brings these dependencies together, both as compile-time
dependencies and on the runtime classpath. Spark needs to ensure those
dependencies are compatible with each other.
> Tried to access method org.apache.avro.specific.SpecificData.<init>()V
> ----------------------------------------------------------------------
>
> Key: SPARK-27781
> URL: https://issues.apache.org/jira/browse/SPARK-27781
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.3
> Reporter: Michael Heuer
> Priority: Major
> Attachments: reproduce.sh
>
>
> It appears that there is a conflict in avro dependency versions at runtime
> when using Spark 2.4.3 and Scala 2.12
> (spark-2.4.3-bin-without-hadoop-scala-2.12 binary distribution) and Hadoop
> 2.7.7.
>
> Specifically, the Spark 2.4.3 binary distribution for Hadoop 2.7.x includes
> avro-1.8.2.jar:
> {{$ find spark-2.4.3-bin-hadoop2.7 -name '*.jar' | grep avro}}
> {{jars/avro-1.8.2.jar}}
> {{jars/avro-mapred-1.8.2-hadoop2.jar}}
> {{jars/avro-ipc-1.8.2.jar}}
>
> Whereas the Spark 2.4.3 binary distribution for Scala 2.12 without Hadoop
> does not:
> {{$ find spark-2.4.3-bin-without-hadoop-scala-2.12 -name '*.jar' | grep avro}}
> {{jars/avro-mapred-1.8.2-hadoop2.jar}}
>
> Including Hadoop 2.7.7 onto the classpath brings in avro-1.7.4.jar, which
> conflicts at runtime:
> {{$ find hadoop-2.7.7 -name '*.jar' | grep avro}}
> {{share/hadoop/mapreduce/lib/avro-1.7.4.jar}}
> {{share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/avro-1.7.4.jar}}
> {{share/hadoop/tools/lib/avro-1.7.4.jar}}
> {{share/hadoop/common/lib/avro-1.7.4.jar}}
> {{hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/avro-1.7.4.jar}}
>
> Issue filed downstream in
> [https://github.com/bigdatagenomics/adam/issues/2151]
>
> Attached a smaller reproducing test case.
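
For what it is worth, the mismatch described in the quoted issue can be
checked for mechanically. The following is only a rough sketch, not part of
Spark or of the attached reproduce.sh: the helper names (parse_jar_name,
family_versions) are hypothetical, and it assumes jar file names follow the
usual artifact-version[-classifier].jar pattern and that all avro* jars on a
classpath are expected to share one version.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming pattern: <artifact>-<dotted version>[-<classifier>].jar,
# e.g. "avro-1.8.2.jar" or "avro-mapred-1.8.2-hadoop2.jar".
JAR_NAME = re.compile(
    r"^(?P<artifact>.+?)-(?P<version>\d+(?:\.\d+)+)(?:-[^.]+)?\.jar$"
)


def parse_jar_name(filename):
    """Return (artifact, version) for a jar file name, or None if it does not match."""
    match = JAR_NAME.match(filename)
    if match is None:
        return None
    return match.group("artifact"), match.group("version")


def family_versions(jar_paths, family="avro"):
    """Map each version to the sorted artifacts of the given family found at that version."""
    by_version = defaultdict(set)
    for path in jar_paths:
        parsed = parse_jar_name(Path(path).name)
        if parsed is None:
            continue
        artifact, version = parsed
        # Keep "avro" itself and sibling artifacts such as "avro-mapred".
        if artifact == family or artifact.startswith(family + "-"):
            by_version[version].add(artifact)
    return {version: sorted(names) for version, names in by_version.items()}


# Jar names taken from the listings in the issue description: the
# without-hadoop Spark distribution plus the Hadoop 2.7.7 common lib.
classpath = [
    "jars/avro-mapred-1.8.2-hadoop2.jar",
    "share/hadoop/common/lib/avro-1.7.4.jar",
]
found = family_versions(classpath)
if len(found) > 1:
    print("mixed avro versions on classpath:", found)
```

More than one key in the result means that avro artifacts of different
versions would end up on the same classpath, which is the situation reported
here (avro-mapred 1.8.2 next to avro 1.7.4).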