[
https://issues.apache.org/jira/browse/SPARK-19697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Heuer updated SPARK-19697:
----------------------------------
Description:
In a downstream project (https://github.com/bigdatagenomics/adam), adding a
dependency on `parquet-avro` version 1.8.2 results in `NoSuchMethodError`s
at runtime on various Spark versions, including 2.1.0.
pom.xml:
{code:xml}
<properties>
  <java.version>1.8</java.version>
  <avro.version>1.8.1</avro.version>
  <scala.version>2.11.8</scala.version>
  <scala.version.prefix>2.11</scala.version.prefix>
  <spark.version>2.1.0</spark.version>
  <parquet.version>1.8.2</parquet.version>
  <!-- ... -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.parquet</groupId>
      <artifactId>parquet-avro</artifactId>
      <version>${parquet.version}</version>
    </dependency>
{code}
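The failure looks like a runtime Avro version conflict: `Schema.getLogicalType()` was added in Avro 1.8, while the Avro that Spark 2.1.0 itself appears to put on the runtime classpath is older (1.7.x), so `parquet-avro` 1.8.2 links against a method that is not there when the job runs. As a hypothetical, ADAM-independent sketch (the object and field names below are invented for illustration), any schema conversion through `AvroSchemaConverter` should hit the same path when submitted against Spark 2.1.0:
{code}
import org.apache.avro.SchemaBuilder
import org.apache.hadoop.conf.Configuration
import org.apache.parquet.avro.AvroSchemaConverter

// Minimal sketch, not taken from ADAM: converting any Avro record schema through
// parquet-avro 1.8.2 goes through AvroSchemaConverter.convertField, which calls
// Schema.getLogicalType() (Avro 1.8+). If an Avro < 1.8 jar wins on the runtime
// classpath, this should reproduce the NoSuchMethodError below.
object AvroSchemaConverterSketch {
  def main(args: Array[String]): Unit = {
    val schema = SchemaBuilder.record("Example").fields()
      .optionalString("name") // a nullable union field, to exercise convertUnion as well
      .endRecord()
    // Expected to throw java.lang.NoSuchMethodError when an older Avro is loaded.
    println(new AvroSchemaConverter(new Configuration()).convert(schema))
  }
}
{code}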
Example using `spark-submit` (called via `adam-submit` below):
{code}
$ ./bin/adam-submit vcf2adam \
adam-core/src/test/resources/small.vcf \
small.adam
...
java.lang.NoSuchMethodError: org.apache.avro.Schema.getLogicalType()Lorg/apache/avro/LogicalType;
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:178)
    at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227)
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:152)
    at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227)
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124)
    at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:115)
    at org.apache.parquet.avro.AvroWriteSupport.init(AvroWriteSupport.java:117)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:311)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:283)
    at org.apache.spark.rdd.InstrumentedOutputFormat.getRecordWriter(InstrumentedOutputFormat.scala:35)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
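One way to confirm that an older Avro is what actually gets loaded under `spark-submit` is a small reflective check run on the driver (a diagnostic sketch only; the object name is made up):
{code}
// Diagnostic sketch: print which jar provides org.apache.avro.Schema at runtime
// and whether the Avro 1.8+ method getLogicalType() is visible on it.
object AvroClasspathCheck {
  def main(args: Array[String]): Unit = {
    val schemaClass = classOf[org.apache.avro.Schema]
    // The jar that the running JVM actually loaded Schema from.
    println(schemaClass.getProtectionDomain.getCodeSource.getLocation)
    val hasGetLogicalType =
      try { schemaClass.getMethod("getLogicalType"); true }
      catch { case _: NoSuchMethodException => false }
    println(s"Schema.getLogicalType() present: $hasGetLogicalType")
  }
}
{code}
If the printed jar is Spark's bundled Avro 1.7.x, possible mitigations (not verified here) are shading/relocating Avro in the application jar, or submitting with `spark.driver.userClassPathFirst=true` / `spark.executor.userClassPathFirst=true`.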
The issue can be reproduced from this pull request:
https://github.com/bigdatagenomics/adam/pull/1360
and shows up as Jenkins CI test failures:
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1810
was:
In a downstream project (https://github.com/bigdatagenomics/adam), adding a
dependency on `parquet-avro` version 1.8.2 results in `NoSuchMethodError`s
at runtime on various Spark versions, including 2.1.0.
pom.xml:
{code:xml}
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.parquet</groupId>
      <artifactId>parquet-avro</artifactId>
      <version>${parquet.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.parquet</groupId>
      <!-- This library has no Scala 2.11 version, but using the 2.10 version seems to work. -->
      <artifactId>parquet-scala_2.10</artifactId>
      <version>${parquet.version}</version>
      <exclusions>
        <exclusion>
          <groupId>org.scala-lang</groupId>
          <artifactId>scala-library</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
{code}
Example using `spark-submit` (called via `adam-submit` below):
{code}
$ ./bin/adam-submit vcf2adam \
adam-core/src/test/resources/small.vcf \
small.adam
...
java.lang.NoSuchMethodError: org.apache.avro.Schema.getLogicalType()Lorg/apache/avro/LogicalType;
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:178)
    at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227)
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:152)
    at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130)
    at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227)
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124)
    at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:115)
    at org.apache.parquet.avro.AvroWriteSupport.init(AvroWriteSupport.java:117)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:311)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:283)
    at org.apache.spark.rdd.InstrumentedOutputFormat.getRecordWriter(InstrumentedOutputFormat.scala:35)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
The issue can be reproduced from this pull request:
https://github.com/bigdatagenomics/adam/pull/1360
and shows up as Jenkins CI test failures:
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1810
> NoSuchMethodError: org.apache.avro.Schema.getLogicalType()
> ----------------------------------------------------------
>
> Key: SPARK-19697
> URL: https://issues.apache.org/jira/browse/SPARK-19697
> Project: Spark
> Issue Type: Bug
> Components: Build, Spark Core
> Affects Versions: 2.1.0
> Environment: Apache Spark 2.1.0, Scala version 2.11.8, Java
> HotSpot(TM) 64-Bit Server VM, 1.8.0_60
> Reporter: Michael Heuer
>