Michael Heuer created SPARK-19697:
-------------------------------------

             Summary: NoSuchMethodError: org.apache.avro.Schema.getLogicalType()
                 Key: SPARK-19697
                 URL: https://issues.apache.org/jira/browse/SPARK-19697
             Project: Spark
          Issue Type: Bug
          Components: Build, Spark Core
    Affects Versions: 2.1.0
         Environment: {{
$ spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/
                        
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_60
Branch 
Compiled by user jenkins on 2016-12-16T02:04:48Z
Revision 
Url 
Type --help for more information.
}}


            Reporter: Michael Heuer


In a downstream project (https://github.com/bigdatagenomics/adam), adding a
dependency on {{parquet-avro}} version 1.8.2 results in {{NoSuchMethodError}}s
at runtime on various Spark versions, including 2.1.0.
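
The failing method, {{org.apache.avro.Schema.getLogicalType()}}, was only added in Avro 1.8.0, which parquet-avro 1.8.2 is compiled against, so the error suggests an older Avro (1.7.x) is winning on the runtime classpath. A quick way to confirm which Avro is actually loaded, e.g. from {{spark-shell}} (a minimal diagnostic sketch, not part of ADAM):

{{
// Diagnostic sketch: check where org.apache.avro.Schema was loaded from and
// whether it has getLogicalType(), which was added in Avro 1.8.0.
val schemaClass = classOf[org.apache.avro.Schema]

// Which jar provides Schema at runtime?
println(schemaClass.getProtectionDomain.getCodeSource.getLocation)

// Does that Avro have the method parquet-avro 1.8.2 expects?
try {
  schemaClass.getMethod("getLogicalType")
  println("getLogicalType() present: Avro >= 1.8.0 is on the classpath")
} catch {
  case _: NoSuchMethodException =>
    println("getLogicalType() missing: an older Avro (1.7.x) shadows 1.8.x")
}
}}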

pom.xml:
{{
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-avro</artifactId>
        <version>${parquet.version}</version>
      </dependency>
      <dependency>
        <groupId>org.apache.parquet</groupId>
        <!-- This library has no Scala 2.11 version, but using the 2.10 version seems to work. -->
        <artifactId>parquet-scala_2.10</artifactId>
        <version>${parquet.version}</version>
        <exclusions>
          <exclusion>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
          </exclusion>
        </exclusions>
      </dependency>
    </dependencies>
  </dependencyManagement>
}}

Example using {{spark-submit}} (called via {{adam-submit}} below):
{{
$ ./bin/adam-submit vcf2adam \
  adam-core/src/test/resources/small.vcf \
  small.adam
...
java.lang.NoSuchMethodError: org.apache.avro.Schema.getLogicalType()Lorg/apache/avro/LogicalType;
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:178)
        at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227)
        at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:152)
        at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130)
        at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227)
        at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124)
        at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:115)
        at org.apache.parquet.avro.AvroWriteSupport.init(AvroWriteSupport.java:117)
        at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:311)
        at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:283)
        at org.apache.spark.rdd.InstrumentedOutputFormat.getRecordWriter(InstrumentedOutputFormat.scala:35)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
}}
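
The same call path can likely be reproduced outside of Spark and ADAM with just parquet-avro 1.8.2 plus an Avro 1.7.x jar on the classpath; a minimal, hypothetical sketch (class names follow the trace above, the record schema is made up):

{{
import org.apache.avro.SchemaBuilder
import org.apache.parquet.avro.AvroSchemaConverter

object AvroParquetRepro {
  def main(args: Array[String]): Unit = {
    // Any record with an optional (union-typed) field walks
    // convertFields -> convertField -> convertUnion, as in the trace above.
    val schema = SchemaBuilder
      .record("Variant").namespace("org.example")
      .fields()
      .optionalString("contigName")
      .endRecord()

    // With parquet-avro 1.8.2 (compiled against Avro 1.8.x) but Avro 1.7.x
    // at runtime, this is expected to throw the NoSuchMethodError above.
    println(new AvroSchemaConverter().convert(schema))
  }
}
}}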

The issue can be reproduced from this pull request:
https://github.com/bigdatagenomics/adam/pull/1360

and is reported as Jenkins CI test failures:
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1810


