[ https://issues.apache.org/jira/browse/PARQUET-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fokko Driesprong resolved PARQUET-1778.
---------------------------------------
Fix Version/s: 1.12.0
Resolution: Fixed
> Do Not Consider Class for Avro Generic Record Reader
> ----------------------------------------------------
>
> Key: PARQUET-1778
> URL: https://issues.apache.org/jira/browse/PARQUET-1778
> Project: Parquet
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.12.0
>
>
>
> {code:java|title=Example Code}
> final ParquetReader<GenericRecord> reader =
>     AvroParquetReader.<GenericRecord>builder(path).build();
> final GenericRecord genericRecord = reader.read();
> {code}
> It fails with...
> {code:none}
> java.lang.NoSuchMethodException: io.github.belugabehr.app.Record.<init>()
> at java.lang.Class.getConstructor0(Class.java:3082) ~[na:1.8.0_232]
> at java.lang.Class.getDeclaredConstructor(Class.java:2178) ~[na:1.8.0_232]
> at org.apache.avro.specific.SpecificData$1.computeValue(SpecificData.java:63) ~[avro-1.9.1.jar:1.9.1]
> at org.apache.avro.specific.SpecificData$1.computeValue(SpecificData.java:58) ~[avro-1.9.1.jar:1.9.1]
> at java.lang.ClassValue.getFromHashMap(ClassValue.java:227) ~[na:1.8.0_232]
> at java.lang.ClassValue.getFromBackup(ClassValue.java:209) ~[na:1.8.0_232]
> at java.lang.ClassValue.get(ClassValue.java:115) ~[na:1.8.0_232]
> at org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:470) ~[avro-1.9.1.jar:1.9.1]
> at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:491) ~[avro-1.9.1.jar:1.9.1]
> at org.apache.parquet.avro.AvroRecordConverter.start(AvroRecordConverter.java:404) ~[parquet-avro-1.11.0.jar:1.11.0]
> at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:392) ~[parquet-column-1.11.0.jar:1.11.0]
> at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226) ~[parquet-hadoop-1.11.0.jar:1.11.0]
> at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132) ~[parquet-hadoop-1.11.0.jar:1.11.0]
> at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136) ~[parquet-hadoop-1.11.0.jar:1.11.0]
> {code}
> I was surprised, because this should just load a {{GenericRecord}} view of the
> data. However, the {{namespace}} and {{name}} fields of my Avro schema resolve
> to {{io.github.belugabehr.app.Record}}, which happens to be a real class on the
> classpath. The reader therefore tries to call that class's no-argument
> constructor, which does not exist.
> Regardless, the {{GenericRecordReader}} should simply ignore the namespace
> information in the Avro schema.
> I am putting {{GenericRecords}} into the Parquet file, so I expect to get
> {{GenericRecords}} back out when I read it.
> If I hack the schema and change the {{namespace}} or {{name}} field to
> something bogus, it works as I would expect: the read succeeds and returns a
> {{GenericRecord}}.
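
The top frames of the stack trace come down to a reflective lookup of a no-argument constructor ({{Class.getDeclaredConstructor()}}) on a class that does not declare one. The following is a minimal sketch of that mechanism using only the JDK, not the Avro or Parquet code itself. The class names {{RecordWithoutDefaultCtor}} and {{RecordWithDefaultCtor}} and the helper {{hasNoArgConstructor}} are hypothetical stand-ins introduced for illustration.

```java
public class NoArgConstructorDemo {

    // Hypothetical stand-in for a class like io.github.belugabehr.app.Record:
    // it declares only a constructor that takes an argument.
    static class RecordWithoutDefaultCtor {
        final String id;

        RecordWithoutDefaultCtor(String id) {
            this.id = id;
        }
    }

    // Hypothetical counterpart that does declare a no-argument constructor.
    static class RecordWithDefaultCtor {
        RecordWithDefaultCtor() {
        }
    }

    // Returns true if the class declares a no-argument constructor.
    // getDeclaredConstructor() with no parameter types throws
    // NoSuchMethodException when no such constructor is declared, which is
    // the exception type reported at the top of the stack trace.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(RecordWithDefaultCtor.class));
        System.out.println(hasNoArgConstructor(RecordWithoutDefaultCtor.class));
    }
}
```

Running {{main}} prints {{true}} and then {{false}}. Any code path that instantiates a class this way, based only on a name match, will fail for classes of the second kind.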