David Mollitor created PARQUET-1778:
---------------------------------------

             Summary: Do Not Record Class for Avro Generic Record Reader
                 Key: PARQUET-1778
                 URL: https://issues.apache.org/jira/browse/PARQUET-1778
             Project: Parquet
          Issue Type: Improvement
            Reporter: David Mollitor


{code:java}
final ParquetReader<GenericRecord> reader =
    AvroParquetReader.<GenericRecord>builder(path).build();
final GenericRecord genericRecord = reader.read();
{code}

It fails with...

{code:none}
java.lang.NoSuchMethodException: io.github.belugabehr.app.Record.<init>()
        at java.lang.Class.getConstructor0(Class.java:3082) ~[na:1.8.0_232]
        at java.lang.Class.getDeclaredConstructor(Class.java:2178) ~[na:1.8.0_232]
        at org.apache.avro.specific.SpecificData$1.computeValue(SpecificData.java:63) ~[avro-1.9.1.jar:1.9.1]
        at org.apache.avro.specific.SpecificData$1.computeValue(SpecificData.java:58) ~[avro-1.9.1.jar:1.9.1]
        at java.lang.ClassValue.getFromHashMap(ClassValue.java:227) ~[na:1.8.0_232]
        at java.lang.ClassValue.getFromBackup(ClassValue.java:209) ~[na:1.8.0_232]
        at java.lang.ClassValue.get(ClassValue.java:115) ~[na:1.8.0_232]
{code}

I was surprised because it should just load a {{GenericRecord}} view of the 
data.  However, my Avro schema's {{namespace}} and {{name}} fields resolve to 
{{io.github.belugabehr.app.Record}}, which happens to be a real class on the 
classpath, so the reader tries to call a public no-argument constructor that 
the class does not have.
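
The top frames of the stack trace can be reproduced with the JDK alone. The following is only a minimal sketch of that reflective step, not the actual Avro or Parquet code path: {{AppRecord}} is a hypothetical stand-in for {{io.github.belugabehr.app.Record}}, and {{hasNoArgConstructor}} is a helper invented for illustration.

```java
import java.lang.reflect.Constructor;

public class ReflectiveLookupSketch {

    // Hypothetical stand-in for io.github.belugabehr.app.Record: a real class
    // on the classpath whose only constructor takes an argument.
    static final class AppRecord {
        private final String value;

        AppRecord(String value) {
            this.value = value;
        }
    }

    // Illustrative helper (not part of Avro or Parquet): reports whether a
    // class declares a no-argument constructor, using the same reflective
    // call that appears in the stack trace.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            return ctor != null;
        } catch (NoSuchMethodException e) {
            // This is the exception that surfaced from reader.read().
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(AppRecord.class));     // false
        System.out.println(hasNoArgConstructor(StringBuilder.class)); // true
    }
}
```

Any class that matches the schema's full name but lacks a no-argument constructor would presumably hit the same exception once something looks that constructor up reflectively.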

The {{GenericRecordReader}} should always ignore this Avro Schema namespace 
information.

I am putting {{GenericRecords}} into the Parquet file, so I expect to get 
{{GenericRecords}} back out when I read it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
