[
https://issues.apache.org/jira/browse/BEAM-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anonymous updated BEAM-8953:
----------------------------
Status: Triage Needed (was: Resolved)
> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model
> -------------------------------------------------------------------------
>
> Key: BEAM-8953
> URL: https://issues.apache.org/jira/browse/BEAM-8953
> Project: Beam
> Issue Type: Improvement
> Components: io-java-parquet
> Reporter: Ryan Berti
> Assignee: Ryan Berti
> Priority: P3
> Fix For: 2.19.0
>
> Time Spent: 6h 20m
> Remaining Estimate: 0h
>
> When utilizing ParquetIO to deserialize objects into case classes in Scala,
> we'd like to utilize a downstream converter which takes GenericRecords and
> converts them to instances of our case classes, rather than relying on
> ParquetIO to deserialize into the case class via reflection + implementing
> the IndexedRecord interface.
> The ParquetIO.Read and ParquetIO.ReadFiles Builders currently accept a
> filepattern plus a schema, and a schema alone, respectively. When using the Read /
> ReadFiles Builders with these arguments, the underlying AvroParquetReader
> object that gets created in the ParquetIO.ReadFiles.ReadFn DoFn defaults to
> utilizing an AvroReadSupport instance whose GenericData model gets set to
> SpecificData. We'd like to have the underlying AvroReadSupport utilize
> the GenericData model, but there's currently no way to force this to happen
> via the existing ParquetIO Read / ReadFiles builders.
> I'd like to extend the ParquetIO Read / ReadFiles builders to support a new
> method allowing users to define a GenericData model, which will then be
> passed into the AvroParquetReader builder. I've tested and validated that
> this method allows ParquetIO to generate GenericRecord instances without
> requiring that the users' classes can be reflectively instantiated and
> initialized via the IndexedRecord interface.
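
For illustration, the downstream converter described above might look roughly like the sketch below. It is not taken from the issue or from Beam. To stay self-contained it uses a plain Map<String, Object> as a stand-in for Avro's GenericRecord, and the User type with its "name" and "age" fields is a hypothetical placeholder for a Scala case class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GenericRecordConverterSketch {

    // Hypothetical target type, standing in for a Scala case class such as
    // `case class User(name: String, age: Int)`. It needs no no-arg constructor
    // and does not implement IndexedRecord.
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public String toString() {
            return "User(" + name + "," + age + ")";
        }
    }

    // Assumption: the Map plays the role of a GenericRecord, i.e. a lookup from
    // field name to value. With Avro itself this would be record.get("name").
    static User toUser(Map<String, Object> genericRecord) {
        Object name = genericRecord.get("name");
        Object age = genericRecord.get("age");
        if (name == null || age == null) {
            throw new IllegalArgumentException("record is missing a required field: " + genericRecord);
        }
        // Avro's generic model returns strings as CharSequence (typically Utf8),
        // so convert with toString() instead of casting to String.
        return new User(name.toString(), ((Number) age).intValue());
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("age", 36);
        System.out.println(toUser(row));
    }
}
```

In a real pipeline a function of this shape would run in a step after the ParquetIO read, once the reader emits plain GenericRecord instances. In Beam releases that include this change, the builder method for selecting the model appears to be withAvroDataModel(GenericData.get()); that name is not stated in this issue, so check it against the ParquetIO Javadoc.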
--
This message was sent by Atlassian Jira
(v8.20.10#820010)