Anant Damle created BEAM-11460:
----------------------------------

             Summary: Support reading Parquet files with unknown schema
                 Key: BEAM-11460
                 URL: https://issues.apache.org/jira/browse/BEAM-11460
             Project: Beam
          Issue Type: New Feature
          Components: io-java-parquet
            Reporter: Anant Damle
             Fix For: 2.27.0


Reading Parquet files using ParquetIO requires providing an (equivalent) Avro 
schema. Often it is not possible to know the schema of the Parquet files in advance.

AvroIO supports reading files with an unknown schema using a parse function 
(https://beam.apache.org/releases/javadoc/2.26.0/org/apache/beam/sdk/io/AvroIO.html):

{{#parseGenericRecords(SerializableFunction<GenericRecord,T>)}}
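
To illustrate the parse-function pattern referred to above, here is a minimal, library-free sketch. It does not use Beam, Avro, or Parquet: {{Map<String, Object>}} is a hypothetical stand-in for {{GenericRecord}}, and {{ParseFnSketch}}, {{readAll}} and {{sampleRecords}} are names invented for this example. The point is only that the reader never needs a schema up front, because the caller's function decides how each self-describing record becomes a {{T}}.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a stand-in for a schema-agnostic reader, not Beam's API.
class ParseFnSketch {

    // Applies the caller-supplied parse function to every record.
    // The reader itself has no knowledge of the records' fields.
    static <T> List<T> readAll(List<Map<String, Object>> records,
                               Function<Map<String, Object>, T> parseFn) {
        List<T> out = new ArrayList<>();
        for (Map<String, Object> record : records) {
            out.add(parseFn.apply(record));
        }
        return out;
    }

    // Two records with different fields, mimicking files whose schema
    // is not known in advance (assumed sample data).
    static List<Map<String, Object>> sampleRecords() {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);
        first.put("name", "a");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2);
        second.put("score", 0.5);

        List<Map<String, Object>> records = new ArrayList<>();
        records.add(first);
        records.add(second);
        return records;
    }

    public static void main(String[] args) {
        // The parse function here just lists each record's field names.
        List<String> fieldNames =
            readAll(sampleRecords(), r -> String.join(",", r.keySet()));
        fieldNames.forEach(System.out::println);
    }
}
```

In the real AvroIO method the function is a {{SerializableFunction<GenericRecord,T>}}, as in the signature above; the request is for ParquetIO to accept a function of the same shape.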

Supporting this is simple and requires minimal changes to the ParquetIO 
surface.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
