Neville Li created BEAM-7925:
--------------------------------

             Summary: ParquetIO supports neither column projection nor filter 
predicate
                 Key: BEAM-7925
                 URL: https://issues.apache.org/jira/browse/BEAM-7925
             Project: Beam
          Issue Type: Improvement
          Components: io-java-parquet
    Affects Versions: 2.14.0
            Reporter: Neville Li


The current `ParquetIO` supports neither column projection nor filter 
predicates, which defeats the performance motivation for using Parquet in the 
first place. That's why we have our own implementation of 
[ParquetIO|https://github.com/spotify/scio/tree/master/scio-parquet/src] in 
Scio.

Reading Parquet as Avro with column projection has some complications: namely, 
the resulting Avro records may be incomplete and will not survive ser/de. A 
workaround may be to provide a {{TypedRead}} interface that takes a 
{{Function<A, B>}} that maps the invalid Avro {{A}} into a user-defined type 
{{B}}.
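
To make the proposed workaround concrete, here is a rough sketch in plain Java. It is not Beam or Scio code: {{typedRead}}, {{UserName}}, and the use of a {{Map<String, Object>}} as a stand-in for a projected Avro record are all assumptions made for illustration. The idea it tries to show is that the partial record {{A}} is handed to the user's function right away, so only the complete type {{B}} would ever need to be serialized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only; none of these names are Beam or Scio APIs.
class TypedReadSketch {

    // User-defined type B: holds only the projected columns, so it is always complete.
    static final class UserName {
        final long id;
        final String name;

        UserName(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Stand-in for a projected read. Each row is cut down to the requested
    // columns (the partial record A) and mapped to B immediately, so the
    // partial record never leaves this method.
    static <B> List<B> typedRead(
            List<Map<String, Object>> rows,
            List<String> projection,
            Function<Map<String, Object>, B> parseFn) {
        List<B> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> partial = new HashMap<>();
            for (String column : projection) {
                partial.put(column, row.get(column));
            }
            out.add(parseFn.apply(partial));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", 1L);
        row.put("name", "alice");
        row.put("email", "alice@example.com"); // not projected, never read

        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);

        List<UserName> users = typedRead(
                rows,
                Arrays.asList("id", "name"),
                r -> new UserName((Long) r.get("id"), (String) r.get("name")));

        for (UserName u : users) {
            System.out.println(u.id + " " + u.name); // prints "1 alice"
        }
    }
}
```

In a real {{ParquetIO}} the function would presumably be applied inside the reader, before any coder sees the element, and a coder for {{B}} would be supplied by the user; both points are assumptions here, not something this issue specifies.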



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
