Ismaël Mejía created BEAM-11908:
-----------------------------------
Summary: Deprecate .withProjection from ParquetIO
Key: BEAM-11908
URL: https://issues.apache.org/jira/browse/BEAM-11908
Project: Beam
Issue Type: Improvement
Components: io-java-parquet
Reporter: Ismaël Mejía
Assignee: Ismaël Mejía
There are multiple issues with the API of withProjection:
1. The current API requires an extra encoderSchema that is not needed when
projecting data in Parquet. The simplest way to project with the Parquet API
is to pass the projectionSchema as both the Avro read schema and the requested
projection, like this:
{quote}AvroReadSupport.setAvroReadSchema(conf, projectionSchema);
AvroReadSupport.setRequestedProjection(conf, projectionSchema);
{quote}
We could offer an alternative method `withProjection(Configuration conf,
List<String> fields)` so users don't have to build their own projection Schema,
but historically we have let users rely on the upstream connector API. If we
follow that approach, we can better document in ParquetIO how to project fields
by relying on the Parquet APIs and avoid maintaining this extra code on the Beam
side.
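
As a rough illustration of the approach described above, the sketch below builds a projection Schema from a list of field names and registers it on a Hadoop Configuration through the two AvroReadSupport calls quoted earlier. The class name ProjectionSketch, the helpers projectFields and configureProjection, and the example "User" schema are hypothetical names chosen for this sketch; none of them exist in Beam or Parquet. It assumes avro, hadoop-common and parquet-avro are on the classpath, handles only top-level fields, and does not show how the resulting Configuration would be handed to ParquetIO.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.avro.AvroReadSupport;

public class ProjectionSketch {

  // Hypothetical helper (not part of Beam or Parquet): copies the named
  // top-level fields of fullSchema into a new record schema that keeps the
  // original record name and namespace.
  static Schema projectFields(Schema fullSchema, List<String> fieldNames) {
    List<Schema.Field> projected = new ArrayList<>();
    for (String name : fieldNames) {
      Schema.Field field = fullSchema.getField(name);
      if (field == null) {
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      // A Field instance belongs to exactly one record schema, so copy it.
      projected.add(
          new Schema.Field(field.name(), field.schema(), field.doc(), field.defaultVal()));
    }
    return Schema.createRecord(
        fullSchema.getName(),
        fullSchema.getDoc(),
        fullSchema.getNamespace(),
        fullSchema.isError(),
        projected);
  }

  // Hypothetical helper: uses the same schema as both the Avro read schema
  // and the requested projection, as suggested in the issue text.
  static Configuration configureProjection(Configuration conf, Schema projectionSchema) {
    AvroReadSupport.setAvroReadSchema(conf, projectionSchema);
    AvroReadSupport.setRequestedProjection(conf, projectionSchema);
    return conf;
  }

  public static void main(String[] args) {
    // Example schema invented for this sketch.
    Schema full =
        SchemaBuilder.record("User")
            .namespace("example")
            .fields()
            .requiredString("name")
            .requiredInt("age")
            .requiredString("email")
            .endRecord();

    Schema projection = projectFields(full, Arrays.asList("name", "age"));
    // new Configuration(false) skips loading the default Hadoop resources.
    Configuration conf = configureProjection(new Configuration(false), projection);

    System.out.println(projection.toString(true));
    System.out.println(conf.get(AvroReadSupport.AVRO_REQUESTED_PROJECTION));
  }
}
```

With something like this on the user side, the only input needed for projection is the projectionSchema itself, which is the point made in item 1 about encoderSchema being unnecessary.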
--
This message was sent by Atlassian Jira
(v8.3.4#803005)