[
https://issues.apache.org/jira/browse/PARQUET-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509500#comment-17509500
]
Xinli Shang commented on PARQUET-1595:
--------------------------------------
Is the Int32Value -> int64 mapping a typo?
> Parquet proto writer de-nest Protobuf wrapper classes
> -----------------------------------------------------
>
> Key: PARQUET-1595
> URL: https://issues.apache.org/jira/browse/PARQUET-1595
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Ying Xu
> Priority: Major
>
> The existing Parquet protobuf writer preserves the structure of any
> Protobuf Message object. This works well in most cases. However, when
> dealing with [Protobuf wrapper
> messages|https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/wrappers.proto],
> users may prefer to write the de-nested value directly into the Parquet
> files, so that query engines such as Hive/Presto can query it directly.
> Proposal:
> * Add a flag, e.g., enableDenestingWrappers, that controls whether
> Protobuf wrapper classes are de-nested.
> * When this flag is set to true, write the Protobuf wrapper classes as
> single primitive fields, based on the type of the wrapped *value* field.
>
> ||Protobuf Type||Parquet Type||
> |BoolValue|boolean|
> |BytesValue|binary|
> |DoubleValue|double|
> |FloatValue|float|
> |Int32Value|int64 (32-bit, signed)|
> |Int64Value|int64 (64-bit, signed)|
> |StringValue|binary (string)|
> |UInt32Value|int64 (32-bit, unsigned)|
> |UInt64Value|int64 (64-bit, unsigned)|
>
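The flag and the type table in the proposal above could be sketched roughly as follows. This is an illustration only, not parquet-mr code: the class `WrapperDenestingSketch` and the method `denestedType` are hypothetical names, types are represented as plain strings rather than real Protobuf or Parquet classes, and the Parquet type strings are copied from the table as written (including the Int32Value row that the comment questions).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the proposed lookup; not part of parquet-mr.
public class WrapperDenestingSketch {

    // Wrapper message name -> Parquet type, copied verbatim from the issue's table.
    private static final Map<String, String> WRAPPER_TO_PARQUET = new LinkedHashMap<>();

    static {
        WRAPPER_TO_PARQUET.put("BoolValue", "boolean");
        WRAPPER_TO_PARQUET.put("BytesValue", "binary");
        WRAPPER_TO_PARQUET.put("DoubleValue", "double");
        WRAPPER_TO_PARQUET.put("FloatValue", "float");
        // This row is the one the comment asks about; kept as written in the issue.
        WRAPPER_TO_PARQUET.put("Int32Value", "int64 (32-bit, signed)");
        WRAPPER_TO_PARQUET.put("Int64Value", "int64 (64-bit, signed)");
        WRAPPER_TO_PARQUET.put("StringValue", "binary (string)");
        WRAPPER_TO_PARQUET.put("UInt32Value", "int64 (32-bit, unsigned)");
        WRAPPER_TO_PARQUET.put("UInt64Value", "int64 (64-bit, unsigned)");
    }

    /**
     * Returns the primitive Parquet type to write for a wrapper message,
     * or empty when the flag is off or the message is not a known wrapper
     * (in which case the nested structure would be preserved as today).
     */
    public static Optional<String> denestedType(String protoTypeName,
                                                boolean enableDenestingWrappers) {
        if (!enableDenestingWrappers) {
            return Optional.empty();
        }
        return Optional.ofNullable(WRAPPER_TO_PARQUET.get(protoTypeName));
    }

    public static void main(String[] args) {
        // Flag on: a wrapper collapses to a single primitive field.
        System.out.println(denestedType("StringValue", true).orElse("keep nested"));
        // Flag off: the existing nested structure is kept.
        System.out.println(denestedType("StringValue", false).orElse("keep nested"));
        // Non-wrapper messages are never de-nested.
        System.out.println(denestedType("MyMessage", true).orElse("keep nested"));
    }
}
```

Returning an empty result when the flag is off keeps the existing behaviour as the default, which matches the proposal's intent that de-nesting is opt-in.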