[
https://issues.apache.org/jira/browse/PARQUET-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Szadovszky reassigned PARQUET-1917:
-----------------------------------------
Assignee: Aaron Blake Niskode-Dossett
> [parquet-proto] default values are stored in oneOf fields that aren't set
> -------------------------------------------------------------------------
>
> Key: PARQUET-1917
> URL: https://issues.apache.org/jira/browse/PARQUET-1917
> Project: Parquet
> Issue Type: Bug
> Components: parquet-protobuf
> Affects Versions: 1.12.0
> Reporter: Aaron Blake Niskode-Dossett
> Assignee: Aaron Blake Niskode-Dossett
> Priority: Major
>
> SCHEMA
> --------
> {noformat}
> message Person {
>   int32 foo = 1;
>   oneof optional_bar {
>     int32 bar_int = 200;
>     int32 bar_int2 = 201;
>     string bar_string = 300;
>   }
> }{noformat}
>
> CODE
> --------
> I set values for foo and bar_string only; bar_int and bar_int2 are never set.
>
> {noformat}
> for (int i = 0; i < 3; i += 1) {
>   com.etsy.grpcparquet.Person message = Person.newBuilder()
>       .setFoo(i)
>       .setBarString("hello world")
>       .build();
>   message.writeDelimitedTo(out);
> }{noformat}
> I then convert the resulting protobuf file to Parquet.
>
> RESULT
> -----------
> {noformat}
> $ parquet-tools show example.parquet
>
> +-------+-----------+------------+--------------+
> | foo | bar_int | bar_int2 | bar_string |
> |-------+-----------+------------+--------------|
> | 0 | 0 | 0 | hello world |
> | 1 | 0 | 0 | hello world |
> | 2 | 0 | 0 | hello world |
> +-------+-----------+------------+--------------+{noformat}
>
> bar_int and bar_int2 should be EMPTY (null) in all three rows, since only
> bar_string is set in the oneof. 0 is the default value for int32, but the
> default of an unset oneof member should not be stored.
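
The expected behavior can be illustrated with a minimal, self-contained sketch. It does not use protobuf or parquet-mr and is not the project's actual code: the class `OneofRowSketch`, the enum `BarCase`, and the helper `toRow` are hypothetical names that stand in for the case accessor protobuf generates for a oneof. The point is only that a writer should consult which oneof member is set and emit null for the others, rather than their type defaults.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OneofRowSketch {
    // Hypothetical stand-in for the "which member is set" enum of a oneof.
    enum BarCase { BAR_INT, BAR_INT2, BAR_STRING, NOT_SET }

    // Builds one output row. Only the oneof member matching barCase gets a
    // value; every other member is stored as null, not as its default (0).
    static Map<String, Object> toRow(int foo, BarCase barCase, Object barValue) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("foo", foo);
        row.put("bar_int", barCase == BarCase.BAR_INT ? barValue : null);
        row.put("bar_int2", barCase == BarCase.BAR_INT2 ? barValue : null);
        row.put("bar_string", barCase == BarCase.BAR_STRING ? barValue : null);
        return row;
    }

    public static void main(String[] args) {
        // Mirrors the loop in the report: foo = i, bar_string = "hello world".
        for (int i = 0; i < 3; i += 1) {
            System.out.println(toRow(i, BarCase.BAR_STRING, "hello world"));
        }
    }
}
```

Under these assumptions, each row carries null for bar_int and bar_int2, which is what the table in RESULT should show in place of 0.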