[ 
https://issues.apache.org/jira/browse/PARQUET-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Blake Niskode-Dossett updated PARQUET-1917:
-------------------------------------------------
    Component/s: parquet-protobuf

> [parquet-proto] default values are stored in oneOf fields that aren't set
> -------------------------------------------------------------------------
>
>                 Key: PARQUET-1917
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1917
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-protobuf
>    Affects Versions: 1.12.0
>            Reporter: Aaron Blake Niskode-Dossett
>            Priority: Major
>
> SCHEMA
> --------
> {noformat}
> message Person {
>   int32 foo = 1;
>   oneof optional_bar {
>     int32 bar_int = 200;
>     int32 bar_int2 = 201;
>     string bar_string = 300;
>   }
> }{noformat}
>  
> CODE
> --------
> I set values for foo and bar_string only; bar_int and bar_int2 are never set.
>  
> {noformat}
> for (int i = 0; i < 3; i += 1) {
>     com.etsy.grpcparquet.Person message = Person.newBuilder()
>             .setFoo(i)
>             .setBarString("hello world")
>             .build();
>     message.writeDelimitedTo(out);
> }{noformat}
> I then write the protobuf file out to Parquet.
>  
> RESULT
> -----------
> {noformat}
> $ parquet-tools show example.parquet                                          
>                                                                               
> +-------+-----------+------------+--------------+
> |   foo |   bar_int |   bar_int2 | bar_string   |
> |-------+-----------+------------+--------------|
> |     0 |         0 |          0 | hello world  |
> |     1 |         0 |          0 | hello world  |
> |     2 |         0 |          0 | hello world  |
> +-------+-----------+------------+--------------+{noformat}
>  
> bar_int and bar_int2 should be EMPTY for all three rows, since only bar_string 
> is set in the oneof. 0 is the default value for int32, but it should not be 
> stored for oneof members that are not set.
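The expected behavior described in the issue can be illustrated with a small plain-Java sketch. It does not use parquet-protobuf or protobuf-java; the class `OneofRowSketch`, the nested `Person` model, the `BarCase` enum, and the `toRow` helper are all hypothetical stand-ins invented for this example. The idea it shows is only this: consult which oneof member is set, and emit null for the others instead of their type default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the parquet-protobuf implementation. */
public class OneofRowSketch {

    /** Stand-in for the "which member is set" enum that a oneof carries. */
    public enum BarCase { BAR_INT, BAR_INT2, BAR_STRING, NOT_SET }

    /** Minimal model of the Person message: one plain field plus one oneof. */
    public static final class Person {
        final int foo;
        final BarCase barCase;
        final Object barValue; // value of the member that is set, or null

        public Person(int foo, BarCase barCase, Object barValue) {
            this.foo = foo;
            this.barCase = barCase;
            this.barValue = barValue;
        }
    }

    /**
     * Builds one output row. Oneof members that are not set become null,
     * rather than a type default such as 0.
     */
    public static Map<String, Object> toRow(Person p) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("foo", p.foo);
        row.put("bar_int", p.barCase == BarCase.BAR_INT ? p.barValue : null);
        row.put("bar_int2", p.barCase == BarCase.BAR_INT2 ? p.barValue : null);
        row.put("bar_string", p.barCase == BarCase.BAR_STRING ? p.barValue : null);
        return row;
    }

    public static void main(String[] args) {
        // Same data as in the report: foo = 0..2, only bar_string set.
        for (int i = 0; i < 3; i += 1) {
            System.out.println(toRow(new Person(i, BarCase.BAR_STRING, "hello world")));
        }
    }
}
```

With the data from the report, every row in this sketch has null for bar_int and bar_int2 and "hello world" for bar_string, which matches the result the issue asks for.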



