Hello,

I have been poking around the Parquet-Proto package as well.

I would expect "bar_int" and "bar_int2" to be 'null' here.
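
To make that expectation concrete, here is a minimal sketch of the row mapping I have in mind. It deliberately does not use parquet-protobuf or the generated Person class: OneofRowSketch, BarCase and toRow are hypothetical names made up for illustration, and BarCase only stands in for the case enum that protobuf's Java codegen produces for a oneof (which I believe is reachable through getOptionalBarCase() here). The point is only that the one member that is set gets a value, and every other member of the oneof is null rather than its type default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the behaviour I would expect from the writer;
// this is not the actual parquet-protobuf implementation.
public class OneofRowSketch {

    // Mirrors the members of the "optional_bar" oneof, plus a "nothing set" case.
    enum BarCase { BAR_INT, BAR_INT2, BAR_STRING, NOT_SET }

    // Builds one output row: only the oneof member that is actually set
    // receives a value; all other members of the oneof are written as null.
    static Map<String, Object> toRow(int foo, BarCase barCase, Object barValue) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("foo", foo);
        row.put("bar_int", barCase == BarCase.BAR_INT ? barValue : null);
        row.put("bar_int2", barCase == BarCase.BAR_INT2 ? barValue : null);
        row.put("bar_string", barCase == BarCase.BAR_STRING ? barValue : null);
        return row;
    }

    public static void main(String[] args) {
        // Same loop as in the reproduction: foo and bar_string are set.
        for (int i = 0; i < 3; i += 1) {
            System.out.println(toRow(i, BarCase.BAR_STRING, "hello world"));
        }
    }
}
```

With the reproduction's input, bar_int and bar_int2 come out as null in every row, which is what I would expect to see in the Parquet file as well.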

Have you filed a JIRA with this reproduction?

Thanks.

On Fri, Sep 25, 2020 at 9:58 AM Aaron Niskode-Dossett
<[email protected]> wrote:

> Hello,
>
> I am experimenting with serializing protobuf3 to parquet and have a
> question about how "oneof" fields should be treated.  I will describe an
> example.  I'm running parquet 1.11.1 with PARQUET-1684 applied.  That JIRA
> is about how default values are written out, and seems related to my
> question.
>
> SCHEMA
> --------
> message Person {
>   int32 foo = 1;
>   oneof optional_bar {
>     int32 bar_int = 200;
>     int32 bar_int2 = 201;
>     string bar_string = 300;
>   }
> }
>
> CODE
> --------
> I set values for foo and bar_string
>
> for (int i = 0; i < 3; i += 1) {
>     com.etsy.grpcparquet.Person message = Person.newBuilder()
>             .setFoo(i)
>             .setBarString("hello world")
>             .build();
>     message.writeDelimitedTo(out);
> }
>
> And then I write the protobuf file out to parquet.
>
> RESULT
> -----------
> $ parquet-tools show example.parquet
>
>
> +-------+-----------+------------+--------------+
> |   foo |   bar_int |   bar_int2 | bar_string   |
> |-------+-----------+------------+--------------|
> |     0 |         0 |          0 | hello world  |
> |     1 |         0 |          0 | hello world  |
> |     2 |         0 |          0 | hello world  |
> +-------+-----------+------------+--------------+
>
> I would expect that bar_int and bar_int2 are EMPTY for all three rows since
> only bar_string is set in the oneof.
>
> Is this the right expectation for me to have?
>
> Thank you!
>
> --
> Aaron Niskode-Dossett, Data Engineering -- Etsy
>
