Hello, I have been poking around the Parquet-Proto package as well.
I would expect "bar_int" and "bar_int2" to be 'null' here. Have you filed a JIRA with this reproduction?

Thanks.

On Fri, Sep 25, 2020 at 9:58 AM Aaron Niskode-Dossett <[email protected]> wrote:

> Hello,
>
> I am experimenting with serializing protobuf3 to parquet and have a
> question about how "oneOf" fields should be treated. I will describe an
> example. I'm running parquet 1.11.1 with PARQUET-1684 applied. That JIRA
> is about how default values are written out, and seems related to my
> question.
>
> SCHEMA
> --------
> message Person {
>   int32 foo = 1;
>   oneof optional_bar {
>     int32 bar_int = 200;
>     int32 bar_int2 = 201;
>     string bar_string = 300;
>   }
> }
>
> CODE
> --------
> I set values for foo and bar_string
>
> for (int i = 0; i < 3; i += 1) {
>   com.etsy.grpcparquet.Person message = Person.newBuilder()
>     .setFoo(i)
>     .setBarString("hello world")
>     .build();
>   message.writeDelimitedTo(out);
> }
> And then I write the protobuf file out to parquet.
>
> RESULT
> -----------
> $ parquet-tools show example.parquet
>
> +-------+-----------+------------+--------------+
> |   foo |   bar_int |   bar_int2 | bar_string   |
> |-------+-----------+------------+--------------|
> |     0 |         0 |          0 | hello world  |
> |     1 |         0 |          0 | hello world  |
> |     2 |         0 |          0 | hello world  |
> +-------+-----------+------------+--------------+
>
> I would expect that bar_int and bar_int2 are EMPTY for all three rows since
> only bar_string is set in the oneof.
>
> Is this the right expectation for me to have?
>
> Thank you!
>
> --
> Aaron Niskode-Dossett, Data Engineering -- Etsy
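
To make the expectation concrete, here is a minimal, dependency-free Java sketch of the rows I would expect for the example above. It does not use the Parquet or protobuf APIs and is not how parquet-protobuf is implemented; `OneofRowSketch`, `BarCase`, and `toRow` are hypothetical names standing in for the oneof case of `optional_bar` and for whatever the writer does per record. The only rule it encodes is that the oneof member that was set gets a value and the other members come out as null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OneofRowSketch {
    // Hypothetical stand-in for "which member of optional_bar is set".
    enum BarCase { BAR_INT, BAR_INT2, BAR_STRING, NOT_SET }

    // Build one output row: only the selected oneof member carries a value,
    // every other member of the oneof is null rather than a default like 0.
    static Map<String, Object> toRow(int foo, BarCase barCase, Object barValue) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("foo", foo);
        row.put("bar_int", barCase == BarCase.BAR_INT ? barValue : null);
        row.put("bar_int2", barCase == BarCase.BAR_INT2 ? barValue : null);
        row.put("bar_string", barCase == BarCase.BAR_STRING ? barValue : null);
        return row;
    }

    public static void main(String[] args) {
        // Mirrors the loop in the quoted message: foo = i, bar_string = "hello world".
        for (int i = 0; i < 3; i += 1) {
            System.out.println(toRow(i, BarCase.BAR_STRING, "hello world"));
        }
    }
}
```

With this rule, every row in the example would have null for bar_int and bar_int2 and "hello world" for bar_string, instead of the zeros shown in the parquet-tools output.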
