Hi Radu,

This appears to be a bug. Would you mind filing an issue in JIRA? I'm looking into it to see if I can figure out what is going on.

Thanks,
Micah

On Wed, Jul 29, 2020 at 1:07 PM Radu Teodorescu <radukay...@yahoo.com.invalid> wrote:

> Is the current version supposed to allow struct columns with null values
> to be written to parquet?
>
> I narrowed it down to a table with one column and two rows, and
> the resulting parquet file is broken according to both parquet-tools and
> our own reader (it looks like a buffer is not written in full, but
> I haven’t dug much deeper).
>
> This is the table:
>
> struct: struct<int: int64>
>   child 0, int: int64
> ----
> struct:
>   [
>     -- is_valid:
>       [
>         false,
>         true
>       ]
>     -- child 0 type: int64
>       [
>         null,
>         2
>       ]
>   ]
>
> and this is my repro table generation:
>
> std::shared_ptr<arrow::Table> generate_table2() {
>   auto i64builder = std::make_shared<arrow::Int64Builder>();
>   const std::shared_ptr<arrow::DataType> structType =
>       arrow::struct_({arrow::field("int", arrow::int64())});
>   arrow::StructBuilder structBuilder(structType,
>       arrow::default_memory_pool(), {
>       std::static_pointer_cast<arrow::ArrayBuilder>(i64builder)});
>   PARQUET_THROW_NOT_OK(structBuilder.AppendNull());
>   PARQUET_THROW_NOT_OK(structBuilder.Append());
>   PARQUET_THROW_NOT_OK(i64builder->Append(2));
>   std::shared_ptr<arrow::Array> structArray;
>   PARQUET_THROW_NOT_OK(structBuilder.Finish(&structArray));
>   std::shared_ptr<arrow::Schema> schema =
>       arrow::schema({arrow::field("struct",structType)});
>   return arrow::Table::Make(schema, {structArray});
> }
>
> Is this a bug, a known limitation, or am I doing something dumb?
>
> Thank you
> Radu