Hi devs,
I am trying to create a Parquet file that contains an array of int32 values for
each record.
The schema I am trying to implement is as follows:
message arr_schema {
  required int32 id;
  required group my_array (LIST) {
    repeated group list {
      optional int32 element;
    }
  }
}
I guess I have to create GroupNodes and assign the inner elements to them,
something like the code snippet below. But how do I then write the data?
fields.push_back(PrimitiveNode::Make("int32_field", Repetition::REPEATED,
                                     Type::INT32, LogicalType::NONE));
auto list_field = GroupNode::Make("some_array", Repetition::REQUIRED,
                                  fields);
I also saw the logical type LIST defined in
https://github.com/apache/parquet-cpp/blob/master/src/parquet/types.cc#L163,
but I don't know how to use it.
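For the writing part, my current understanding (which may be wrong) is that the
leaf column "element" has to be written with definition and repetition levels
next to the values. Below is a minimal sketch in plain C++, with no parquet-cpp
calls, of how I think those levels would be derived for the schema above. It
assumes the usual Dremel-style rules: max definition level 2 and max repetition
level 1 for this leaf. Element, Record, LeafBatch and ShredListColumn are
hypothetical names of my own, not part of any library.

```cpp
#include <cstdint>
#include <vector>

// One list entry; is_null marks a missing "element" value.
struct Element {
  bool is_null;
  int32_t value;
};

// The my_array contents of a single record.
using Record = std::vector<Element>;

// Hypothetical container for what a leaf-column writer would consume.
struct LeafBatch {
  std::vector<int16_t> def_levels;
  std::vector<int16_t> rep_levels;
  std::vector<int32_t> values;  // non-null values only
};

// Flatten records into levels and values for the "element" leaf.
// Assumed definition levels: 0 = empty list, 1 = null element, 2 = value.
// Assumed repetition levels: 0 = first entry of a record, 1 = later entry.
LeafBatch ShredListColumn(const std::vector<Record>& records) {
  LeafBatch out;
  for (const Record& rec : records) {
    if (rec.empty()) {
      // An empty list still produces one level pair, but no value.
      out.def_levels.push_back(0);
      out.rep_levels.push_back(0);
      continue;
    }
    bool first = true;
    for (const Element& elem : rec) {
      out.rep_levels.push_back(first ? 0 : 1);
      first = false;
      if (elem.is_null) {
        out.def_levels.push_back(1);
      } else {
        out.def_levels.push_back(2);
        out.values.push_back(elem.value);
      }
    }
  }
  return out;
}
```

If this is right, the three vectors would then be handed to the int32 column
writer for that leaf, but I have not verified that against the library.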
What I want in the end is to read the generated files from Amazon
Athena/Presto.
Any pointers or help are highly appreciated.
Thanks!