[ https://issues.apache.org/jira/browse/ARROW-12117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313276#comment-17313276 ]
Antoine Pitrou commented on ARROW-12117:
----------------------------------------
I made a temporary patch to {{parquet-reader}} :-)
{code}
diff --git a/cpp/src/parquet/schema.cc b/cpp/src/parquet/schema.cc
index bfb295f0b..9ef982e33 100644
--- a/cpp/src/parquet/schema.cc
+++ b/cpp/src/parquet/schema.cc
@@ -413,6 +413,7 @@ std::unique_ptr<Node> GroupNode::FromParquet(const void* opaque_element,
if (element->__isset.field_id) {
field_id = element->field_id;
}
+ ARROW_LOG(INFO) << "GroupNode '" << element->name << "': repetition = " << LoadEnumSafe(&element->repetition_type);
std::unique_ptr<GroupNode> group_node;
if (element->__isset.logicalType) {
@@ -439,6 +440,7 @@ std::unique_ptr<Node> PrimitiveNode::FromParquet(const void* opaque_element,
if (element->__isset.field_id) {
field_id = element->field_id;
}
+ ARROW_LOG(INFO) << "PrimitiveNode '" << element->name << "': repetition = " << LoadEnumSafe(&element->repetition_type);
std::unique_ptr<PrimitiveNode> primitive_node;
if (element->__isset.logicalType) {
{code}
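For illustration only, here is a minimal standalone sketch of the check that these log lines are meant to support: given the flattened list of schema elements, does the first one (the root) carry a repetition type? The {{SchemaElement}} struct, {{RootHasRepetition}} and {{Describe}} below are hypothetical stand-ins and not the Thrift-generated type or any Arrow API; an unset {{std::optional}} models {{__isset.repetition_type}} being false.

```cpp
#include <optional>
#include <string>
#include <vector>

// Repetition values as numbered in parquet.thrift (FieldRepetitionType).
enum class Repetition { REQUIRED = 0, OPTIONAL = 1, REPEATED = 2 };

// Hypothetical stand-in for the Thrift SchemaElement (assumption, not the
// real generated struct). An empty optional means the field was not written.
struct SchemaElement {
  std::string name;
  std::optional<Repetition> repetition_type;
  int num_children = 0;
};

// Returns true if the root (first element of the flattened schema) has a
// repetition type set, which the format says it should not have.
bool RootHasRepetition(const std::vector<SchemaElement>& schema) {
  return !schema.empty() && schema.front().repetition_type.has_value();
}

// Formats one element in the same spirit as the log lines in the patch.
std::string Describe(const SchemaElement& element) {
  std::string text = "'" + element.name + "': repetition = ";
  if (!element.repetition_type.has_value()) {
    return text + "<unset>";
  }
  switch (*element.repetition_type) {
    case Repetition::REQUIRED:
      return text + "REQUIRED";
    case Repetition::OPTIONAL:
      return text + "OPTIONAL";
    case Repetition::REPEATED:
      return text + "REPEATED";
  }
  return text + "<unknown>";
}
```

With the real reader, the equivalent information comes from the {{ARROW_LOG}} output above; the sketch only shows how a set versus unset repetition on the root would be told apart.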
> [C++][Parquet] Root message of parquet may contain repetition
> -------------------------------------------------------------
>
> Key: ARROW-12117
> URL: https://issues.apache.org/jira/browse/ARROW-12117
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Jorge Leitão
> Priority: Minor
> Labels: parquet
>
> According to the parquet format, [the root message does not contain a
> repetition|https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L363].
> However, it seems that the C++ implementation is writing one.
> I noticed this while going through the Rust parquet reader, which has some
> comments to this effect, e.g.
> [https://github.com/apache/arrow/blob/5be69789eeac0f2c357cfcd0d329c518848adebc/rust/parquet/src/schema/types.rs#L1091]
> and
> [https://github.com/apache/arrow/blob/5be69789eeac0f2c357cfcd0d329c518848adebc/rust/parquet/src/schema/types.rs#L2059]
>