[ 
https://issues.apache.org/jira/browse/ARROW-12117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313276#comment-17313276
 ] 

Antoine Pitrou commented on ARROW-12117:
----------------------------------------

I did a temporary patch to {{parquet-reader}} :-)
{code}
diff --git a/cpp/src/parquet/schema.cc b/cpp/src/parquet/schema.cc
index bfb295f0b..9ef982e33 100644
--- a/cpp/src/parquet/schema.cc
+++ b/cpp/src/parquet/schema.cc
@@ -413,6 +413,7 @@ std::unique_ptr<Node> GroupNode::FromParquet(const void* opaque_element,
   if (element->__isset.field_id) {
     field_id = element->field_id;
   }
+  ARROW_LOG(INFO) << "GroupNode '" << element->name << "': repetition = " << LoadEnumSafe(&element->repetition_type);
 
   std::unique_ptr<GroupNode> group_node;
   if (element->__isset.logicalType) {
@@ -439,6 +440,7 @@ std::unique_ptr<Node> PrimitiveNode::FromParquet(const void* opaque_element,
   if (element->__isset.field_id) {
     field_id = element->field_id;
   }
+  ARROW_LOG(INFO) << "PrimitiveNode '" << element->name << "': repetition = " << LoadEnumSafe(&element->repetition_type);
 
   std::unique_ptr<PrimitiveNode> primitive_node;
   if (element->__isset.logicalType) {
{code}
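
For readers without an Arrow build at hand, the check that this logging supports can be sketched in standalone C++. This is only an illustration under stated assumptions: {{SchemaElement}} below is a simplified, hypothetical stand-in for the Thrift-generated struct (which tracks presence through {{__isset}} flags rather than {{std::optional}}), and {{RepetitionName}}, {{DescribeRepetitions}} and {{RootHasRepetition}} are made-up helper names, not Arrow or Parquet APIs.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified model of a schema element. The real Thrift struct
// has more fields and uses __isset flags instead of std::optional.
enum class Repetition { REQUIRED, OPTIONAL, REPEATED };

struct SchemaElement {
  std::string name;
  std::optional<Repetition> repetition_type;  // empty == field not written
};

// Human-readable name for an optional repetition value.
const char* RepetitionName(const std::optional<Repetition>& r) {
  if (!r.has_value()) return "(unset)";
  switch (*r) {
    case Repetition::REQUIRED: return "REQUIRED";
    case Repetition::OPTIONAL: return "OPTIONAL";
    case Repetition::REPEATED: return "REPEATED";
  }
  return "(unknown)";
}

// One line per element, loosely mirroring the log format used in the patch.
std::string DescribeRepetitions(const std::vector<SchemaElement>& flat_schema) {
  std::ostringstream out;
  for (const auto& element : flat_schema) {
    out << "'" << element.name
        << "': repetition = " << RepetitionName(element.repetition_type) << "\n";
  }
  return out.str();
}

// True if the first (root) element of the flattened schema carries a
// repetition, which is the situation described in this issue.
bool RootHasRepetition(const std::vector<SchemaElement>& flat_schema) {
  assert(!flat_schema.empty());
  return flat_schema.front().repetition_type.has_value();
}
```

Calling {{RootHasRepetition}} on the flattened schema of a file would flag a root element that carries a repetition, and {{DescribeRepetitions}} gives a per-node dump comparable to the log lines added above.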


> [C++][Parquet] Root message of parquet may contain repetition
> -------------------------------------------------------------
>
>                 Key: ARROW-12117
>                 URL: https://issues.apache.org/jira/browse/ARROW-12117
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Jorge Leitão
>            Priority: Minor
>              Labels: parquet
>
> According to the Parquet format, [the root message does not contain a repetition|https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L363].
> However, it seems that the C++ implementation is writing one.
> I noticed this while going through the Rust Parquet reader, which has some comments to this effect, e.g.
> [https://github.com/apache/arrow/blob/5be69789eeac0f2c357cfcd0d329c518848adebc/rust/parquet/src/schema/types.rs#L1091]
> and
> [https://github.com/apache/arrow/blob/5be69789eeac0f2c357cfcd0d329c518848adebc/rust/parquet/src/schema/types.rs#L2059]
>  


