emkornfield commented on a change in pull request #11385:
URL: https://github.com/apache/arrow/pull/11385#discussion_r727443596
##########
File path: cpp/src/parquet/schema.cc
##########
@@ -73,6 +73,29 @@ std::shared_ptr<ColumnPath> ColumnPath::FromNode(const Node& node) {
   return std::make_shared<ColumnPath>(std::move(path));
 }
+std::shared_ptr<ColumnPath> ColumnPath::ShortFromNode(const Node& node) {
+  // Build the path in reverse order as we traverse the nodes to the top
+  std::vector<std::string> rpath_;
+  const Node* cursor = &node;
+  while (cursor->parent()) {
+    if (cursor->is_group()) {
+      auto group = dynamic_cast<const GroupNode*>(cursor);
+      // If we have a parent list node, remove the names of the two direct
+      // child nodes (list.element)
+      if (group->logical_type()->is_list()) {
+        rpath_.pop_back();
+        rpath_.pop_back();
Review comment:
That depends on how the Parquet file was written. I haven't checked
parquet-mr, but I believe new files should follow the 3-level encoding with
the LIST logical type (other files might just have a repeated node), so two
child names are not always there to remove.
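To illustrate the concern, here is a minimal, self-contained sketch of a path builder that only drops the two intermediate names when the nodes below a LIST group actually look like the 3-level encoding. `SketchNode` and `ShortPathSketch` are hypothetical stand-ins invented for this example, not the `parquet::schema` API, and the shape check is deliberately simplified: it assumes a repeated group directly under the LIST group means 3-level encoding and ignores the remaining backward-compatibility rules in the Parquet format spec.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a schema node (not parquet::schema::Node).
struct SketchNode {
  std::string name;
  const SketchNode* parent;  // nullptr for the schema root
  bool is_group;
  bool is_repeated;
  bool is_list;  // group annotated with the LIST logical type
};

// Walks from the leaf towards the root (the root itself is excluded) and
// returns a dotted path. The two names below a LIST group are dropped only
// if they were actually collected and the direct child is a repeated group.
std::string ShortPathSketch(const SketchNode& leaf) {
  std::vector<const SketchNode*> visited;  // leaf first, top-most node last
  for (const SketchNode* cursor = &leaf; cursor->parent != nullptr;
       cursor = cursor->parent) {
    if (cursor->is_group && cursor->is_list && visited.size() >= 2) {
      // The most recently visited node is the direct child of `cursor`.
      const SketchNode* child = visited.back();
      if (child->is_group && child->is_repeated) {
        visited.pop_back();  // the repeated group, e.g. "list"
        visited.pop_back();  // the element node, e.g. "element"
      }
    }
    visited.push_back(cursor);
  }
  // Names were collected in reverse, so join them from the back.
  std::string path;
  for (auto it = visited.rbegin(); it != visited.rend(); ++it) {
    if (!path.empty()) path += '.';
    path += (*it)->name;
  }
  return path;
}
```

With a 3-level list (`my_list` (LIST) > `list` > `element`) this returns `my_list`. With a 2-level list (`my_list` (LIST) > repeated `element`) it leaves the names alone and returns `my_list.element`, where two unconditional `pop_back()` calls on a one-element vector would be undefined behavior. Whether the legacy case should also be shortened is a separate design choice that this sketch does not make.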
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.