bkietz commented on code in PR #39065:
URL: https://github.com/apache/arrow/pull/39065#discussion_r1431549725
##########
cpp/src/arrow/dataset/file_parquet.cc:
##########
@@ -897,16 +907,25 @@ Result<std::vector<compute::Expression>> ParquetFileFragment::TestRowGroups(
ARROW_ASSIGN_OR_RAISE(auto match, ref.FindOneOrNone(*physical_schema_));
if (match.empty()) continue;
- if (statistics_expressions_complete_[match[0]]) continue;
- statistics_expressions_complete_[match[0]] = true;
+ const SchemaField* schema_field = &manifest_->schema_fields[match[0]];
Review Comment:
Since this is the same logic as
[FieldPath::Get](https://github.com/apache/arrow/blob/42058376eed67019e6bea5715ae3740711266fd1/cpp/src/arrow/type.cc#L1785),
would you mind extracting it into a separate function? It would be nice to have
a single, clear entry point for future work on nested field references in Parquet.
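
To illustrate the kind of helper meant here, a minimal sketch follows. It is an assumption-laden stand-in, not the code from this PR: the `SchemaField` struct is a simplified, hypothetical substitute for `parquet::arrow::SchemaField`, and the name `ResolveSchemaField` is made up. The idea is only that one function walks a path of child indices through the nested schema fields, in the way `FieldPath::Get` walks nested Arrow fields.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for parquet::arrow::SchemaField.
// Only the members needed for the traversal are modelled.
struct SchemaField {
  std::string name;
  std::vector<SchemaField> children;
};

// Hypothetical helper: follow `path` (one child index per nesting level)
// starting from the top-level schema fields. Returns nullptr when the path
// is empty or any index is out of range.
const SchemaField* ResolveSchemaField(const std::vector<SchemaField>& top_level,
                                      const std::vector<int>& path) {
  const std::vector<SchemaField>* level = &top_level;
  const SchemaField* field = nullptr;
  for (int index : path) {
    if (index < 0 || static_cast<std::size_t>(index) >= level->size()) {
      return nullptr;
    }
    field = &(*level)[index];
    level = &field->children;  // descend one nesting level
  }
  return field;
}
```

With something along these lines, the call site would only pass the matched indices to the helper, and later work on nested references would have a single place to change.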