Ted-Jiang commented on code in PR #5268:
URL: https://github.com/apache/arrow-datafusion/pull/5268#discussion_r1105333170
##########
datafusion/core/src/physical_optimizer/pruning.rs:
##########
@@ -233,25 +233,18 @@ impl PruningPredicate {
.unwrap_or_default()
}
- /// Returns all need column indexes to evaluate this pruning predicate
- pub(crate) fn need_input_columns_ids(&self) -> HashSet<usize> {
- let mut set = HashSet::new();
- self.required_columns.columns.iter().for_each(|x| {
- match self.schema().column_with_name(x.0.name.as_str()) {
- None => {}
- Some(y) => {
- set.insert(y.0);
- }
- }
- });
- set
+ pub(crate) fn required_columns(&self) -> &RequiredStatColumns {
Review Comment:
Thanks for the explanation! 👍
> If an individual parquet file does not have all the columns or has the columns in a different order
I have a question: given `file_a (c1, c2)` and `file_b (c1, c3)`, does
DataFusion support creating an external table `t(c1)` over both file_a and file_b? 🤔
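
To make the scenario concrete, here is a minimal sketch of resolving a predicate's columns by name against each file's own schema. It is only an illustration, not DataFusion's actual code: `required_column_indices` and its parameters are hypothetical names, and plain string slices stand in for real Arrow schemas.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not a DataFusion API): look up each required column
/// by name in one file's column list and collect the positions that exist.
/// Columns missing from the file are skipped instead of causing an error.
fn required_column_indices(file_columns: &[&str], required: &[&str]) -> HashSet<usize> {
    required
        .iter()
        .filter_map(|name| file_columns.iter().position(|c| c == name))
        .collect()
}
```

With `file_a = ["c1", "c2"]` and a second file laid out as `["c3", "c1"]`, looking up `c1` gives a different index per file, which is why a single table-level index set may not be valid for every file.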