adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2094127261
##########
datafusion/datasource-parquet/src/opener.rs:
##########
@@ -111,19 +120,61 @@ impl FileOpener for ParquetOpener {
.create(projected_schema, Arc::clone(&self.table_schema));
let predicate = self.predicate.clone();
let table_schema = Arc::clone(&self.table_schema);
+ let partition_fields = self.partition_fields.clone();
let reorder_predicates = self.reorder_filters;
let pushdown_filters = self.pushdown_filters;
let coerce_int96 = self.coerce_int96;
let enable_bloom_filter = self.enable_bloom_filter;
let enable_row_group_stats_pruning =
self.enable_row_group_stats_pruning;
let limit = self.limit;
- let predicate_creation_errors = MetricBuilder::new(&self.metrics)
- .global_counter("num_predicate_creation_errors");
-
let enable_page_index = self.enable_page_index;
Ok(Box::pin(async move {
+ // Prune this file using the file level statistics.
Review Comment:
> which can be set to true if there are dynamic predicates present
The issue is: how do we know the filters are dynamic? We've hidden dynamic
filters behind `PhysicalExpr` so that the system can treat them as normal
filters. We could key it off _any_ filter pushdown instead, but that doesn't
seem like much of an improvement.
I also think this pruning should be quite cheap: the record batches being
filtered are just a couple of rows.
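To illustrate the point about dynamic filters being hidden, here is a rough sketch. It does not use the real DataFusion API: `Expr`, `StaticGt`, `DynamicGt` and `should_open_file` are all hypothetical stand-ins. It only shows that once both kinds of filter sit behind the same trait object, the opener cannot tell them apart and can only evaluate whatever the predicate says at open time.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for `PhysicalExpr`; NOT the real DataFusion trait.
trait Expr {
    /// Returns true if a file whose column maximum is `file_max`
    /// might still contain matching rows.
    fn may_match(&self, file_max: i64) -> bool;
}

/// A static filter, `col > threshold`, fixed at plan time.
struct StaticGt {
    threshold: i64,
}

impl Expr for StaticGt {
    fn may_match(&self, file_max: i64) -> bool {
        file_max > self.threshold
    }
}

/// A dynamic filter: the same predicate, but the threshold is shared
/// and can be tightened while the query runs (assumed here to be
/// updated by some other operator).
struct DynamicGt {
    threshold: Arc<RwLock<i64>>,
}

impl Expr for DynamicGt {
    fn may_match(&self, file_max: i64) -> bool {
        file_max > *self.threshold.read().unwrap()
    }
}

/// The opener only sees `&dyn Expr`, so it has no way to ask
/// "is this dynamic?". It evaluates the predicate as it stands now.
fn should_open_file(predicate: &dyn Expr, file_max: i64) -> bool {
    predicate.may_match(file_max)
}
```

In this sketch the file-level check is a single comparison per file, which is the sense in which I'd expect the pruning to be cheap. Whether that carries over to the real statistics-based pruning is an assumption, not something measured here.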
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]