rdblue commented on code in PR #5872:
URL: https://github.com/apache/iceberg/pull/5872#discussion_r990819854
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -145,6 +158,83 @@ public Filter[] pushedFilters() {
return pushedFilters;
}
+ @Override
+ public boolean pushAggregation(Aggregation aggregation) {
+ if (!(table instanceof BaseTable)) {
+ return false;
+ }
+ boolean aggregatePushdown =
+ Boolean.parseBoolean(
+ table
+ .properties()
+ .getOrDefault(AGGREGATE_PUSHDOWN_ENABLED, AGGREGATE_PUSHDOWN_ENABLED_DEFAULT));
+ if (!aggregatePushdown) {
+ return false;
+ }
+
+ Snapshot currentSnapshot = table.currentSnapshot();
+ if (currentSnapshot != null) {
+ Map<String, String> map = currentSnapshot.summary();
+ // if there are row-level deletes in the current snapshot, the statistics
+ // may have changed, so disable aggregate push down
+ if (Integer.parseInt(map.get("total-position-deletes")) > 0
+ || Integer.parseInt(map.get("total-equality-deletes")) > 0) {
+ return false;
+ }
+ }
+
+ // If the group by expression is not the same as the partition, the statistics information
+ // in manifest files cannot be used to calculate min/max/count. However, if the
+ // group by expression is the same as the partition, the statistics information can still
+ // be used to calculate min/max/count.
+ // Todo: enable aggregate push down for partition col group by expression
Review Comment:
I agree this can be done in a follow-up PR. This is why I added the
`ExpressionUtil.selectsPartitions` method. You should be able to use that to
determine whether the aggregate filter is aligned with partitioning.
Nit: should be "TODO"
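For the follow-up, the alignment idea in the TODO can be sketched in plain Java without any Iceberg dependency. This is only an illustration under stated assumptions: `GroupByAlignment` and `alignedWithPartitioning` are hypothetical names, the column names are made up, and the check assumes that grouping is "aligned" when every group-by column is an identity partition column. It is not the `ExpressionUtil.selectsPartitions` implementation, and it does not call it.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Iceberg.
final class GroupByAlignment {

  private GroupByAlignment() {}

  // Assumption: if every group-by column is an identity partition column,
  // each data file belongs to exactly one group, so per-file min/max/count
  // statistics could be combined per group. An empty group-by list (a global
  // aggregate) is treated as trivially aligned.
  static boolean alignedWithPartitioning(
      List<String> groupByColumns, Set<String> identityPartitionColumns) {
    return identityPartitionColumns.containsAll(groupByColumns);
  }

  public static void main(String[] args) {
    // Made-up partitioning: identity(day), identity(region).
    Set<String> partitionColumns = Set.of("day", "region");

    // GROUP BY day uses only partition columns.
    System.out.println(alignedWithPartitioning(List.of("day"), partitionColumns));

    // GROUP BY day, user_id includes a non-partition column.
    System.out.println(alignedWithPartitioning(List.of("day", "user_id"), partitionColumns));
  }
}
```

In the real change, `pushAggregation` would presumably return `false` whenever such a check fails, the same way it already bails out when row-level deletes are present.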
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.