huaxingao commented on a change in pull request #32049:
URL: https://github.com/apache/spark/pull/32049#discussion_r634877439
##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownAggregates.java
##########

@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.read;
+
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.sources.Aggregation;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * A mix-in interface for {@link ScanBuilder}. Data source can implement this interface to
+ * push down aggregates to the data source.
+ *
+ * @since 3.2.0
+ */
+@Evolving
+public interface SupportsPushDownAggregates extends ScanBuilder {
+
+  /**
+   * Pushes down Aggregation to datasource.
+   * The Aggregation can be pushed down only if all the Aggregate Functions can
+   * be pushed down.
+   */
+  void pushAggregation(Aggregation aggregation);
+
+  /**
+   * Returns the aggregation that are pushed to the data source via
+   * {@link #pushAggregation(Aggregation aggregation)}.
+   */
+  Aggregation pushedAggregation();
+
+  /**
+   * Returns the schema of the pushed down aggregates
+   */
+  StructType getPushDownAggSchema();
+
+  /**
+   * Indicate if the data source only supports global aggregated push down
+   */
+  boolean supportsGlobalAggregatePushDownOnly();
+
+  /**
+   * Indicate if the data source supports push down aggregates along with filters

Review comment:
   I mean whether we can push down the aggregate and the filter together. For example, in `SELECT Max(c1) FROM t WHERE c2 > 1`, we can push down both the aggregate and the filter for JDBC. But I am not sure about Parquet and ORC. If the filter doesn't affect the footer's Max/Min/Count value, we can push down both the aggregate and the filter; otherwise, we can't. I am not sure how to check whether the filter affects the footer's Max/Min/Count value. Currently I only push down Max/Min/Count when there is no filter. If a filter is present, I only push down the filter, not Max/Min/Count.

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/sources/aggregates.scala
##########

@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources

Review comment:
   Yes. Will change

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
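The push-down policy described in the first review comment (push Max/Min/Count only when no filter is present, because a filter may invalidate file-footer statistics, while a JDBC source can evaluate both) can be sketched as a small, Spark-free decision helper. This is an illustrative sketch, not Spark code: the class name `AggregatePushDownPolicy`, the method `canPushAggregate`, and the string representation of aggregates and filters are all assumptions made for the example.

```java
import java.util.List;

// Illustrative sketch (not part of the Spark API): models the policy from the
// review comment -- push aggregates together with filters only when the source
// itself evaluates the filters (e.g. JDBC); file sources that answer
// MAX/MIN/COUNT from footer statistics push aggregates only when no filter
// is present, since a filter can invalidate those statistics.
public class AggregatePushDownPolicy {

  /** Decide whether the given aggregates can be pushed down alongside the filters. */
  public static boolean canPushAggregate(
      List<String> aggregates, List<String> filters, boolean sourceEvaluatesFilters) {
    if (aggregates.isEmpty()) {
      return false;  // nothing to push
    }
    if (sourceEvaluatesFilters) {
      return true;   // JDBC-like source applies the filter itself, push both
    }
    // Parquet/ORC-like source: footer stats ignore filters, so only push
    // the aggregate when there is no filter at all.
    return filters.isEmpty();
  }

  public static void main(String[] args) {
    // SELECT MAX(c1) FROM t               -> footer stats are safe to use
    System.out.println(canPushAggregate(List.of("MAX(c1)"), List.of(), false));
    // SELECT MAX(c1) FROM t WHERE c2 > 1  -> filter may change MAX, don't push
    System.out.println(canPushAggregate(List.of("MAX(c1)"), List.of("c2 > 1"), false));
    // Same query against JDBC             -> database applies both, push both
    System.out.println(canPushAggregate(List.of("MAX(c1)"), List.of("c2 > 1"), true));
  }
}
```

Running the sketch prints `true`, `false`, `true`, matching the three cases discussed in the thread.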
