cloud-fan commented on a change in pull request #34904:
URL: https://github.com/apache/spark/pull/34904#discussion_r773131123
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala
##########
@@ -189,6 +204,13 @@ object V2ScanRelationPushDown extends Rule[LogicalPlan] with PredicateHelper {
}
}
+  private def newAggOutput(aggAttribute: AttributeReference, agg: AggregateExpression) =
+    if (aggAttribute.dataType == agg.resultAttribute.dataType) {
+      aggAttribute
+    } else {
+      Cast(aggAttribute, agg.resultAttribute.dataType)
Review comment:
I think complete and partial pushdown are different here. For complete pushdown, we should cast to the data type of the aggregate function. For partial pushdown, Spark will run the aggregate again, so we should cast to the data type of the input of the aggregate function, so that the final data type stays the same as before.
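To illustrate the point above, here is a minimal sketch (not the actual Spark implementation; `DataType`, `Agg`, and `castTarget` are simplified stand-ins) showing why the cast target differs between the two pushdown modes:

```scala
// Simplified model of the review's point. In real Spark these would be
// Catalyst DataType instances and AggregateExpression; here they are stand-ins.
sealed trait DataType
case object IntType extends DataType
case object LongType extends DataType

// inputType: type of the aggregate's child; resultType: type the function returns.
case class Agg(inputType: DataType, resultType: DataType)

// Complete pushdown: the source computes the final value, so the scan output
// must be cast to the aggregate function's result type.
// Partial pushdown: Spark re-aggregates the pushed-down partial results, so the
// scan output must be cast back to the aggregate's input type, keeping the
// final plan's output type unchanged.
def castTarget(agg: Agg, completePushdown: Boolean): DataType =
  if (completePushdown) agg.resultType else agg.inputType

// Example: SUM over an Int column widens its result to Long.
val sum = Agg(inputType = IntType, resultType = LongType)
println(castTarget(sum, completePushdown = true))   // LongType
println(castTarget(sum, completePushdown = false))  // IntType
```

The key invariant is the last case: because Spark's final aggregate runs on the scan output under partial pushdown, feeding it the input type reproduces the same widening the un-pushed plan would have done.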
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]