maropu commented on a change in pull request #32659:
URL: https://github.com/apache/spark/pull/32659#discussion_r647121080
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
##########
@@ -80,6 +80,54 @@ object EstimationUtils {
expressions.collect {
case alias @ Alias(attr: Attribute, _) if attributeStats.contains(attr) =>
alias.toAttribute -> attributeStats(attr)
+ case alias @ Alias(expn: Expression, _) if isExpressionStatsExist(expn, attributeStats) =>
+ getExpressionStats(alias.toAttribute, expn, attributeStats)
+ }
+ }
+
+ // Support for substring expressions.
+ // TODO: Support for more expressions like Multiply.
+ private def isExpressionStatsExist(
+ expn: Expression,
Review comment:
`expn` -> `expr`
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
##########
@@ -80,6 +80,54 @@ object EstimationUtils {
expressions.collect {
case alias @ Alias(attr: Attribute, _) if attributeStats.contains(attr) =>
alias.toAttribute -> attributeStats(attr)
+ case alias @ Alias(expn: Expression, _) if isExpressionStatsExist(expn, attributeStats) =>
+ getExpressionStats(alias.toAttribute, expn, attributeStats)
+ }
+ }
+
+ // Support for substring expressions.
+ // TODO: Support for more expressions like Multiply.
Review comment:
Why do we need to handle individual exprs here? For aggregate stat
estimation, can't we just use the upper-bound stat values from the child plan
in `AggregateEstimation`?
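
To make the upper-bound idea concrete, here is a minimal, self-contained sketch. It does not use Spark's actual classes: `ColStat`, `PlanStats`, and `estimateAggregateRows` are hypothetical names chosen for illustration. It assumes the output row count of an aggregate is bounded by the product of the grouping keys' distinct counts and capped by the child's row count, and that a grouping key without column stats (for example, an arbitrary expression) simply falls back to the child's row count.

```scala
// Hypothetical, simplified stand-ins for illustration only.
// These are NOT Spark's Statistics / ColumnStat classes.
final case class ColStat(distinctCount: BigInt)
final case class PlanStats(rowCount: BigInt, colStats: Map[String, ColStat])

object UpperBoundSketch {
  // Upper bound on the number of rows produced by an aggregate:
  // the product of the grouping keys' distinct counts, capped by the
  // child's row count. A key with no column stats (e.g. an expression
  // such as a substring) contributes the child's row count instead.
  def estimateAggregateRows(child: PlanStats, groupingKeys: Seq[String]): BigInt = {
    if (groupingKeys.isEmpty) {
      BigInt(1) // a global aggregate produces a single row
    } else {
      val product = groupingKeys.foldLeft(BigInt(1)) { (acc, key) =>
        val ndv = child.colStats.get(key).map(_.distinctCount).getOrElse(child.rowCount)
        acc * ndv
      }
      product.min(child.rowCount)
    }
  }
}
```

Under these assumptions, a child with 1000 rows and distinct counts of 10 and 20 on the two grouping columns yields an estimate of 200 rows, while any grouping key that lacks stats pushes the estimate up to the child's 1000 rows.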
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
##########
@@ -80,6 +80,54 @@ object EstimationUtils {
expressions.collect {
case alias @ Alias(attr: Attribute, _) if attributeStats.contains(attr) =>
alias.toAttribute -> attributeStats(attr)
+ case alias @ Alias(expn: Expression, _) if isExpressionStatsExist(expn, attributeStats) =>
Review comment:
Why did you update this method instead of `AggregateEstimation`?
`Project` also uses this method, so is this change related to projections?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.