AngersZhuuuu commented on a change in pull request #31485:
URL: https://github.com/apache/spark/pull/31485#discussion_r571992802
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlanVisitor.scala
##########
@@ -47,6 +49,16 @@ trait LogicalPlanVisitor[T] {
def default(p: LogicalPlan): T
+ def visitSubqueryExpression(p: LogicalPlan): LogicalPlan = {
+ p.transformExpressionsDown {
+ case subqueryExpression: SubqueryExpression =>
+ // trigger subquery's child plan stats propagation
Review comment:
> This is weird. Doesn't EXPLAIN trigger the plan stats propagation?
Yea, EXPLAIN triggers plan stats computation before building the string:
```
private def stringWithStats(maxFields: Int, append: String => Unit): Unit = {
  val maxFields = SQLConf.get.maxToStringFields

  // trigger to compute stats for logical plans
  try {
    optimizedPlan.stats
  } catch {
    case e: AnalysisException => append(e.toString + "\n")
  }

  // only show optimized logical plan and physical plan
  append("== Optimized Logical Plan ==\n")
  QueryPlan.append(optimizedPlan, append, verbose = true, addSuffix = true, maxFields)
  append("\n== Physical Plan ==\n")
  QueryPlan.append(executedPlan, append, verbose = true, addSuffix = false, maxFields)
  append("\n")
}
```
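For reference, `stringWithStats` is the code path used by cost-mode EXPLAIN, so an easy way to exercise it is something like the following (a hedged example; `spark` is assumed to be an active `SparkSession` and `t`/`s` are made-up table names):
```
// explain("cost") goes through stringWithStats, which touches
// optimizedPlan.stats before building the explain output.
spark.sql("SELECT * FROM t WHERE id IN (SELECT id FROM s)").explain("cost")
```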
I have tested that the statistics behavior for subqueries is correct, since whenever statistics are used, `plan.stats` is called.
Doing it here just triggers that computation earlier.
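To make the intent concrete, here is a minimal sketch (not the actual PR code; `triggerSubqueryStats` is a hypothetical name) of what "trigger the subquery's plan stats earlier" amounts to: walk the node's expressions, match each `SubqueryExpression`, and touch `.stats` on its plan so the statistics are computed and cached:
```
import org.apache.spark.sql.catalyst.expressions.SubqueryExpression
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Minimal sketch: force statistics for every subquery plan referenced by
// this node's expressions, so they are already cached when needed later.
def triggerSubqueryStats(p: LogicalPlan): LogicalPlan = {
  p.transformExpressionsDown {
    case subqueryExpression: SubqueryExpression =>
      // Computing stats caches them in the subquery plan's statsCache.
      subqueryExpression.plan.stats
      // The expression itself is returned unchanged.
      subqueryExpression
  }
}
```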