peter-toth commented on code in PR #43614:
URL: https://github.com/apache/spark/pull/43614#discussion_r1390209748
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -156,7 +171,15 @@ trait CheckAnalysis extends PredicateHelper with
LookupCatalog with QueryErrorsB
// If a CTE relation is never used, it will disappear after inline. Here
we explicitly check
// analysis for it, to make sure the entire query plan is valid.
try {
- if (refCount == 0) checkAnalysis0(relation.child)
Review Comment:
Sorry, for the late reply. (I was travelling this week.)
The reason why I would suggest the `cleanCTEMap()` approach is because you
might be checking a CTE 2 times now. Not in this new code, as `visited` map
makes sure to check a CTE only once here, but later when we check the inlined
plan it might contain a CTE that was referenced from an other CTE with ref
count = 0 (so the CTE is checked in the new code) and also from a third CTE
with valid references (so the CTE is checked in the inlined plan again).
As for checking the leafs first, checking the CTEs in id order guarantees
that. This is because how we do CTE resolution (assign ids to defs) in
`CTESubstitution`. `cleanCTEMap()` also uses this property to quickly clear
useless transitive reference counts.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]