jchen5 commented on code in PR #42383:
URL: https://github.com/apache/spark/pull/42383#discussion_r1300507642
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/DecorrelateInnerQuery.scala:
##########
@@ -654,6 +654,25 @@ object DecorrelateInnerQuery extends PredicateHelper {
val newProject = Project(newProjectList ++ referencesToAdd, newChild)
(newProject, joinCond, outerReferenceMap)
+ case w@Window(projectList, partitionSpec, orderSpec, child) =>
+ val outerReferences = collectOuterReferences(w.expressions)
+ assert(outerReferences.isEmpty, s"Correlated column is not allowed in window " +
+ s"function: $w")
+ val newOuterReferences = parentOuterReferences ++ outerReferences
+ val (newChild, joinCond, outerReferenceMap) =
+ decorrelate(child, newOuterReferences, aggregated = true, underSetOp)
Review Comment:
Yeah, I agree we need the logic that sets aggregated = true, and that's fine.
It's needed for correctness around outer reference substitution. Window functions
do something similar to aggregation, so I think it's OK as long as there's a
comment explaining it.
You're right that this test case wouldn't work. How about the same thing but
with a lateral join?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]