[GitHub] [spark] agubichev commented on a diff in pull request #42778: [SPARK-45055] [SQL] Do not transpose windows if they conflict on ORDER BY / PROJECT clauses

via GitHub Mon, 04 Sep 2023 15:10:30 -0700


agubichev commented on code in PR #42778:
URL: https://github.com/apache/spark/pull/42778#discussion_r1315214615



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -1305,11 +1305,20 @@ object TransposeWindow extends Rule[LogicalPlan] {
     }
   }
 
+  // Returns false if the parent 'w1' window function order by any of the 
window
+  // expressions produced by 'w2'.
+  private def compatibleOrderBy(w1: Window, w2: Window): Boolean = {
+    val childWindowExprs = w2.windowExpressions.map(x => x.toAttribute)
+    val parentOrder = w1.orderSpec.flatMap(x => x.references)
+    childWindowExprs.intersect(parentOrder).isEmpty
+  }
+
   private def windowsCompatible(w1: Window, w2: Window): Boolean = {
     w1.references.intersect(w2.windowOutputSet).isEmpty &&

Review Comment:
   It's a subtle bug.
   `references` by definition do not include output attributes (see 
https://github.com/apache/spark/blob/798fce3b571907ee52058004cc38c2e8dbc4b016/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala#L91).
   So if the top Window function orders by C and projects (outputs) C, then C 
is not included into references.
   Therefore, we need that extra check.
   
   (and we can't just take all attributes in the Window operator either, since 
PARTITION BY clauses do need to intersect)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] agubichev commented on a diff in pull request #42778: [SPARK-45055] [SQL] Do not transpose windows if they conflict on ORDER BY / PROJECT clauses

Reply via email to