Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21139#discussion_r183893505
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
    @@ -618,6 +619,18 @@ object CollapseRepartition extends Rule[LogicalPlan] {
       }
     }
     
    +/**
    + * Exchanged the adjacent logical window operator according to the order 
field of window.
    + */
    +object ExchangeWindowWithOrderField extends Rule[LogicalPlan] {
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case w1 @ Window(_, _, orderSpec1, w2 @ Window(_, _, orderSpec2, 
child2))
    +      if orderSpec1.size > orderSpec2.size =>
    --- End diff --
    
    I am not sure if this a good idea, because this does not consider the 
partitioning expressions or the actual ordering. You might actively make things 
worse if you do this. For example when you have three subsequent window 
operators like this:
    ```
    // Before rule
    window[partition by A order by B, C]
    :- window[partition by B order by C]
       :- window[partition by B order by D]
    
    // After rule (notice how we add another exchange here)
    window[partition by B order by C]
    :- window[partition by A order by B, C]
       :- window[partition by B order by D]
    ```
    I think this only has merit if the partitioning expression set matches, and 
the smaller ordering clause is the prefix of the larger ordering clause.
    
    Moreover you are messing with the output order, so you need to make sure 
there is a project (or something else) on top that retains the original order 
(might not be problem due to the way we currently plan Window operators).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to