Re: [PR] [SPARK-46378][SQL][FOLLOWUP] Do not rely on TreeNodeTag in Project [spark]

via GitHub Wed, 20 Dec 2023 19:01:47 -0800


viirya commented on code in PR #44429:
URL: https://github.com/apache/spark/pull/44429#discussion_r1433400585



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -1602,12 +1595,6 @@ object EliminateSorts extends Rule[LogicalPlan] {
     plan match {
       case Sort(_, global, child) if canRemoveGlobalSort || !global =>
         recursiveRemoveSort(child, canRemoveGlobalSort)
-      case Sort(sortOrder, true, child) =>
-        // For this case, the upper sort is local so the ordering of present sort is unnecessary,
-        // so here we only preserve its output partitioning using `RepartitionByExpression`.
-        // We should use `None` as the optNumPartitions so AQE can coalesce shuffle partitions.
-        // This behavior is same with original global sort.
-        RepartitionByExpression(sortOrder, recursiveRemoveSort(child, true), None)

Review Comment:
   Hmm, previously this rule looked into this global Sort's child and removed
   local and global Sorts recursively, without any condition. But in the new
   `RemoveRedundantSorts` rule:
   
   ```scala
   case s @ Sort(orders, true, child) =>
     val newChild = recursiveRemoveSort(child, optimizeGlobalSort = false)
   ```
   `recursiveRemoveSort` in `RemoveRedundantSorts` only removes a local Sort if
   its child is already sorted. Are we missing this optimization?
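
   To make the concern concrete, here is a toy sketch of the two behaviors.
   It is not Catalyst code: `Plan`, `Scan`, `Filter`, `removeUnconditionally`
   and `removeIfAlreadySorted` are hypothetical names, the ordering model is
   a single column name, and it ignores the `RepartitionByExpression`
   handling from the removed case. It only assumes the behavior described
   above.

   ```scala
   // Toy plan nodes for illustration only; these are NOT Spark's LogicalPlan classes.
   sealed trait Plan
   case class Scan(name: String, sortedBy: Option[String] = None) extends Plan
   case class Filter(child: Plan) extends Plan
   case class Sort(key: String, global: Boolean, child: Plan) extends Plan

   object SortRemovalSketch {
     // Ordering a node is known to produce in this toy model.
     def outputOrdering(p: Plan): Option[String] = p match {
       case Scan(_, sortedBy) => sortedBy
       case Filter(child)     => outputOrdering(child) // a filter keeps row order
       case Sort(key, _, _)   => Some(key)
     }

     // Assumed "old" behavior: below an outer global Sort, drop every Sort,
     // local or global, whether or not its input is already ordered.
     def removeUnconditionally(p: Plan): Plan = p match {
       case Sort(_, _, child) => removeUnconditionally(child)
       case Filter(child)     => Filter(removeUnconditionally(child))
       case other             => other
     }

     // Assumed "new" behavior: drop a local Sort only when its child already
     // produces the required ordering; keep every other Sort.
     def removeIfAlreadySorted(p: Plan): Plan = p match {
       case Sort(key, false, child) if outputOrdering(child).contains(key) =>
         removeIfAlreadySorted(child)
       case Sort(key, global, child) => Sort(key, global, removeIfAlreadySorted(child))
       case Filter(child)            => Filter(removeIfAlreadySorted(child))
       case other                    => other
     }

     def main(args: Array[String]): Unit = {
       // Subtree sitting below an outer global Sort; the scan is unsorted.
       val below = Filter(Sort("a", global = false, Scan("t")))
       println(removeUnconditionally(below))
       println(removeIfAlreadySorted(below))
     }
   }
   ```

   In this model the unconditional version strips the inner local Sort from
   `below`, while the conditional version leaves it in place because the
   scan is not already sorted on `a`. That is the case I am asking about.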
   



