rednaxelafx commented on a change in pull request #23303: [SPARK-26352][SQL] 
ReorderJoin should not change the order of columns
URL: https://github.com/apache/spark/pull/23303#discussion_r241366694
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
 ##########
 @@ -48,8 +48,18 @@ object CostBasedJoinReorder extends Rule[LogicalPlan] with PredicateHelper {
           if projectList.forall(_.isInstanceOf[Attribute]) =>
           reorder(p, p.output)
       }
-      // After reordering is finished, convert OrderedJoin back to Join
+
+      // Cleanups
       result transformDown {
+        // if a Project was created to keep output attribute order after join reordering, but
 
 Review comment:
   This isn't really an improvement. It's here mainly to help pass existing 
tests -- I'm adding new projections to fix the output attribute order problem, 
but these extra projections in the middle would make (`expected`) query plans 
look pretty ugly.
   
   So consider this as two things that can cancel each other out:
   1. Add projections to fix output attribute order;
   2. If one of these extra projections is in the middle, get rid of it.
   
   The cleanup (2) is only meant to remove extra projections created because of 
(1). Both (1) and (2) are in this PR, so I don't consider this a performance 
improvement over the existing code.
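
To make the interplay of (1) and (2) concrete, here is a minimal sketch on a toy plan model. It is not the code in this PR and does not use Catalyst: `Plan`, `Relation`, `Join`, `Project`, `restoreOrder`, and `stripInner` are all hypothetical names made up for illustration, and columns are plain strings instead of attributes.

```scala
object ReorderSketch {
  // Toy plan nodes for illustration only; these are NOT Spark's Catalyst classes.
  sealed trait Plan { def output: Seq[String] }
  case class Relation(output: Seq[String]) extends Plan
  case class Join(left: Plan, right: Plan) extends Plan {
    def output: Seq[String] = left.output ++ right.output
  }
  case class Project(projectList: Seq[String], child: Plan) extends Plan {
    def output: Seq[String] = projectList
  }

  // Step (1): if reordering changed the column order, add a Project that restores it.
  def restoreOrder(reordered: Plan, expected: Seq[String]): Plan =
    if (reordered.output == expected) reordered else Project(expected, reordered)

  // A Project that keeps every child column and only permutes them.
  def isPureReorder(p: Project): Boolean =
    p.projectList.size == p.child.output.size &&
      p.projectList.toSet == p.child.output.toSet

  // Peel pure-reorder Projects off the top of a join input.
  private def unwrap(plan: Plan): Plan = plan match {
    case p: Project if isPureReorder(p) => unwrap(p.child)
    case other => other
  }

  // Step (2): drop order-restoring Projects that sit directly under a Join.
  // A Project above the joins is kept, so the final column order is unchanged.
  def stripInner(plan: Plan): Plan = plan match {
    case Join(l, r) => Join(stripInner(unwrap(l)), stripInner(unwrap(r)))
    case Project(list, child) => Project(list, stripInner(child))
    case r: Relation => r
  }

  val t1 = Relation(Seq("a"))
  val t2 = Relation(Seq("b"))
  val t3 = Relation(Seq("c"))

  // Pretend the reorder turned (t1 JOIN t2) JOIN t3 into t3 JOIN (t2 JOIN t1),
  // with step (1) applied at both join levels.
  val inner: Plan = restoreOrder(Join(t2, t1), Seq("a", "b"))
  val withProjections: Plan = restoreOrder(Join(t3, inner), Seq("a", "b", "c"))

  // Step (2) removes the Project in the middle and keeps the one on top.
  val cleaned: Plan = stripInner(withProjections)

  def main(args: Array[String]): Unit = println(cleaned)
}
```

In this toy model the cleaned plan has a single `Project` on top of the two joins, and its output order matches the pre-reorder order. The real rule works on attributes and `OrderedJoin` nodes, so treat this only as a picture of why (2) exists to undo the noise from (1).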

