[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

GitBox Thu, 12 Nov 2020 21:37:09 -0800


dongjoon-hyun commented on a change in pull request #29950:
URL: https://github.com/apache/spark/pull/29950#discussion_r522665920




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -724,20 +725,17 @@ object ColumnPruning extends Rule[LogicalPlan] {
 /**
  * Combines two [[Project]] operators into one and perform alias substitution,
  * merging the expressions into one single expression for the following cases.
- * 1. When two [[Project]] operators are adjacent.
+ * 1. When two [[Project]] operators are adjacent, if the number of common expressions in the
+ *    combined [[Project]] is not more than `spark.sql.optimizer.maxCommonExprsInCollapseProject`.
  * 2. When two [[Project]] operators have LocalLimit/Sample/Repartition operator between them
  *    and the upper project consists of the same number of columns which is equal or aliasing.
  *    `GlobalLimit(LocalLimit)` pattern is also considered.
  */
 object CollapseProject extends Rule[LogicalPlan] with AliasHelper {
 
-  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case p1 @ Project(_, p2: Project) =>
-      if (haveCommonNonDeterministicOutput(p1.projectList, p2.projectList)) {
-        p1
-      } else {
-        p2.copy(projectList = buildCleanedProjectList(p1.projectList, p2.projectList))
-      }
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {

Review comment:
       Is there a reason to change from `transformUp` to `transformDown`? If all the tests pass with the original `transformUp`, it would be safer to keep it.
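
For readers less familiar with the two traversals, here is a minimal sketch of how the order can matter for a collapse-style rule. It is a toy model, not Spark code: `Plan`, `Relation`, `Project`, and `collapse` below are hypothetical stand-ins that only mimic the shape of Catalyst's tree API, and the rule ignores aliases, non-determinism, and the common-expression limit discussed in this PR.

```scala
// Toy plan tree (assumption: a simplified stand-in, not Spark's LogicalPlan).
sealed trait Plan {
  def children: Seq[Plan]
  def withChildren(cs: Seq[Plan]): Plan

  // Post-order: rewrite the children first, then apply the rule to the result.
  def transformUp(rule: PartialFunction[Plan, Plan]): Plan = {
    val afterChildren = withChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(afterChildren, identity[Plan])
  }

  // Pre-order: apply the rule to this node first, then recurse into the
  // children of whatever the rule returned.
  def transformDown(rule: PartialFunction[Plan, Plan]): Plan = {
    val afterRule = rule.applyOrElse(this, identity[Plan])
    afterRule.withChildren(afterRule.children.map(_.transformDown(rule)))
  }
}

case class Relation(name: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(cs: Seq[Plan]): Plan = this
}

case class Project(cols: Seq[String], child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withChildren(cs: Seq[Plan]): Plan = copy(child = cs.head)
}

object TraversalDemo {
  // Hypothetical rule: merge two adjacent projects, keeping the upper column list.
  val collapse: PartialFunction[Plan, Plan] = {
    case Project(cols, Project(_, grandChild)) => Project(cols, grandChild)
  }

  // Three stacked projects over one relation.
  val stacked: Plan =
    Project(Seq("a"), Project(Seq("b"), Project(Seq("c"), Relation("t"))))

  val up: Plan = stacked.transformUp(collapse)
  val down: Plan = stacked.transformDown(collapse)

  def main(args: Array[String]): Unit = {
    println(up)   // all three projects merged in a single pass
    println(down) // one inner project is left for a later pass
  }
}
```

In this toy model a single `transformUp` pass merges the whole stack, while a single `transformDown` pass leaves `Project(c)` under the top project, because the rewritten node itself is not revisited. A fixed-point batch would still converge either way, so the sketch only shows that the choice of traversal changes what each individual pass sees.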




