[GitHub] [spark] rdblue commented on a change in pull request #30558: [SPARK-33612][SQL] Add dataSourceRewriteRules batch to Optimizer

GitBox Tue, 15 Dec 2020 09:34:38 -0800


rdblue commented on a change in pull request #30558:
URL: https://github.com/apache/spark/pull/30558#discussion_r543546112




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -185,6 +185,9 @@ abstract class Optimizer(catalogManager: CatalogManager)
       RemoveLiteralFromGroupExpressions,
       RemoveRepetitionFromGroupExpressions) :: Nil ++
     operatorOptimizationBatch) :+
+    // This batch rewrites data source plans and should be run after the operator
+    // optimization batch and before any batches that depend on stats.
+    Batch("Data Source Rewrite Rules", Once, dataSourceRewriteRules: _*) :+

Review comment:
   I don't think that `preCBORules` is a good name. This batch is for rewrites
   that need to happen after basic optimization simplifies expressions and
   pushes filters and projections. It also needs to run before early pushdown,
   which in turn needs to come before CBO. All of that happens before CBO, so
   the name doesn't capture what this batch is used for.

   A more descriptive name is `planRewriteRules`, because this batch rewrites
   plans after initial optimization but before the optimizer rules that must
   run after that rewrite, such as early pushdown and CBO.

   The name `postOperatorOptimizationRules` is okay, but not very descriptive.
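
   The ordering described in this comment can be illustrated with a minimal,
   self-contained sketch. It does not use Spark's real `Optimizer`, `Batch`, or
   `Rule` classes: `Plan`, `NamedRule`, `SketchOptimizer`, and every rule and
   batch name below are hypothetical stand-ins, and each "rule" only records
   its name so the execution order is visible. The `planRewriteRules` hook
   uses the name proposed above and is assumed to be empty by default.

```scala
// Hypothetical stand-ins for Catalyst concepts; these are NOT Spark's classes.
final case class Plan(steps: Vector[String])

trait Rule {
  def name: String
  // A real rule would transform the plan; this one only records that it ran.
  def apply(plan: Plan): Plan = plan.copy(steps = plan.steps :+ name)
}

final case class NamedRule(name: String) extends Rule

final case class Batch(name: String, rules: Seq[Rule])

class SketchOptimizer {
  // Extension point under discussion (assumed empty unless overridden).
  def planRewriteRules: Seq[Rule] = Nil

  // Batch order as described in the review comment:
  // operator optimization -> plan rewrite -> early pushdown -> CBO.
  def batches: Seq[Batch] = Seq(
    Batch("Operator Optimization",
      Seq(NamedRule("SimplifyExpressions"), NamedRule("PushFiltersAndProjections"))),
    Batch("Plan Rewrite Rules", planRewriteRules),
    Batch("Early Pushdown", Seq(NamedRule("EarlyPushdown"))),
    Batch("Cost-Based Optimization", Seq(NamedRule("CostBasedRule")))
  )

  // Run every batch once, in order, applying its rules left to right.
  def execute(plan: Plan): Plan =
    batches.foldLeft(plan) { (p, batch) =>
      batch.rules.foldLeft(p)((q, rule) => rule(q))
    }
}

object BatchOrderSketch {
  // Plug one hypothetical rewrite rule into the hook and return the run order.
  def run(): Vector[String] = {
    val optimizer = new SketchOptimizer {
      override def planRewriteRules: Seq[Rule] =
        Seq(NamedRule("RewriteDataSourcePlan"))
    }
    optimizer.execute(Plan(Vector.empty)).steps
  }
}
```

   With the hook overridden, `BatchOrderSketch.run()` lists the hypothetical
   rewrite rule after the operator optimization rules and before early
   pushdown and the cost-based rule, which is the placement the comment argues
   the name should convey.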




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



