rdblue commented on a change in pull request #30558:
URL: https://github.com/apache/spark/pull/30558#discussion_r543546112
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -185,6 +185,9 @@ abstract class Optimizer(catalogManager: CatalogManager)
RemoveLiteralFromGroupExpressions,
RemoveRepetitionFromGroupExpressions) :: Nil ++
operatorOptimizationBatch) :+
+ // This batch rewrites data source plans and should be run after the operator
+ // optimization batch and before any batches that depend on stats.
+ Batch("Data Source Rewrite Rules", Once, dataSourceRewriteRules: _*) :+
Review comment:
I don't think that `preCBORules` is a good name. This batch is for
rewrites that need to happen after basic optimization simplifies expressions
and then pushes filters and projections. It also needs to run before early
pushdown, which in turn needs to come before CBO. All of those steps come
before CBO, so that name doesn't capture what this batch is used for.

A more descriptive name is "planRewriteRules" because this batch is for rewriting
plans after initial optimization, but before other optimizer rules that need to
run after that rewrite, like early pushdown and CBO.

The name "postOperatorOptimizationRules" is okay, but not very descriptive.
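
To make the ordering argument concrete, here is a minimal sketch of the batch sequence described above. It is a simplified stand-in, not the real Catalyst API: `Batch` here is a plain case class, the batch and rule names are strings used only as labels, and `planRewriteRules` is the hypothetical extension point under the name proposed in this comment.

```scala
// Simplified stand-in for optimizer batches; this is NOT Spark's Batch class.
final case class Batch(name: String, rules: Seq[String])

object BatchOrderSketch {
  // Hypothetical extension point, using the name proposed in the review.
  // The rule label is illustrative only.
  def planRewriteRules: Seq[String] = Seq("ExampleDataSourceRewrite")

  // Order described in the comment: operator optimization, then plan
  // rewrites, then early pushdown, then cost-based optimization (CBO).
  def batches: Seq[Batch] = Seq(
    Batch("Operator Optimization", Seq("SimplifyExpressions", "PushFilters")),
    Batch("Data Source Rewrite Rules", planRewriteRules),
    Batch("Early Pushdown", Seq("ExamplePushdown")),
    Batch("CBO", Seq("ExampleCostBasedRule"))
  )

  // Position of a batch in the sequence, or -1 if it is absent.
  def indexOf(name: String): Int = batches.indexWhere(_.name == name)
}
```

In this sketch every batch after "Operator Optimization" runs before CBO, so a name like `preCBORules` would describe the rewrite batch and the early pushdown batch equally well.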