cloud-fan commented on a change in pull request #30808:
URL: https://github.com/apache/spark/pull/30808#discussion_r544787602



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/SparkSessionExtensions.scala
##########
@@ -200,19 +200,20 @@ class SparkSessionExtensions {
     optimizerRules += builder
   }
 
-  private[this] val dataSourceRewriteRules = mutable.Buffer.empty[RuleBuilder]
+  private[this] val postOperatorOptimizationRules = mutable.Buffer.empty[RuleBuilder]
 
-  private[sql] def buildDataSourceRewriteRules(session: SparkSession): Seq[Rule[LogicalPlan]] = {
-    dataSourceRewriteRules.map(_.apply(session)).toSeq
+  private[sql] def buildPostOperatorOptimizationRules(
+      session: SparkSession): Seq[Rule[LogicalPlan]] = {
+    postOperatorOptimizationRules.map(_.apply(session)).toSeq
   }
 
   /**
-   * Inject an optimizer `Rule` builder that rewrites data source plans into the [[SparkSession]].
+   * Inject an optimizer `Rule` builder that rewrites logical plans into the [[SparkSession]].
    * The injected rules will be executed after the operator optimization batch and before rules
    * that depend on stats.
    */
-  def injectDataSourceRewriteRule(builder: RuleBuilder): Unit = {
-    dataSourceRewriteRules += builder
+  def injectPostOperatorOptimizationRule(builder: RuleBuilder): Unit = {

Review comment:
       It's unfortunate that we don't have a clear separation between RBO and 
CBO. There are RBO rules both before and after the only CBO rule, 
`CostBasedJoinReorder`.
   
   I think the general idea of this batch is to allow people to inject special 
optimizer rules that can't be run together with the main operator optimizer 
batch. It's indeed a Spark-specific thing: the main operator optimizer batch 
is run repeatedly until it reaches a fixed point, while the new batch added 
here is run only once.
   
   It's really hard to pick a name here. To match the actual purpose and to 
stay general, how about `Phase 2 Optimizer Rules` or `Run Once Optimizer Rules`?
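
To illustrate the fixed-point versus run-once distinction made above, here is a minimal toy sketch. It is not Spark's actual `RuleExecutor`: a "plan" is modeled as a plain `Int`, and every name in it (`BatchSketch`, `runFixedPoint`, `runOnce`, `halveIfEven`, `addOne`) is hypothetical. It assumes that one reason a rule might not fit the main batch is that it is not idempotent, which is my reading rather than something stated in the PR.

```scala
// Toy model only: NOT Spark's RuleExecutor. A "plan" is just an Int here,
// and all names below are hypothetical, chosen for illustration.
object BatchSketch {
  type Plan = Int
  type Rule = Plan => Plan

  // Apply all rules in order, repeating until the plan stops changing
  // or maxIterations is reached (the fixed-point style of batch).
  def runFixedPoint(rules: Seq[Rule], plan: Plan, maxIterations: Int = 100): Plan = {
    var current = plan
    var iterations = 0
    var changed = true
    while (changed && iterations < maxIterations) {
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      changed = next != current
      current = next
      iterations += 1
    }
    current
  }

  // Apply all rules in order exactly once (the run-once style of batch).
  def runOnce(rules: Seq[Rule], plan: Plan): Plan =
    rules.foldLeft(plan)((p, rule) => rule(p))

  // Converging rule: eventually stops changing the plan.
  val halveIfEven: Rule = p => if (p != 0 && p % 2 == 0) p / 2 else p

  // Non-idempotent rule: changes the plan on every application.
  val addOne: Rule = p => p + 1

  def main(args: Array[String]): Unit = {
    // 40 -> 20 -> 10 -> 5, then one more pass detects no change.
    println(runFixedPoint(Seq(halveIfEven), 40))
    // Applied a single time: 5 -> 6.
    println(runOnce(Seq(addOne), 5))
    // Never converges; only the iteration cap stops it.
    println(runFixedPoint(Seq(addOne), 5, maxIterations = 10))
  }
}
```

In this sketch `halveIfEven` is safe in a fixed-point batch because it converges, while `addOne` keeps rewriting the plan until the iteration cap is hit, so it only behaves predictably when run once.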



