[GitHub] [spark] gatorsmile commented on a change in pull request #30558: [SPARK-33612][SQL] Add dataSourceRewriteRules batch to Optimizer

GitBox Tue, 01 Dec 2020 20:17:11 -0800


gatorsmile commented on a change in pull request #30558:
URL: https://github.com/apache/spark/pull/30558#discussion_r533885837




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -185,6 +185,9 @@ abstract class Optimizer(catalogManager: CatalogManager)
       RemoveLiteralFromGroupExpressions,
       RemoveRepetitionFromGroupExpressions) :: Nil ++
     operatorOptimizationBatch) :+
+    // This batch rewrites data source plans and should be run after the operator
+    // optimization batch and before any batches that depend on stats.
+    Batch("Data Source Rewrite Rules", Once, dataSourceRewriteRules: _*) :+

Review comment:
   Basically, what you want is to add an extension point (a batch) between the heuristics-based optimizer and the cost-based optimizer.
   
   The batch name and the code comment do not look good to me. We need a better name here.
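
   To make the ordering in the comment above concrete, here is a minimal sketch in plain Scala. It is a toy model, not Spark code: `Plan`, `Rule`, `Batch`, the three example rules, and the batch name "Extension Point" are all hypothetical stand-ins, and each batch is simply run once. It only shows that a rule injected at this position sees the output of the heuristic rules, and that the stats-dependent step sees the rewritten plan.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
object BatchOrderSketch {
  type Plan = List[String]   // a "plan" is just a list of operator names
  type Rule = Plan => Plan   // a rule rewrites one plan into another

  final case class Batch(name: String, rules: Seq[Rule])

  // Stand-in for a heuristic rule: drop operators that do nothing.
  val removeNoOps: Rule = _.filterNot(_ == "NoOp")

  // Hypothetical injected rule: replace a generic scan with a source-specific one.
  val rewriteScan: Rule = _.map(op => if (op == "Scan") "PushedDownScan" else op)

  // Stand-in for a stats-dependent rule: annotate whichever scan it finds.
  val costBased: Rule = _.map(op => if (op.endsWith("Scan")) s"$op[stats]" else op)

  // The extension point sits after the heuristic batch and before the cost-based one.
  def batches(injected: Seq[Rule]): Seq[Batch] = Seq(
    Batch("Operator Optimization", Seq(removeNoOps)),
    Batch("Extension Point", injected),
    Batch("Cost-Based Optimization", Seq(costBased)))

  // Run every batch once, in order, applying its rules left to right.
  def optimize(plan: Plan, injected: Seq[Rule]): Plan =
    batches(injected).foldLeft(plan) { (p, batch) =>
      batch.rules.foldLeft(p)((q, rule) => rule(q))
    }
}
```

   With `rewriteScan` injected, the stats annotation lands on the rewritten scan rather than the original one, which is the property the batch placement is meant to guarantee.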




