HyukjinKwon commented on a change in pull request #23556: [SPARK-26626][SQL] 
Maximum size for repeatedly substituted aliases in SQL expressions
URL: https://github.com/apache/spark/pull/23556#discussion_r267675884
 
 

 ##########
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ##########
 @@ -658,7 +658,8 @@ object CollapseProject extends Rule[LogicalPlan] {
 
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
 
 Review comment:
   So, basically what you want to do is to skip the rule under a certain condition, because processing a huge tree causes an OOM issue in the driver only. Am I correct?
   
   What's the difference between setting the threshold `spark.sql.maxRepeatedAliasSize` to a specific number based on a rough estimation, versus explicitly excluding the rule via `spark.sql.optimizer.excludedRules` based on the user's rough estimation?
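   To illustrate the alternative being suggested: a user who hits the OOM could already disable the rule entirely with the existing exclusion config, without a new threshold. A minimal sketch (assuming a running `SparkSession` named `spark`; `spark.sql.optimizer.excludedRules` is the existing config, `spark.sql.maxRepeatedAliasSize` is the one proposed in this PR):

```scala
// Option A (existing): exclude CollapseProject wholesale for this session,
// so alias substitution never blows up the expression tree.
spark.conf.set(
  "spark.sql.optimizer.excludedRules",
  "org.apache.spark.sql.catalyst.optimizer.CollapseProject")

// Option B (this PR's proposal): keep the rule enabled but cap how large a
// repeatedly substituted alias may grow; the value is still a rough guess.
spark.conf.set("spark.sql.maxRepeatedAliasSize", "100")
```

   In both cases the user supplies a rough estimate; the question is whether a size threshold is meaningfully easier to pick than simply turning the rule off.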

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
