dongjoon-hyun commented on a change in pull request #29950:
URL: https://github.com/apache/spark/pull/29950#discussion_r522511754
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -1963,6 +1963,27 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
+  val MAX_COMMON_EXPRS_IN_COLLAPSE_PROJECT =
+    buildConf("spark.sql.optimizer.maxCommonExprsInCollapseProject")
+      .doc("An integer number indicates the maximum allowed number of common input expression " +
+        "from lower Project when being collapsed into upper Project by optimizer rule " +
+        "`CollapseProject`. Normally `CollapseProject` will collapse adjacent Project " +
+        "and merge expressions. But in some edge cases, expensive expressions might be " +
+        "duplicated many times in merged Project by this optimization. This config sets " +
+        "a maximum number. Once an expression is duplicated more than this number " +
+        "if merging two Project, Spark SQL will skip the merging. Note that normally " +
+        "in whole-stage codegen Project operator will de-duplicate expressions internally, " +
+        "but in edge cases Spark cannot do whole-stage codegen and fallback to interpreted " +
+        "mode. In such cases, users can use this config to avoid duplicate expressions. " +
+        "Note that even users exclude `CollapseProject` rule using " +
+        "`spark.sql.optimizer.excludedRules`, at physical planning phase Spark will still " +
+        "collapse projections. This config is also effective on collapsing projections in " +
+        "the physical planning.")
+      .version("3.1.0")
+      .intConf
+      .checkValue(_ > 0, "The value of maxCommonExprsInCollapseProject must be larger than zero.")
+      .createWithDefault(20)
Review comment:
If possible, can we introduce this configuration with `Int.MaxValue` in
`3.1.0` first? We can reduce it later.
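
For readers following along, here is a minimal sketch of the kind of check the doc string describes. It is a toy model, not the PR's implementation: `Expr`, `Ref`, `Call`, `maxDuplication`, and `shouldCollapse` are hypothetical names and do not use Catalyst classes. It also assumes that only non-trivial lower expressions count toward the limit, which is an assumption of this sketch. It illustrates why a default of `Int.MaxValue` would keep today's behavior (always collapse), while a small value such as 20 could change plans.

```scala
object CollapseProjectSketch {
  // Toy expression tree (hypothetical, not Catalyst's Expression).
  sealed trait Expr
  final case class Ref(name: String) extends Expr
  final case class Call(fn: String, args: Seq[Expr]) extends Expr

  // Add the per-name counts of two maps together.
  private def merge(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (k, v)) => acc.updated(k, acc.getOrElse(k, 0) + v) }

  // Count how many times each column name is referenced inside an expression.
  def refCounts(e: Expr): Map[String, Int] = e match {
    case Ref(n)        => Map(n -> 1)
    case Call(_, args) => args.map(refCounts).foldLeft(Map.empty[String, Int])(merge)
  }

  // Largest number of copies of any single lower-project expression that would
  // appear in the merged project. Plain column references in the lower project
  // are ignored here (assumption: only non-trivial expressions matter).
  def maxDuplication(upper: Seq[Expr], lower: Map[String, Expr]): Int = {
    val totals = upper.map(refCounts).foldLeft(Map.empty[String, Int])(merge)
    val counts = lower.collect { case (name, _: Call) => totals.getOrElse(name, 0) }
    if (counts.isEmpty) 0 else counts.max
  }

  // Collapse only while no expression is duplicated more than the limit.
  def shouldCollapse(upper: Seq[Expr], lower: Map[String, Expr], maxCommonExprs: Int): Boolean =
    maxDuplication(upper, lower) <= maxCommonExprs

  def main(args: Array[String]): Unit = {
    // Lower project: j = from_json(a), b = b
    val lower = Map[String, Expr]("j" -> Call("from_json", Seq(Ref("a"))), "b" -> Ref("b"))
    // Upper project: three expressions that each read j once.
    val upper = Seq[Expr](
      Call("f", Seq(Ref("j"))),
      Call("g", Seq(Ref("j"))),
      Call("h", Seq(Ref("j"), Ref("b"))))
    println(maxDuplication(upper, lower))
    println(shouldCollapse(upper, lower, 2))
    println(shouldCollapse(upper, lower, Int.MaxValue))
  }
}
```

In this model, a limit of `Int.MaxValue` can never be exceeded, so `shouldCollapse` always returns `true` and the two projects are always merged, which matches the behavior before this config existed.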