gatorsmile commented on a change in pull request #23285: [SPARK-26224][SQL] Warn and advice the user when creating many project on subsequent calls to withColumn
URL: https://github.com/apache/spark/pull/23285#discussion_r247584595
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##########
 @@ -1621,6 +1621,17 @@ object SQLConf {
     .intConf
     .createWithDefault(25)
 
+  val MAX_WITHCOLUMN_PROJECTS = buildConf("spark.sql.withColumn.maxProjects")
+    .internal()
+    .doc("Maximum number of projects on top of a plan when using `Dataset.withColumn`. When the " +
+      "threshold is exceeded, warnings are emitted in order to advice a different approach. " +
+      "This usually happens when adding too many columns in loops using withColumns. Indeed, " +
+      "this pattern can lead to serious performance issues and even OOM. Set to 0 in order to " +
+      "disable completely the check.")
+    .intConf
+    .checkValue(_ >= 0, "The max number of projects cannot be negative.")
+    .createWithDefault(50)
 
 Review comment:
   `50`? Before we do anything, could you first show the perf numbers?
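
   For readers unfamiliar with the check under discussion, the behaviour described in the `.doc(...)` text above can be sketched roughly as follows. This is a toy model, not the code in the pull request and not Spark's Catalyst classes: `Plan`, `Relation`, `Project`, `WithColumnCheck`, and the warning text are all hypothetical names introduced for illustration. It only assumes what the doc string states: projects stacked on a plan are counted, a warning is produced once the count exceeds the threshold, and 0 disables the check.

```scala
// Toy plan model (hypothetical; NOT Spark's Catalyst classes).
sealed trait Plan
case object Relation extends Plan
final case class Project(columns: Seq[String], child: Plan) extends Plan

object WithColumnCheck {
  // Count the Project nodes stacked directly on top of the plan.
  @annotation.tailrec
  def stackedProjects(plan: Plan, acc: Int = 0): Int = plan match {
    case Project(_, child) => stackedProjects(child, acc + 1)
    case Relation          => acc
  }

  // Return a warning once the count exceeds the threshold; 0 disables the check.
  def check(plan: Plan, maxProjects: Int): Option[String] = {
    require(maxProjects >= 0, "The max number of projects cannot be negative.")
    val n = stackedProjects(plan)
    if (maxProjects > 0 && n > maxProjects)
      Some(s"$n projects stacked on the plan (threshold $maxProjects); consider a single select")
    else None
  }

  // Mimic a looped withColumn: each call wraps the plan in one more Project.
  def withColumn(plan: Plan, name: String): Plan = plan match {
    case Project(cols, _) => Project(cols :+ name, plan)
    case Relation         => Project(Seq(name), plan)
  }
}
```

   Folding 60 calls of `WithColumnCheck.withColumn` over `Relation` yields 60 stacked projects, so `check(plan, 50)` returns a warning while `check(plan, 0)` returns `None`. Whether 50 is a sensible default is the open question here, and a sketch like this says nothing about the actual performance cost.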

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 