dongjoon-hyun commented on code in PR #36304:
URL: https://github.com/apache/spark/pull/36304#discussion_r989616936
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -412,6 +412,21 @@ object SQLConf {
.longConf
.createWithDefault(67108864L)
+  val RUNTIME_ROW_LEVEL_OPERATION_GROUP_FILTER_ENABLED =
+    buildConf("spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled")
+      .doc("Enables runtime group filtering for group-based row-level operations. " +
+        "Data sources that replace groups of data (e.g. files, partitions) may prune entire " +
+        "groups using provided data source filters when planning a row-level operation scan. " +
+        "However, such filtering is limited as not all expressions can be converted into data " +
+        "source filters and some expressions can only be evaluated by Spark (e.g. subqueries). " +
+        "Since rewriting groups is expensive, Spark can execute a query at runtime to find what " +
+        "records match the condition of the row-level operation. The information about matching " +
+        "records will be passed back to the row-level operation scan, allowing data sources to " +
+        "discard groups that don't have to be rewritten.")
+      .version("3.4.0")
+      .booleanConf
+      .createWithDefault(true)
Review Comment:
I agree with starting with `true` in this case.
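
For context, a minimal sketch of how a user could override this default per session (assumes a running `SparkSession` named `spark` on Spark 3.4+; the table name and predicate are hypothetical, and the data source must support group-based row-level operations):

```scala
// Sketch only: disable runtime group filtering for this session,
// e.g. to avoid the extra runtime query on very selective conditions.
spark.conf.set(
  "spark.sql.optimizer.runtime.rowLevelOperationGroupFilter.enabled", "false")

// A group-based row-level operation (hypothetical table) will now rely solely
// on pushed-down data source filters to prune groups, without the runtime
// query that finds matching records.
spark.sql("DELETE FROM dev.db.events WHERE status = 'expired'")
```

With the default of `true`, Spark instead runs a query at planning time to find matching records and passes that information to the scan, so data sources can skip rewriting groups with no matches.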
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]