[GitHub] [spark] cloud-fan commented on a change in pull request #29804: [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

GitBox Mon, 05 Oct 2020 23:29:52 -0700


cloud-fan commented on a change in pull request #29804:
URL: https://github.com/apache/spark/pull/29804#discussion_r500033528




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -951,6 +951,17 @@ object SQLConf {
     .checkValue(_ > 0, "the value of spark.sql.sources.bucketing.maxBuckets must be greater than 0")
     .createWithDefault(100000)
 
+  val AUTO_BUCKETED_SCAN_ENABLED =
+    buildConf("spark.sql.sources.bucketing.autoBucketedScan.enabled")
+      .doc("When true, decide whether to do bucketed scan on input tables based on query plan " +
+        "automatically. Do not use bucketed scan if 1. query does not have operators to utilize " +
+        "bucketing (e.g. join, group-by, etc), or 2. there's an exchange operator between these " +
+        s"operators and table scan. Note when '${BUCKETING_ENABLED.key}' is set to " +
+        "false, this configuration does not take any effect.")
+      .version("3.1.0")
+      .booleanConf
+      .createWithDefault(false)

Review comment:
       We can follow the AQE approach and disable this only for table cache. See 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala#L82
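
For illustration only, here is a rough sketch of the idea using public APIs: keep the flag on for normal queries and turn it off in the session that plans cached tables. This is not the linked CacheManager code, which is not reproduced here. The object name `AutoBucketedScanSketch` and the use of `newSession()` as a stand-in for the cache-planning session are assumptions, and the config key only has an effect on a Spark build that includes this change.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: per-session control of the flag proposed in this PR.
object AutoBucketedScanSketch {
  // Key copied from the diff above.
  val AutoBucketedScanKey = "spark.sql.sources.bucketing.autoBucketedScan.enabled"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("auto-bucketed-scan-sketch")
      .getOrCreate()

    // Regular queries opt in to the automatic decision.
    spark.conf.set(AutoBucketedScanKey, "true")

    // Assumption: a separate session stands in for the one used to plan cached tables.
    // newSession() shares the SparkContext but keeps its own SQL configuration.
    val cacheSession = spark.newSession()
    cacheSession.conf.set(AutoBucketedScanKey, "false")

    println(s"query session: ${spark.conf.get(AutoBucketedScanKey)}")
    println(s"cache session: ${cacheSession.conf.get(AutoBucketedScanKey)}")

    spark.stop()
  }
}
```

The point is only that the flag can differ per session, so a cached plan could keep its bucketed output partitioning while other queries still get the automatic decision.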



