[GitHub] [spark] maropu commented on a change in pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

GitBox Sun, 09 May 2021 05:55:54 -0700


maropu commented on a change in pull request #32482:
URL: https://github.com/apache/spark/pull/32482#discussion_r628886799




##########
File path: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala
##########
@@ -1554,4 +1554,39 @@ class CachedTableSuite extends QueryTest with SQLTestUtils
       assert(!spark.catalog.isCached(viewName))
     }
   }
+
+  test("SPARK-35332: Make cache plan disable configs configurable") {

Review comment:
       Could you add tests for more patterns? e.g., 
   ```
   sql("""SET spark.sql.cache.disableConfigs=spark.sql.adaptive.enabled""")
   sql("CACHE TABLE test_table1 AS <query 1>")
   spark.table("test_table1").explain(true) // AQE disabled
   
   sql("""SET spark.sql.cache.disableConfigs=""")
   sql("CACHE TABLE test_table2 AS <query 2>")
   spark.table("test_table2").explain(true) // AQE enabled
   spark.table("test_table1").explain(true) // AQE disabled
   sql("CACHE TABLE test_table3 AS <query 1>")
   spark.table("test_table3").explain(true) // AQE disabled
   ```
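
To make the expected outcomes above concrete, here is a toy model in plain Scala. It is only a sketch: `Plan`, `ToySession`, and `cacheTable` are hypothetical names, not Spark APIs, and it assumes (as the annotations above suggest) that caching the same query a second time reuses the plan that was built when the query was first cached.

```scala
// Toy model only: none of these types exist in Spark.
object CacheConfigSketch {
  // A "plan" here just records whether AQE was active when it was built.
  final case class Plan(query: String, aqeEnabled: Boolean)

  final class ToySession {
    // Hypothetical stand-in for spark.sql.cache.disableConfigs.
    var disableConfigs: Set[String] = Set.empty

    // Plans keyed by query text (assumption: the same query reuses its cached plan).
    private val cachedPlans = scala.collection.mutable.Map.empty[String, Plan]
    private val tables = scala.collection.mutable.Map.empty[String, Plan]

    def cacheTable(name: String, query: String): Unit = {
      // AQE is off for a newly cached plan only if it is listed in disableConfigs.
      val aqe = !disableConfigs.contains("spark.sql.adaptive.enabled")
      val plan = cachedPlans.getOrElseUpdate(query, Plan(query, aqe))
      tables(name) = plan
    }

    def table(name: String): Plan = tables(name)
  }

  def main(args: Array[String]): Unit = {
    val s = new ToySession
    s.disableConfigs = Set("spark.sql.adaptive.enabled")
    s.cacheTable("test_table1", "<query 1>")
    s.disableConfigs = Set.empty
    s.cacheTable("test_table2", "<query 2>")
    s.cacheTable("test_table3", "<query 1>")
    Seq("test_table1", "test_table2", "test_table3").foreach { t =>
      println(s"$t -> AQE enabled: ${s.table(t).aqeEnabled}")
    }
  }
}
```

The real test would of course inspect the output of `explain(true)` or the executed plan rather than a flag on a toy class.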

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -1090,6 +1090,18 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val CACHE_DISABLE_CONFIGS =
+    buildConf("spark.sql.cache.disableConfigs")
+      .doc("Configurations needs to be turned off, to avoid regression for cached query, so that " +
+        "the outputPartitioning of the underlying cached query plan can be leveraged later.")
+      .version("3.2.0")
+      .stringConf
+      .toSequence
+      .checkValue(_.forall(v => sqlConfEntries.containsKey(v) &&
+        sqlConfEntries.get(v).defaultValue.exists(_.isInstanceOf[Boolean])),
+        "config should be boolean type")
+      .createWithDefault(Seq(ADAPTIVE_EXECUTION_ENABLED.key, AUTO_BUCKETED_SCAN_ENABLED.key))

Review comment:
       Is there any use case for turning off these rules separately? Otherwise, I think a single boolean flag would be enough here.
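
For context on what the list-based option has to validate, here is a minimal sketch of the `checkValue` logic in the hunk above, written in plain Scala. `Entry`, `registry`, `parse`, and `isValid` are hypothetical stand-ins rather than Spark's internals, the default values are placeholders chosen for illustration, and `parse` only approximates what `toSequence` does.

```scala
// Sketch only: a stand-in registry instead of SQLConf's sqlConfEntries.
object DisableConfigsValidationSketch {
  // Hypothetical model of a registered config entry.
  final case class Entry(key: String, defaultValue: Option[Any])

  // Placeholder defaults; they are illustrative, not Spark's actual defaults.
  val registry: Map[String, Entry] = Seq(
    Entry("spark.sql.adaptive.enabled", Some(true)),
    Entry("spark.sql.sources.bucketing.autoBucketedScan.enabled", Some(true)),
    Entry("spark.sql.shuffle.partitions", Some(200))
  ).map(e => e.key -> e).toMap

  // Split a comma-separated value into trimmed, non-empty keys.
  def parse(value: String): Seq[String] =
    value.split(",").map(_.trim).filter(_.nonEmpty).toSeq

  // Every key must be registered and must have a boolean default value.
  def isValid(keys: Seq[String]): Boolean =
    keys.forall { k =>
      registry.get(k).exists(_.defaultValue.exists(_.isInstanceOf[Boolean]))
    }

  def main(args: Array[String]): Unit = {
    Seq("spark.sql.adaptive.enabled", "spark.sql.shuffle.partitions", "no.such.key", "")
      .foreach(v => println(s"'$v' valid: ${isValid(parse(v))}"))
  }
}
```

A single boolean flag would need none of this validation, which is part of the trade-off being asked about.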

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -1090,6 +1090,18 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val CACHE_DISABLE_CONFIGS =
+    buildConf("spark.sql.cache.disableConfigs")
+      .doc("Configurations needs to be turned off, to avoid regression for cached query, so that " +
+        "the outputPartitioning of the underlying cached query plan can be leveraged later.")

Review comment:
       I think this description is written for developers, so it is difficult for a user to understand. Could you brush it up a bit more?



