[spark] branch master updated: [SPARK-45385][SQL] Deprecate `spark.sql.parser.escapedStringLiterals`

dongjoon Sun, 01 Oct 2023 16:33:22 -0700

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new cdebe7885f7 [SPARK-45385][SQL] Deprecate `spark.sql.parser.escapedStringLiterals`
cdebe7885f7 is described below

commit cdebe7885f78a0437e815e14349744431f9a1c36
Author: Max Gekk <[email protected]>
AuthorDate: Sun Oct 1 16:32:00 2023 -0700

    [SPARK-45385][SQL] Deprecate `spark.sql.parser.escapedStringLiterals`
    
    ### What changes were proposed in this pull request?
    In this PR, I propose to deprecate the SQL config `spark.sql.parser.escapedStringLiterals` and add it to the list of deprecated configs, `SQLConf.deprecatedSQLConfigs`. I also modified a test to check that the recommendation in the deprecation message actually works.
    
    ### Why are the changes needed?
    The config switches string-literal parsing back to the legacy behaviour of Spark 1.6, which is old and no longer maintained. Deprecating the config now and removing it in a future version should improve code maintenance. There is also an alternative: raw string literals (the `r` prefix), in which Spark does not specially handle escaped character sequences.
    
    Removing the SQL config in the future should also simplify the descriptions of functions and expressions such as `LIKE`; see [link](https://spark.apache.org/docs/latest/api/sql/#like).
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    By running the modified test suite:
    ```
    $ build/sbt "test:testOnly *DatasetSuite"
    ```
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No.
    
    Closes #43187 from MaxGekk/deprecate-escapedStringLiterals.
    
    Authored-by: Max Gekk <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 .../org/apache/spark/sql/internal/SQLConf.scala    |  4 +++-
 .../scala/org/apache/spark/sql/DatasetSuite.scala  | 23 ++++++++++++++--------
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 0cad85e1296..bc2be84d7be 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -4514,7 +4514,9 @@ object SQLConf {
       DeprecatedConfig(LEGACY_REPLACE_DATABRICKS_SPARK_AVRO_ENABLED.key, "3.2",
         """Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead."""),
       DeprecatedConfig(COALESCE_PARTITIONS_MIN_PARTITION_NUM.key, "3.2",
-        s"Use '${COALESCE_PARTITIONS_MIN_PARTITION_SIZE.key}' instead.")
+        s"Use '${COALESCE_PARTITIONS_MIN_PARTITION_SIZE.key}' instead."),
+      DeprecatedConfig(ESCAPED_STRING_LITERALS.key, "4.0",
+        "Use raw string literals with the `r` prefix instead. ")
     )
 
     Map(configs.map { cfg => cfg.key -> cfg } : _*)
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
index 32469534978..04e619fa908 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
@@ -1871,14 +1871,21 @@ class DatasetSuite extends QueryTest
   }
 
   test("SPARK-20399: do not unescaped regex pattern when ESCAPED_STRING_LITERALS is enabled") {
-    withSQLConf(SQLConf.ESCAPED_STRING_LITERALS.key -> "true") {
-      val data = Seq("\u0020\u0021\u0023", "abc")
-      val df = data.toDF()
-      val rlike1 = df.filter("value rlike '^\\x20[\\x20-\\x23]+$'")
-      val rlike2 = df.filter($"value".rlike("^\\x20[\\x20-\\x23]+$"))
-      val rlike3 = df.filter("value rlike '^\\\\x20[\\\\x20-\\\\x23]+$'")
-      checkAnswer(rlike1, rlike2)
-      assert(rlike3.count() == 0)
+    Seq(
+      true ->
+        ("value rlike '^\\x20[\\x20-\\x23]+$'", "value rlike '^\\\\x20[\\\\x20-\\\\x23]+$'"),
+      false ->
+        ("value rlike r'^\\x20[\\x20-\\x23]+$'", "value rlike r'^\\\\x20[\\\\x20-\\\\x23]+$'")
+    ).foreach { case (escaped, (filter1, filter3)) =>
+      withSQLConf(SQLConf.ESCAPED_STRING_LITERALS.key -> escaped.toString) {
+        val data = Seq("\u0020\u0021\u0023", "abc")
+        val df = data.toDF()
+        val rlike1 = df.filter(filter1)
+        val rlike2 = df.filter($"value".rlike("^\\x20[\\x20-\\x23]+$"))
+        val rlike3 = df.filter(filter3)
+        checkAnswer(rlike1, rlike2)
+        assert(rlike3.count() == 0)
+      }
     }
   }
 

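The modified test depends on how many backslashes reach the regex engine. The following is a minimal sketch of that regex-level behaviour in plain Scala, without Spark. It assumes that `rlike` follows `java.util.regex` semantics, and that both the `r` prefix and the legacy config pass the backslashes in the SQL text through unchanged. The object name `EscapeLayersSketch` and the helper `matching` are hypothetical and exist only for illustration.

```scala
import java.util.regex.Pattern

object EscapeLayersSketch {
  // The Scala source "\\x20" is the four characters \x20 at runtime,
  // so this regex means: a space followed by one or more chars in U+0020..U+0023.
  val pattern: String = "^\\x20[\\x20-\\x23]+$"

  // With doubled backslashes the regex starts with an escaped backslash,
  // so it looks for a literal '\' character instead of a space.
  val doubled: String = "^\\\\x20[\\\\x20-\\\\x23]+$"

  // The filter text the test uses when the legacy config is off:
  // the r prefix marks a raw string literal in the SQL expression.
  val rawFilter: String = "value rlike r'" + pattern + "'"

  // Same sample rows as the test in the diff.
  val data: Seq[String] = Seq("\u0020\u0021\u0023", "abc")

  // Hypothetical stand-in for the rlike filter: keep rows where the regex is found.
  def matching(regex: String): Seq[String] = {
    val compiled = Pattern.compile(regex)
    data.filter(row => compiled.matcher(row).find())
  }

  def main(args: Array[String]): Unit = {
    println(rawFilter)
    println(matching(pattern).size) // rows kept by the single-backslash pattern
    println(matching(doubled).size) // rows kept by the doubled-backslash pattern
  }
}
```

Under these assumptions, the single-backslash pattern keeps only the first row and the doubled pattern keeps none, which corresponds to the `checkAnswer(rlike1, rlike2)` and `rlike3.count() == 0` assertions in the test.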

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
