cloud-fan commented on code in PR #43203:
URL: https://github.com/apache/spark/pull/43203#discussion_r1347515797
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala:
##########
@@ -92,11 +92,14 @@ abstract class StringRegexExpression extends BinaryExpression
      _ matches any one character in the input (similar to . in posix regular expressions)\
      % matches zero or more characters in the input (similar to .* in posix regular expressions)<br><br>
-    Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order
-    to match "\abc", the pattern should be "\\abc".<br><br>
+    Since Spark 2.0, string literals are unescaped in our SQL parser, see the unescaping
+    rules at <a href="https://spark.apache.org/docs/latest/sql-ref-literals.html#string-literal">String Literal</a>.
+    For example, in order to match "\abc", the pattern should be "\\abc".<br><br>
      When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back
      to Spark 1.6 behavior regarding string literal parsing. For example, if the config is
-    enabled, the pattern to match "\abc" should be "\abc".
+    enabled, the pattern to match "\abc" should be "\abc".<br><br>
+    The `pattern` argument might be a raw string literal (with the `r` prefix) to avoid
Review Comment:
I think we should be more strongly opinionated here:
```
It's recommended to use a raw string literal (with the `r` prefix) to avoid
escaping special characters in the pattern string, if any.
```
Then we can add some examples that use raw string literals (the behavior should be the same as turning off `escapedStringLiterals`).
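
For illustration, the escaping pitfall that the raw-string recommendation addresses can be sketched in Python, whose `r"..."` literals behave analogously to the `r'...'` SQL syntax under discussion (this is an analogy, not Spark's implementation):

```python
import re

# Raw string literals are not processed for escape sequences by the parser.
plain = "\\abc"   # the parser unescapes \\ into a single backslash
raw = r"\abc"     # raw literal: the backslash is kept exactly as written
assert plain == raw

# To match a literal backslash with a regex pattern, a plain string needs
# four backslashes (parser halves them, then the regex engine halves again);
# a raw string needs only two.
assert re.match("\\\\abc", "\\abc")
assert re.match(r"\\abc", "\\abc")
```

This mirrors the docstring's point: with default parsing, matching "\abc" requires doubling every backslash in the pattern, whereas a raw string literal keeps the pattern readable.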
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]