iffyio commented on code in PR #1735:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1735#discussion_r1974849855


##########
src/dialect/mod.rs:
##########
@@ -201,6 +201,33 @@ pub trait Dialect: Debug + Any {
         false
     }
 
+    /// Determine whether the dialect strips the backslash when escaping LIKE wildcards (%, _).
+    ///
+    /// [MySQL] has a special case when escaping single quoted strings which leaves these unescaped
+    /// so they can be used in LIKE patterns without double-escaping (as is necessary in other
+    /// escaping dialects, such as [Snowflake]). Generally, special characters have escaping rules
+    /// causing them to be replaced with a different byte sequence (e.g. `'\0'` becoming the zero
+    /// byte), and the default if an escaped character does not have a specific escaping rule is to
+    /// strip the backslash (e.g. there is no rule for `h`, so `'\h' = 'h'`). MySQL's special case
+    /// for ignoring LIKE wildcard escapes is to *not* strip the backslash, so that `'\%' = '\\%'`.
+    /// This applies to all string literals though, not just those used in LIKE patterns.
+    ///
+    /// ```text
+    /// mysql> select '\_', hex('\\'), hex('_'), hex('\_');
+    /// +----+-----------+----------+-----------+
+    /// | \_ | hex('\\') | hex('_') | hex('\_') |
+    /// +----+-----------+----------+-----------+
+    /// | \_ | 5C        | 5F       | 5C5F      |
+    /// +----+-----------+----------+-----------+
+    /// 1 row in set (0.00 sec)
+    /// ```
+    ///
+    /// [MySQL]: https://dev.mysql.com/doc/refman/8.4/en/string-literals.html
+    /// [Snowflake]: https://docs.snowflake.com/en/sql-reference/functions/like#usage-notes
+    fn ignores_like_wildcard_escapes(&self) -> bool {

Review Comment:
   ```suggestion
       fn ignores_wildcard_escapes(&self) -> bool {
   ```
   Maybe we drop the `like` part? As the doc comment notes, this is not
   specific to the `LIKE` syntax; it is general string literal escape
   behavior.
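
For reference, the rule the doc comment describes can be sketched roughly as below. This is only an illustration under stated assumptions: `unescape_body` and its `ignores_wildcard_escapes` parameter are hypothetical names, not the tokenizer's actual code, and only a few escape rules (`\0`, `\n`, `\t`) are included.

```rust
/// Hypothetical helper (not the sqlparser-rs implementation): unescapes the
/// body of a single-quoted string literal whose quotes were already removed.
fn unescape_body(body: &str, ignores_wildcard_escapes: bool) -> String {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            // Characters with a dedicated rule map to a different byte sequence.
            Some('0') => out.push('\0'),
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            // MySQL-style special case: keep the backslash before % and _.
            Some(w @ ('%' | '_')) if ignores_wildcard_escapes => {
                out.push('\\');
                out.push(w);
            }
            // Default: no specific rule, so strip the backslash.
            Some(other) => out.push(other),
            // Trailing lone backslash: kept as-is (an assumption of this sketch).
            None => out.push('\\'),
        }
    }
    out
}
```

With the flag set, `'\h'` yields `h` while `'\_'` stays `\_`, and `'\%'` and `'\\%'` produce the same value, matching the MySQL session quoted in the doc comment. The check never looks at whether the literal is used in a `LIKE` pattern, which is the reason for the rename suggestion.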



##########
src/tokenizer.rs:
##########
@@ -807,6 +807,9 @@ pub struct Tokenizer<'a> {
     /// If true (the default), the tokenizer will un-escape literal
     /// SQL strings See [`Tokenizer::with_unescape`] for more details.
     unescape: bool,
+    /// If true, the tokenizer will not escape % and _, for use in LIKE patterns. See
+    /// [`Dialect::ignores_like_wildcard_escapes`] for more details.
+    ignore_like_wildcard_escapes: bool,

Review Comment:
   Was there a reason to store this value here vs. relying solely on the
   dialect via `self.dialect.ignores_like_wildcard_escapes()` when needed?
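
A rough sketch of the alternative, assuming the tokenizer already holds a reference to its dialect. The types below are simplified, hypothetical stand-ins for the real `Dialect` and `Tokenizer` (`GenericLike`, `MySqlLike`, and `keeps_wildcard_backslash` are illustrative names only):

```rust
// Simplified stand-in for the real `Dialect` trait.
trait Dialect {
    /// Defaults to false, as in the diff above.
    fn ignores_like_wildcard_escapes(&self) -> bool {
        false
    }
}

// Hypothetical dialect that keeps the default behavior.
struct GenericLike;
impl Dialect for GenericLike {}

// Hypothetical dialect that opts in to the MySQL-style behavior.
struct MySqlLike;
impl Dialect for MySqlLike {
    fn ignores_like_wildcard_escapes(&self) -> bool {
        true
    }
}

// Simplified stand-in for the real `Tokenizer`: no cached flag field.
struct Tokenizer<'a> {
    dialect: &'a dyn Dialect,
}

impl<'a> Tokenizer<'a> {
    fn new(dialect: &'a dyn Dialect) -> Self {
        Self { dialect }
    }

    /// Asks the dialect on demand instead of reading a stored copy.
    fn keeps_wildcard_backslash(&self) -> bool {
        self.dialect.ignores_like_wildcard_escapes()
    }
}
```

This keeps the dialect as the single source of truth, at the cost of a dynamic call each time the flag is read.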



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

