iffyio commented on code in PR #1705:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/pull/1705#discussion_r1942311180


##########
src/dialect/mod.rs:
##########
@@ -880,6 +880,15 @@ pub trait Dialect: Debug + Any {
     fn supports_table_hints(&self) -> bool {
         false
     }
+
+    /// Returns whether it's the start of a single line comment
+    /// e.g. MySQL requires a space after `--` to be a single line comment
+    /// Otherwise it's interpreted as a double minus operator
+    ///
+    /// MySQL: 
<https://dev.mysql.com/doc/refman/8.4/en/ansi-diff-comments.html>
+    fn requires_whitespace_to_start_comment(&self) -> bool {

Review Comment:
   ```suggestion
       fn requires_single_line_comment_whitespace(&self) -> bool {
   ```
   thinking something like this to flag that this is only for the `--` comment 
style



##########
src/dialect/mysql.rs:
##########
@@ -125,6 +125,15 @@ impl Dialect for MySqlDialect {
     fn supports_table_hints(&self) -> bool {
         true
     }
+
+    /// Returns whether it's the start of a single line comment
+    /// e.g. MySQL requires a space after `--` to be a single line comment
+    /// Otherwise it's interpreted as a double minus operator
+    ///

Review Comment:
   Since this is already duplicated at the definition of the dialect method, we 
could consider not including it (The link to the doc on the other hand is 
useful in any case)



##########
src/dialect/mod.rs:
##########
@@ -880,6 +880,15 @@ pub trait Dialect: Debug + Any {
     fn supports_table_hints(&self) -> bool {
         false
     }
+
+    /// Returns whether it's the start of a single line comment

Review Comment:
   ```suggestion
       /// Returns true if this dialect requires a whitespace after `--` for 
single line comments.
   ```



##########
src/tokenizer.rs:
##########
@@ -1229,15 +1229,31 @@ impl<'a> Tokenizer<'a> {
                 // operators
                 '-' => {
                     chars.next(); // consume the '-'
-                    match chars.peek() {
-                        Some('-') => {
-                            chars.next(); // consume the second '-', starting 
a single-line comment
+
+                    if let Some('-') = chars.peek() {
+                        let mut is_comment = true;
+                        if self.dialect.requires_whitespace_to_start_comment() 
{
+                            // MySQL requires a space after the -- for a 
single-line comment
+                            // Otherwise it's interpreted as two minus signs
+                            // e.g. UPDATE account SET balance=balance--1
+                            //      WHERE account_id=5752;

Review Comment:
   Can we move the example to the doc string in the dialect method? That would 
help clarify the behavior to folks without having to dig deeper in the docs or 
the parser code.
   Then for the rest of the comment here we can probably skip since we've 
already documented as much in the dialect method



##########
src/tokenizer.rs:
##########
@@ -1229,15 +1229,31 @@ impl<'a> Tokenizer<'a> {
                 // operators
                 '-' => {
                     chars.next(); // consume the '-'
-                    match chars.peek() {
-                        Some('-') => {
-                            chars.next(); // consume the second '-', starting 
a single-line comment
+
+                    if let Some('-') = chars.peek() {

Review Comment:
   would it be possible to have this remain as a match statement, calling 
`chars.peek()` only once? i.e.
   ```rust
   match chars.peek() {
       Some('-') => { ... }
       Some('>') => { ... }
    }
   ```



##########
tests/sqlparser_mysql.rs:
##########
@@ -3244,3 +3244,62 @@ fn parse_double_precision() {
         "CREATE TABLE foo (bar DOUBLE(11,0))",
     );
 }
+
+#[test]
+fn parse_looks_like_single_line_comment() {
+    // See https://dev.mysql.com/doc/refman/8.4/en/ansi-diff-comments.html
+    match mysql().parse_sql_statements(
+        r#"
+            UPDATE account SET balance=balance--1
+            WHERE account_id=5752
+        "#,

Review Comment:
   can we use `mysql().verified_stmt()` for the tests?



##########
src/tokenizer.rs:
##########
@@ -1229,15 +1229,31 @@ impl<'a> Tokenizer<'a> {
                 // operators
                 '-' => {
                     chars.next(); // consume the '-'
-                    match chars.peek() {
-                        Some('-') => {
-                            chars.next(); // consume the second '-', starting 
a single-line comment
+
+                    if let Some('-') = chars.peek() {
+                        let mut is_comment = true;
+                        if self.dialect.requires_whitespace_to_start_comment() 
{
+                            // MySQL requires a space after the -- for a 
single-line comment
+                            // Otherwise it's interpreted as two minus signs
+                            // e.g. UPDATE account SET balance=balance--1
+                            //      WHERE account_id=5752;
+                            match chars.peekable.clone().nth(1) {
+                                Some(' ') => (),
+                                _ => is_comment = false,
+                            }

Review Comment:
   ```suggestion
   is_comment = Some(' ') == chars.peekable.clone().nth(1);
   ```
   looks like this can be simplified?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to