MohamedAbdeen21 commented on code in PR #1835:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1835#discussion_r2082424681
##########
src/tokenizer.rs:
##########
@@ -1281,20 +1262,91 @@ impl<'a> Tokenizer<'a> {
return Ok(Some(Token::make_word(s.as_str(), None)));
}
} else if prev_token == Some(&Token::Period) {
- // If the previous token was a period, thus not belonging to a number,
- // the value we have is part of an identifier.
+ // Handle as word if it follows a period
return Ok(Some(Token::make_word(s.as_str(), None)));
}
}
+ // Handle "L" suffix for long numbers
let long = if chars.peek() == Some(&'L') {
chars.next();
true
} else {
false
};
+
+ // Return the final token for the number
Ok(Some(Token::Number(s, long)))
}
+
+ // Period (`.`) handling
+ '.' => {
+ chars.next(); // consume the dot
+
+ match chars.peek() {
+ // Handle "._" case as a period followed by identifier
+ // if the last token was a word
+ Some('_') if matches!(prev_token, Some(Token::Word(_))) => {
+ Ok(Some(Token::Period))
+ }
+ Some('_') => {
+ self.tokenizer_error(
+ chars.location(),
+ "Unexpected underscore here".to_string(),
+ )
+ }
Review Comment:
Without these errors, '._123' as a whole would be parsed as a number, which
is not valid in any standard AFAICT.
'._abc' as a whole would also be parsed as a word, which doesn't make sense
because the previous token was not a word.
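
To make the two cases concrete, here is a minimal sketch of the decision the new `'.'` arm makes when the next character is an underscore. `Tok` and `after_dot` are hypothetical stand-ins written for illustration, not the sqlparser-rs API, and the fallthrough arm is a simplification (the real tokenizer does more work for digits and other characters).

```rust
// Hypothetical, simplified token type; NOT the real sqlparser `Token`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Tok {
    Word(String),
    Number(String),
    Period,
}

/// Decide what to emit after consuming a `.`, given the previous token
/// and the next character (assumed helper, for illustration only).
fn after_dot(prev: Option<&Tok>, next: Option<char>) -> Result<Tok, String> {
    match next {
        // `word._ident`: the dot is a separator, the identifier follows.
        Some('_') if matches!(prev, Some(Tok::Word(_))) => Ok(Tok::Period),
        // `._123` or `._abc` with no preceding word: reject instead of
        // silently producing a number or a word.
        Some('_') => Err("Unexpected underscore here".to_string()),
        // Simplification: everything else is treated as a plain period.
        _ => Ok(Tok::Period),
    }
}
```

Under these assumptions, `after_dot(Some(&Tok::Word("t".into())), Some('_'))` yields a period, while the same call with no previous token, or with a previous number, returns the error.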
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]