Re: [PR] Prepare tokenizer for using borrowed strings instead of allocations. [datafusion-sqlparser-rs]

via GitHub Thu, 06 Nov 2025 07:43:45 -0800


eyalleshem commented on code in PR #2073:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2073#discussion_r2499531757



##########
src/tokenizer.rs:
##########
@@ -912,18 +926,21 @@ impl<'a> Tokenizer<'a> {
     fn tokenize_identifier_or_keyword(
         &self,
         ch: impl IntoIterator<Item = char>,
-        chars: &mut State,
+        chars: &mut State<'a>,
     ) -> Result<Option<Token>, TokenizerError> {
         chars.next(); // consume the first char
-        let ch: String = ch.into_iter().collect();
-        let word = self.tokenize_word(ch, chars);
+                      // Calculate total byte length without allocating a String
+        let consumed_byte_len: usize = ch.into_iter().map(|c| c.len_utf8()).sum();

Review Comment:
   Agreed, I'll move it out.
   The downside is that callers now need to calculate the UTF-8 byte length of
   their characters themselves.
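
   To illustrate what that caller-side calculation could look like, here is a
   minimal sketch. The helper names `utf8_byte_len` and `borrow_consumed` are
   hypothetical and not part of the PR; the sketch only assumes that the caller
   holds the source text as a `&str` and knows the starting byte offset.

```rust
/// Hypothetical helper (not from the PR): total UTF-8 byte length of a
/// sequence of chars, computed without collecting them into a `String`.
fn utf8_byte_len(chars: impl IntoIterator<Item = char>) -> usize {
    chars.into_iter().map(char::len_utf8).sum()
}

/// Hypothetical helper (not from the PR): borrow the already-consumed text
/// from the source instead of allocating a new `String`.
/// Panics if the resulting range does not fall on char boundaries.
fn borrow_consumed(source: &str, start: usize, consumed_byte_len: usize) -> &str {
    &source[start..start + consumed_byte_len]
}
```

   A caller that consumed the chars `ß` and `日` from the start of its input
   would pass `utf8_byte_len("ß日".chars())` as the length, which covers both
   multi-byte characters, where a plain char count would not.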



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

