eyalleshem commented on issue #2036:
URL: 
https://github.com/apache/datafusion-sqlparser-rs/issues/2036#issuecomment-3678852408

   **Progress Update: 60% Performance Improvement with Borrowed Tokenizer**
   
   I've opened PR #2136 (currently WIP) implementing zero-copy borrowing for 
strings, identifiers, and comments in the tokenizer: 
https://github.com/apache/datafusion-sqlparser-rs/pull/2136
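   
   To illustrate what zero-copy borrowing means here, below is a minimal, 
   std-only sketch. It is not the code from the PR: the `Tok` type and the 
   `tokenize` function are hypothetical names invented for this example. Tokens 
   hold slices of the input, and a string literal is copied only when a `''` 
   escape has to be rewritten.
   
   ```rust
   use std::borrow::Cow;

   /// Hypothetical token type for illustration only; not sqlparser-rs's `Token`.
   #[derive(Debug, PartialEq)]
   enum Tok<'a> {
       /// Identifier or keyword, always a slice of the input.
       Word(&'a str),
       /// String literal: borrowed unless a `''` escape forces a rewrite.
       Str(Cow<'a, str>),
   }

   /// Tokenizes words and single-quoted strings, skipping everything else.
   fn tokenize(sql: &str) -> Vec<Tok<'_>> {
       let bytes = sql.as_bytes();
       let mut out = Vec::new();
       let mut i = 0;
       while i < bytes.len() {
           if bytes[i] == b'\'' {
               let start = i + 1;
               let mut end = start;
               let mut has_escape = false;
               // Scan to the closing quote (or end of input if unterminated).
               while end < bytes.len() {
                   if bytes[end] == b'\'' {
                       if bytes.get(end + 1) == Some(&b'\'') {
                           has_escape = true;
                           end += 2; // skip the doubled quote
                           continue;
                       }
                       break; // closing quote
                   }
                   end += 1;
               }
               let raw = &sql[start..end];
               out.push(Tok::Str(if has_escape {
                   // Allocation is needed only to unescape `''` into `'`.
                   Cow::Owned(raw.replace("''", "'"))
               } else {
                   // Common case: no allocation, just a view into `sql`.
                   Cow::Borrowed(raw)
               }));
               i = end + 1;
           } else if bytes[i].is_ascii_alphabetic() || bytes[i] == b'_' {
               let start = i;
               while i < bytes.len()
                   && (bytes[i].is_ascii_alphanumeric() || bytes[i] == b'_')
               {
                   i += 1;
               }
               out.push(Tok::Word(&sql[start..i]));
           } else {
               i += 1; // whitespace, punctuation, anything else
           }
       }
       out
   }
   ```
   
   With this shape, `tokenize("SELECT 'abc'")` allocates only the output 
   vector, while a literal such as `'it''s'` falls back to an owned `String`.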
   
   ## Performance Results
   
   Using `cargo bench` with Criterion on the same ~30K string query from my 
previous tests:
   ```
   tokenization/tokenize_complex_sql
       time:   [274.24 µs 274.74 µs 275.31 µs]
       change: [−59.937% −59.826% −59.716%] (p = 0.00 < 0.05)
       Performance has improved.
   ```
   
   **~60% less tokenization time** (from ~683 µs to ~275 µs, about 2.5x faster)
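   
   As a quick check of the arithmetic, here is a small sketch that converts 
   the reported change into a speedup factor and an implied baseline. It 
   assumes the `change` figure is the relative change in time, 
   `(new - old) / old`. The function names are made up for this example.
   
   ```rust
   /// Speedup factor implied by a relative change in time, given in percent.
   /// Assumes change = (new - old) / old, so a negative value means faster.
   fn speedup_from_change(change_pct: f64) -> f64 {
       1.0 / (1.0 + change_pct / 100.0)
   }

   /// Baseline time implied by the new time and the same relative change.
   fn baseline_from(new_time: f64, change_pct: f64) -> f64 {
       new_time / (1.0 + change_pct / 100.0)
   }
   ```
   
   Plugging in the middle estimates above (`274.74` µs and `-59.826`%) gives 
   a speedup of roughly 2.5x and a baseline of roughly 684 µs.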
   
   ## Why Different from Previous Measurements?
   
   My earlier manual timing measurements showed inconsistent results. I've now 
switched to `cargo bench` with Criterion, which provides:
   - **Warmup iterations** to reach steady-state performance
   - **Statistical analysis** that identifies and reports outliers
   - **Consistent, reproducible methodology**
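   
   For intuition about the outlier step, here is a std-only sketch of 
   Tukey-fence classification, the general technique that Criterion's 
   analysis is based on as far as I understand it. This is not Criterion's 
   code: the quartile method (linear interpolation) and the function names 
   are assumptions made for the example.
   
   ```rust
   /// Linearly interpolated percentile of a sorted, non-empty slice.
   /// `p` is a fraction in 0.0..=1.0.
   fn percentile(sorted: &[f64], p: f64) -> f64 {
       let rank = p * (sorted.len() - 1) as f64;
       let lo = rank.floor() as usize;
       let hi = rank.ceil() as usize;
       sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo as f64)
   }

   /// Returns the samples outside the fences [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
   fn tukey_outliers(samples: &[f64]) -> Vec<f64> {
       if samples.is_empty() {
           return Vec::new();
       }
       let mut sorted = samples.to_vec();
       sorted.sort_by(|a, b| a.total_cmp(b));
       let q1 = percentile(&sorted, 0.25);
       let q3 = percentile(&sorted, 0.75);
       let iqr = q3 - q1;
       let (low, high) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);
       samples
           .iter()
           .copied()
           .filter(|x| *x < low || *x > high)
           .collect()
   }
   ```
   
   A single slow run (for example, one 400 µs sample among runs near 275 µs) 
   lands outside the fences and gets flagged, which a one-off manual timing 
   cannot reveal.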
   
   ## Reproducibility
   
   The benchmark is included in PR #2136. To run the comparison locally:
   - **With borrowing**: PR #2136
   - **Without borrowing (baseline)**: 
https://github.com/eyalleshem/sqlparser-rs/tree/benchmark_base
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

