Mazen-Ghanaym commented on PR #3000:
URL: 
https://github.com/apache/datafusion-comet/pull/3000#issuecomment-3694126240

   > Thank you for the PR, @Mazen-Ghanaym. Any reason why we can't make 
DataFusion's version faster here?
   
   I spent around 4 days trying to optimize this. Here is the short version of 
the journey:
   Day 1-2: Tried DataFusion built-ins. Started with `ScalarFunctionExpr` using 
DataFusion's native `starts_with`/`ends_with`. Result: 1.0X, which only matched 
Spark, with no improvement.
   
   Day 2: Tried unsafe Rust with direct byte access. Attempted raw pointer 
manipulation for maximum speed. Failed due to compatibility issues.
   
   Day 3: Tried safe Rust with stdlib slices. Used `.starts_with()` and 
`.ends_with()` on string slices with manual iteration. Result: 0.9X, actually 
slower than Spark. I think this is due to iterator overhead.
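   
   For reference, a minimal sketch of what such a per-row approach can look 
like. This is not the code from the PR: the function names are hypothetical, 
and a slice of `Option<&str>` stands in for a nullable string column.

```rust
/// Per-row prefix check using the standard library's `str::starts_with`.
/// `None` models a NULL row and is passed through unchanged.
/// (Hypothetical helper, for illustration only.)
fn starts_with_per_row(rows: &[Option<&str>], pattern: &str) -> Vec<Option<bool>> {
    rows.iter()
        .map(|row| row.map(|s| s.starts_with(pattern)))
        .collect()
}

/// Per-row suffix check using `str::ends_with`, with the same NULL handling.
/// (Hypothetical helper, for illustration only.)
fn ends_with_per_row(rows: &[Option<&str>], pattern: &str) -> Vec<Option<bool>> {
    rows.iter()
        .map(|row| row.map(|s| s.ends_with(pattern)))
        .collect()
}
```

   Each row is handled as its own `&str`, so the pattern is re-examined once 
per row and the results are collected element by element.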
   
   Day 3-4: Arrow compute kernels + pre-allocated pattern. The breakthrough: I 
realized the overhead came from reprocessing the pattern on every batch. 
Pre-allocating the pattern as a `StringArray` once and calling Arrow's 
`compute::starts_with` directly gave us 1.1X for `startsWith`.
   
   For `endsWith`, Arrow's kernel was still slightly slower (0.9X), so I went 
with direct buffer access and manual suffix calculation to reach 1.0X parity.
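   
   As a rough illustration of "direct buffer access and manual suffix 
calculation" (a sketch, not the PR's code): the version below assumes an 
Arrow-style string layout with an `i32` offsets buffer and a contiguous byte 
buffer, ignores the null bitmap, and uses hypothetical function names.

```rust
/// Suffix check over an Arrow-style string column. `offsets` has one more
/// entry than there are rows, and row `i` occupies
/// `values[offsets[i]..offsets[i + 1]]`. Null handling is left out.
/// (Hypothetical helper, for illustration only.)
fn ends_with_buffers(offsets: &[i32], values: &[u8], suffix: &[u8]) -> Vec<bool> {
    let n = suffix.len();
    offsets
        .windows(2)
        .map(|w| {
            let (start, end) = (w[0] as usize, w[1] as usize);
            // A row shorter than the suffix can never match; otherwise
            // compare only the last `n` bytes of the row.
            end - start >= n && &values[end - n..end] == suffix
        })
        .collect()
}

/// Builds the offsets and values buffers from plain strings so the sketch
/// can be exercised without the Arrow crate. (Hypothetical helper.)
fn to_buffers(rows: &[&str]) -> (Vec<i32>, Vec<u8>) {
    let mut offsets = vec![0i32];
    let mut values = Vec::new();
    for row in rows {
        values.extend_from_slice(row.as_bytes());
        offsets.push(values.len() as i32);
    }
    (offsets, values)
}
```

   The suffix position is derived from each row's end offset, so only the 
last `suffix.len()` bytes of a row are ever read.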
   
   I tried to optimize further, but I don't know of any optimizations that can 
beat Java on such simple, direct operations.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

