Mazen-Ghanaym commented on PR #3000: URL: https://github.com/apache/datafusion-comet/pull/3000#issuecomment-3694126240
> Thank you for the PR @Mazen-Ghanaym . Any reason why we can't make the Datafusion's version faster here?

I spent around 4 days trying to optimize this, and here is the short story of the journey:

Day 1-2: Tried DataFusion built-ins
Started with `ScalarFunctionExpr` using DataFusion's native `starts_with`/`ends_with`. Result: 1.0X, which just matched Spark with no improvement.

Day 2: Tried unsafe Rust with direct byte access
Attempted raw pointer manipulation for maximum speed. Failed due to compatibility issues.

Day 3: Tried safe Rust with stdlib slices
Used `.starts_with()` and `.ends_with()` on string slices with manual iteration. Result: 0.9X, actually slower than Spark. I think it's due to iterator overhead.

Day 3-4: Arrow compute kernels + pre-allocated pattern
The breakthrough: I realized the overhead came from repeatedly processing the pattern each batch. Pre-allocating the pattern as a `StringArray` once and calling Arrow's `compute::starts_with` directly gave us 1.1X for `startsWith`. For `endsWith`, Arrow's kernel was still slightly slower (0.9X), so I went with direct buffer access and manual suffix calculation to reach 1.0X parity.

I tried to optimize further, but I don't know of any optimizations that can beat Java in these direct, simple operations.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
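
For readers following along, here is a minimal sketch of the "direct buffer access and manual suffix calculation" idea for `endsWith`. It is not the code from the PR. It uses only the Rust standard library, and `StringColumn`, `EndsWith`, `from_strs`, and `evaluate` are hypothetical names that stand in for an Arrow-style string array (one contiguous values buffer plus row offsets). Null handling is left out on purpose.

```rust
/// Hypothetical stand-in for an Arrow-style string column: one contiguous
/// UTF-8 values buffer plus `n + 1` offsets that mark the row boundaries.
struct StringColumn {
    values: Vec<u8>,
    offsets: Vec<usize>,
}

impl StringColumn {
    /// Packs the rows back to back and records where each one ends.
    fn from_strs(rows: &[&str]) -> Self {
        let mut values = Vec::new();
        let mut offsets = Vec::with_capacity(rows.len() + 1);
        offsets.push(0);
        for row in rows {
            values.extend_from_slice(row.as_bytes());
            offsets.push(values.len());
        }
        StringColumn { values, offsets }
    }
}

/// The pattern bytes are prepared once and reused for every batch,
/// instead of being re-processed per batch.
struct EndsWith {
    pattern: Vec<u8>,
}

impl EndsWith {
    fn new(pattern: &str) -> Self {
        EndsWith { pattern: pattern.as_bytes().to_vec() }
    }

    /// Returns one boolean per row of the batch.
    fn evaluate(&self, column: &StringColumn) -> Vec<bool> {
        let pat = self.pattern.as_slice();
        column
            .offsets
            .windows(2)
            .map(|w| {
                let (start, end) = (w[0], w[1]);
                // Manual suffix calculation: the row must be at least as long
                // as the pattern, and its last `pat.len()` bytes must equal it.
                end - start >= pat.len() && &column.values[end - pat.len()..end] == pat
            })
            .collect()
    }
}

fn main() {
    let matcher = EndsWith::new("fusion"); // built once, outside the batch loop
    let batch = StringColumn::from_strs(&["datafusion", "comet", "", "fusion", "ion"]);
    println!("{:?}", matcher.evaluate(&batch)); // [true, false, false, true, false]
}
```

Comparing raw bytes is enough here because both the rows and the pattern are valid UTF-8, so a byte-level suffix match is also a string-level suffix match. Whether this layout beats a generic kernel depends on the workload, so the ratios quoted above should be treated as specific to that benchmark.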
