Shekharrajak opened a new pull request, #2991:
URL: https://github.com/apache/datafusion-comet/pull/2991

   
   ## Which issue does this PR close?
   
   Closes #2972.
   
   
   ## Rationale for this change
   
   The `contains` expression performs poorly in Comet (about 0.2X relative to Spark) because DataFusion's `make_scalar_function` wrapper expands scalar patterns into arrays, which bypasses the optimized scalar path in arrow-rs.
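To illustrate the cost described above, here is a minimal, std-only sketch. It is a simplified model, not DataFusion's or arrow-rs's actual code: `contains_expanded` and `contains_scalar` are hypothetical names, plain string slices stand in for Arrow arrays, and `str::contains` stands in for the real search kernel.

```rust
/// Simplified model of the slow path: the scalar pattern is first
/// materialized into one owned copy per row, then compared row by row.
fn contains_expanded(haystacks: &[&str], pattern: &str) -> Vec<bool> {
    // Expansion step: allocates `haystacks.len()` copies of the pattern.
    let expanded: Vec<String> = vec![pattern.to_string(); haystacks.len()];
    haystacks
        .iter()
        .zip(expanded.iter())
        .map(|(h, p)| h.contains(p.as_str()))
        .collect()
}

/// Simplified model of the scalar path: the pattern is borrowed once
/// and reused for every row, with no per-row copies.
fn contains_scalar(haystacks: &[&str], pattern: &str) -> Vec<bool> {
    haystacks.iter().map(|h| h.contains(pattern)).collect()
}
```

Both functions return the same results; the difference is only in the per-row allocation and in the missed chance to prepare the pattern once for the whole batch.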
   
   ## What changes are included in this PR?
   
   * Add a `SparkContains` UDF with optimized scalar pattern handling, using `memchr::memmem::Finder` for SIMD-accelerated substring search
   * Register the function in `comet_scalar_funcs.rs` to override DataFusion's built-in `contains`
   * Add `contains` to `CometStringExpressionBenchmark`
   * Enhance the `contains` test in `CometExpressionSuite`
   
   ## How are these changes tested?
   
   * 4 new unit tests in `contains.rs` (array-scalar, scalar-scalar, null handling, empty pattern)
   * Enhanced integration test in `CometExpressionSuite.scala`
   * All 122 spark-expr tests pass
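The cases named in the unit tests above can be pictured with a small, std-only sketch. Everything here is an assumption made for illustration: `contains_nullable` is a hypothetical helper and not the code in `contains.rs`, `Option<&str>` stands in for nullable Arrow values, and `str::contains` stands in for `memchr::memmem::Finder`. The null and empty-pattern semantics shown (any null input yields null, an empty pattern matches every non-null string) are what Spark's `contains` is understood to do, not something this message states.

```rust
/// Hypothetical helper: nullable `contains` over a column with a scalar pattern.
fn contains_nullable(
    haystacks: &[Option<&str>],
    pattern: Option<&str>,
) -> Vec<Option<bool>> {
    match pattern {
        // A null pattern makes every result null.
        None => vec![None; haystacks.len()],
        // A non-null pattern is borrowed once and reused for every row;
        // null rows stay null, and an empty pattern matches any string.
        Some(p) => haystacks
            .iter()
            .map(|h| h.map(|s| s.contains(p)))
            .collect(),
    }
}
```

Calling the helper with a one-element slice covers the scalar-scalar case, so a single code path serves both input shapes in this sketch.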


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]