alamb commented on issue #8051:
URL: 
https://github.com/apache/arrow-datafusion/issues/8051#issuecomment-1979747257

   BTW, now that @jayzhan211 and I have implemented `ScalarUDF::simplify` in 
https://github.com/apache/arrow-datafusion/pull/9298 and we have ported the 
regular_expression functions to use `ScalarUDF`, I think we could actually use 
that API to implement precompiled functions.
   
   Not sure if that would meet your requirements, @thinkharderdev 
   
   For example, to implement "precompiled regexp functions" we could do 
something like this (would be sweet if someone wanted to prototype this): 
   
   ```rust
   /// A new UDF that holds a precompiled pattern
   struct PrecompiledRegexpReplace {
     precompiled_match: Arc<Pattern>,
   }
   
   impl ScalarUDFImpl for PrecompiledRegexpReplace {
     // the invoke function uses `self.precompiled_match` directly
     ...
   }
   
   
   // Update the existing RegexpReplace function to implement `simplify`
   impl ScalarUDFImpl for RegexpReplace {
   
     /// If the pattern argument is a scalar, rewrite the function to a new
     /// scalar UDF that contains a pre-compiled regular expression
     fn simplify(&self) .. {
       match (args[1], args[2]) {
         (ScalarValue::Utf8(pattern), ScalarValue::Utf8(flags)) => {
           let precompiled_match = // create regexp match from `pattern` and `flags`
           Simplified::Rewritten(
             ScalarUDF::new(PrecompiledRegexpReplace { precompiled_match })
               .call(args),
           )
         },
         _ => Simplified::Original(args),
       }
     }
   }
   ```
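
To make the idea concrete without depending on DataFusion, here is a std-only toy model of the same rewrite. Everything in it is hypothetical and for illustration only: `Pattern` is a stand-in that does plain substring replacement rather than real regex matching, `Expr`, `Func`, `Simplified`, `simplify` and `eval` are simplified inventions and not the DataFusion API, and the argument layout `[input, pattern, replacement]` (no flags) is an assumption. A global counter records how often a pattern gets "compiled".

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Counts calls to `Pattern::compile` (toy instrumentation only).
static COMPILATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a compiled regex: it only stores a literal needle.
struct Pattern {
    needle: String,
}

impl Pattern {
    fn compile(src: &str) -> Pattern {
        COMPILATIONS.fetch_add(1, Ordering::SeqCst);
        Pattern { needle: src.to_string() }
    }

    fn replace_all(&self, input: &str, with: &str) -> String {
        input.replace(&self.needle, with)
    }
}

/// Toy expression: a column reference (by index) or a string literal.
enum Expr {
    Column(usize),
    Literal(String),
}

/// Toy function node: the generic version, or one carrying a precompiled pattern.
enum Func {
    RegexpReplace,
    PrecompiledRegexpReplace(Arc<Pattern>),
}

/// Result of the toy simplification step.
enum Simplified {
    Rewritten(Func, Vec<Expr>),
    Original(Vec<Expr>),
}

/// If the pattern argument (args[1]) is a literal, compile it once at
/// "plan time" and switch to the precompiled function; otherwise do nothing.
fn simplify(args: Vec<Expr>) -> Simplified {
    let compiled = match args.get(1) {
        Some(Expr::Literal(p)) => Some(Arc::new(Pattern::compile(p))),
        _ => None,
    };
    match compiled {
        Some(p) => Simplified::Rewritten(Func::PrecompiledRegexpReplace(p), args),
        None => Simplified::Original(args),
    }
}

/// Resolve an expression against one row of string columns.
fn value(expr: &Expr, row: &[String]) -> String {
    match expr {
        Expr::Column(i) => row[*i].clone(),
        Expr::Literal(s) => s.clone(),
    }
}

/// Evaluate one row. The generic function compiles the pattern on every call;
/// the precompiled one reuses the stored pattern and ignores args[1].
fn eval(func: &Func, args: &[Expr], row: &[String]) -> String {
    let input = value(&args[0], row);
    let replacement = value(&args[2], row);
    match func {
        Func::RegexpReplace => {
            Pattern::compile(&value(&args[1], row)).replace_all(&input, &replacement)
        }
        Func::PrecompiledRegexpReplace(p) => p.replace_all(&input, &replacement),
    }
}
```

In this model, calling `simplify` on arguments whose pattern is a literal yields `Simplified::Rewritten`, and evaluating any number of rows afterwards compiles the pattern exactly once. When the pattern comes from a column, `simplify` returns `Simplified::Original` and the per-row path is kept.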
   
   We could then run some gnarly regular expression case, such as what is found 
on https://github.com/apache/arrow-datafusion/issues/8492 and see if it helps 
or not.
   
   If it doesn't help performance, then the extra complexity isn't worth it for 
`regexp_replace`.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
