zhuliquan commented on PR #11455:
URL: https://github.com/apache/datafusion/pull/11455#issuecomment-2248130513

   @alamb One idea: we could attach an LRU cache (key: the regex pattern, value: 
the compiled regex) to the `RegexpLikeFunc` struct. 
   ```rust
   #[derive(Debug)]
   pub struct RegexpLikeFunc {
       // Key: regex pattern string; value: the compiled regex.
       lru_cache: LruCache<String, regex::Regex>,
       signature: Signature,
   }
   impl Default for RegexpLikeFunc {
       fn default() -> Self {
           Self::new()
       }
   }
   
   impl RegexpLikeFunc {
       pub fn new() -> Self {
           use DataType::*;
           Self {
               signature: Signature::one_of(
                   vec![
                       Exact(vec![Utf8, Utf8]),
                       Exact(vec![LargeUtf8, Utf8]),
                       Exact(vec![Utf8, Utf8, Utf8]),
                       Exact(vec![LargeUtf8, Utf8, Utf8]),
                   ],
                   Volatility::Immutable,
               ),
               lru_cache: LruCache::new(NonZeroUsize::new(1024).unwrap()),
           }
       }
   }
   ```
   We can use the cache the same way `regexp_is_match_utf8` does in arrow-rs:
   
https://github.com/apache/arrow-rs/blob/af40ea382275dba967bfabc1632fded07d2129b9/arrow-string/src/regexp.rs#L50
   
https://github.com/apache/arrow-rs/blob/af40ea382275dba967bfabc1632fded07d2129b9/arrow-string/src/regexp.rs#L83-L95
   
   This makes full use of each compiled regex, and the approach works for both 
the scalar and array cases. 
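
A minimal sketch of the compile-once lookup, for illustration only. To stay dependency-free it uses a plain `HashMap` behind a `Mutex` rather than `LruCache` (so there is no eviction), and `Compiled` is a hypothetical stand-in for `regex::Regex`. `PatternCache`, `get_or_compile`, and `compile_count` are made-up names, not DataFusion or arrow-rs APIs. The `Mutex` reflects an assumption that the function is invoked through `&self`, so the cache would need interior mutability.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for `regex::Regex`; a real version would hold
/// the compiled regex instead of just the pattern text.
#[derive(Debug, Clone, PartialEq)]
struct Compiled {
    pattern: String,
}

/// Cache contents plus a counter of how many patterns were compiled.
#[derive(Debug, Default)]
struct CacheState {
    compiled: HashMap<String, Compiled>,
    compile_count: usize,
}

/// Hypothetical cache keyed by pattern string. No eviction is done here;
/// an LRU cache would additionally bound the number of entries.
#[derive(Debug, Default)]
struct PatternCache {
    state: Mutex<CacheState>,
}

impl PatternCache {
    /// Return the cached value for `pattern`, compiling it only on a miss.
    fn get_or_compile(&self, pattern: &str) -> Compiled {
        let mut state = self.state.lock().unwrap();
        if let Some(hit) = state.compiled.get(pattern) {
            return hit.clone();
        }
        // Cache miss: "compile" once and remember the result.
        state.compile_count += 1;
        let compiled = Compiled {
            pattern: pattern.to_string(),
        };
        state.compiled.insert(pattern.to_string(), compiled.clone());
        compiled
    }

    /// Number of cache misses, i.e. compilations performed so far.
    fn compile_count(&self) -> usize {
        self.state.lock().unwrap().compile_count
    }
}
```

For an array of patterns, each row would call `get_or_compile`, so repeated patterns are compiled once; the scalar case is the same lookup with a single pattern.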
   

