zhuliquan commented on PR #11455: URL: https://github.com/apache/datafusion/pull/11455#issuecomment-2248130513
@alamb I have an idea: we could attach an LRU cache (key: the regex pattern string, value: the compiled regex) to the `RegexpLikeFunc` struct.

```rust
use std::num::NonZeroUsize;

use arrow::datatypes::DataType;
use datafusion_expr::TypeSignature::Exact;
use datafusion_expr::{Signature, Volatility};
use lru::LruCache;

#[derive(Debug)]
pub struct RegexpLikeFunc {
    // Pattern string -> compiled regex, so each pattern is compiled once.
    lru_cache: LruCache<String, regex::Regex>,
    signature: Signature,
}

impl Default for RegexpLikeFunc {
    fn default() -> Self {
        Self::new()
    }
}

impl RegexpLikeFunc {
    pub fn new() -> Self {
        use DataType::*;
        Self {
            signature: Signature::one_of(
                vec![
                    Exact(vec![Utf8, Utf8]),
                    Exact(vec![LargeUtf8, Utf8]),
                    Exact(vec![Utf8, Utf8, Utf8]),
                    Exact(vec![LargeUtf8, Utf8, Utf8]),
                ],
                Volatility::Immutable,
            ),
            // Keep at most 1024 compiled patterns.
            lru_cache: LruCache::new(NonZeroUsize::new(1024).unwrap()),
        }
    }
}
```

We can use the cache in the same way `regexp_is_match_utf8` does:

https://github.com/apache/arrow-rs/blob/af40ea382275dba967bfabc1632fded07d2129b9/arrow-string/src/regexp.rs#L50
https://github.com/apache/arrow-rs/blob/af40ea382275dba967bfabc1632fded07d2129b9/arrow-string/src/regexp.rs#L83-L95

This makes full use of each compiled regex, and the same approach works for both the scalar and the array case.
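To make the lookup path concrete, here is a minimal std-only sketch of "compile on miss, reuse on hit" with LRU eviction. Everything in it is an assumption for illustration, not DataFusion code: `PatternCache` and `CachedMatcher` are hypothetical names, the "compiled" value is a lowercased `String` standing in for `regex::Regex`, and the hand-rolled `HashMap` + `VecDeque` stands in for `lru::LruCache`. The cache sits behind a `Mutex` on the assumption that the function is invoked through `&self` while an LRU lookup has to update recency order.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Tiny LRU cache from pattern string to a "compiled" value.
/// `V` stands in for `regex::Regex` (assumption for this sketch).
struct PatternCache<V> {
    capacity: usize,
    entries: HashMap<String, V>,
    /// Front = least recently used, back = most recently used.
    order: VecDeque<String>,
}

impl<V: Clone> PatternCache<V> {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Return the cached value for `pattern`; call `compile` only on a miss.
    fn get_or_compile<E>(
        &mut self,
        pattern: &str,
        compile: impl FnOnce(&str) -> Result<V, E>,
    ) -> Result<V, E> {
        if let Some(v) = self.entries.get(pattern) {
            let v = v.clone();
            self.touch(pattern);
            return Ok(v);
        }
        let v = compile(pattern)?;
        // Evict the least recently used entry when the cache is full.
        if self.entries.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(pattern.to_string(), v.clone());
        self.order.push_back(pattern.to_string());
        Ok(v)
    }

    /// Mark `pattern` as most recently used (linear scan; fine for a sketch).
    fn touch(&mut self, pattern: &str) {
        if let Some(pos) = self.order.iter().position(|p| p == pattern) {
            if let Some(key) = self.order.remove(pos) {
                self.order.push_back(key);
            }
        }
    }
}

/// Hypothetical stand-in for the function struct holding the cache.
struct CachedMatcher {
    cache: Mutex<PatternCache<String>>,
    /// Counts how often the (toy) compile step actually ran.
    compiles: AtomicUsize,
}

impl CachedMatcher {
    fn new(capacity: usize) -> Self {
        Self {
            cache: Mutex::new(PatternCache::new(capacity)),
            compiles: AtomicUsize::new(0),
        }
    }

    /// Toy matcher: case-insensitive substring test, NOT a real regex match.
    fn is_match(&self, value: &str, pattern: &str) -> bool {
        let needle = self
            .cache
            .lock()
            .unwrap()
            .get_or_compile(pattern, |p| {
                self.compiles.fetch_add(1, Ordering::Relaxed);
                Ok::<String, String>(p.to_lowercase())
            })
            .expect("toy compile step never fails");
        value.to_lowercase().contains(&needle)
    }

    fn compile_count(&self) -> usize {
        self.compiles.load(Ordering::Relaxed)
    }
}
```

The same `is_match` path would serve both cases: a scalar pattern hits the cache once per batch, while an array of patterns hits it once per row and only compiles the distinct ones. Whether a shared `Mutex` is acceptable under concurrent execution is an open question this sketch does not answer.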