viirya commented on code in PR #8631:
URL: https://github.com/apache/arrow-datafusion/pull/8631#discussion_r1435282739
##########
datafusion/physical-expr/src/regex_expressions.rs:
##########
@@ -78,6 +79,82 @@ pub fn regexp_match<T: OffsetSizeTrait>(args: &[ArrayRef])
-> Result<ArrayRef> {
}
}
+/// TODO: Remove this once it is included in arrow-rs new release.
+fn _regexp_match<OffsetSize: OffsetSizeTrait>(
+ array: &GenericStringArray<OffsetSize>,
+ regex_array: &GenericStringArray<OffsetSize>,
+ flags_array: Option<&GenericStringArray<OffsetSize>>,
+) -> std::result::Result<ArrayRef, ArrowError> {
+ let mut patterns: std::collections::HashMap<String, Regex> =
Review Comment:
Btw, to clarify in case reviewers overlook the long PR description: this
hash map is already in the arrow-rs kernel. The actual fix here is to avoid
the expensive cloning of `Regex` on every row. I explained the reason in the
PR description.
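
For readers who skip the description, here is a minimal sketch of the idea, not the actual kernel code. `CompiledPattern` and `match_rows` are hypothetical names, and `CompiledPattern` is only a stand-in for `regex::Regex` (it does a substring check) so the example builds with the standard library alone. The point is that each distinct pattern is compiled once into the map, and each row borrows the cached entry instead of cloning it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiled `regex::Regex`.
/// It deliberately does not implement `Clone`, so every row has to borrow it.
struct CompiledPattern {
    source: String,
}

impl CompiledPattern {
    /// Pretend this is the expensive compile step; `compiles` counts the calls.
    fn compile(source: &str, compiles: &mut usize) -> Self {
        *compiles += 1;
        CompiledPattern { source: source.to_string() }
    }

    /// Substring check used in place of real regex matching (an assumption).
    fn is_match(&self, value: &str) -> bool {
        value.contains(&self.source)
    }
}

/// Matches each value against the pattern on the same row.
/// Returns the per-row results and the number of compile calls.
fn match_rows(values: &[&str], patterns: &[&str]) -> (Vec<bool>, usize) {
    let mut cache: HashMap<String, CompiledPattern> = HashMap::new();
    let mut compiles = 0usize;
    let mut results = Vec::with_capacity(values.len());

    for (value, pattern) in values.iter().zip(patterns.iter()) {
        // Compile only when this pattern has not been seen before.
        if !cache.contains_key(*pattern) {
            let compiled = CompiledPattern::compile(pattern, &mut compiles);
            cache.insert(pattern.to_string(), compiled);
        }
        // Borrow the cached entry; nothing is cloned per row.
        let re = &cache[*pattern];
        results.push(re.is_match(value));
    }

    (results, compiles)
}
```

Calling `match_rows(&["foobar", "baz", "food"], &["foo", "foo", "foo"])` compiles the pattern a single time even though three rows are matched against it. The map lookup still happens per row; only the per-row clone is gone.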
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.