findepi commented on code in PR #6662:
URL: https://github.com/apache/arrow-rs/pull/6662#discussion_r1829979801


##########
arrow-string/src/predicate.rs:
##########
@@ -116,10 +116,17 @@ impl<'a> Predicate<'a> {
             }),
             Predicate::Contains(finder) => {
                 if let Some(string_view_array) = 
array.as_any().downcast_ref::<StringViewArray>() {
+                    let nulls = string_view_array.logical_nulls();
                     BooleanArray::from(
                         string_view_array
                             .bytes_iter()
-                            .map(|haystack| finder.find(haystack).is_some() != 
negate)
+                            .enumerate()
+                            .map(|(idx, haystack)| {
+                                if nulls.as_ref().map(|n| 
n.is_null(idx)).unwrap_or_default() {

Review Comment:
   Absolutely agreed that we should avoid perf hit if possible. Thanks @alamb 
for looking into this.
   OTOH, i don't see benefit from keeping this PR pending. The perf improvement 
will be on top of its current state anyway, starting from correct 
implementation and the additional tests.
   So if the point is to keep the PR "hostage" to encourage contributor to 
improve it, I understand that and perhaps would do the same. However, if you're 
willing to spend your own time improving it, you could as well do this after 
the merge 🙂 
   
   with this out, thanks again for looking into this ❤️ . i will try to improve 
the perf on my end too and if succeed, will report back
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to