albertlockett opened a new issue, #9870:
URL: https://github.com/apache/arrow-rs/issues/9870

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   In [otel-arrow](https://github.com/open-telemetry/otel-arrow), we're 
building a 
[query-engine](https://github.com/open-telemetry/otel-arrow/tree/main/rust/otap-dataflow/crates/query-engine)
 based on arrow & datafusion. There are two features we'd like to support:
   a) perform a case-insensitive match on a telemetry attribute's key (a string)
   b) perform a case-insensitive match on some other string value
   
   This was implemented in 
https://github.com/open-telemetry/otel-arrow/pull/2501 using `ilike` and 
escaping the special LIKE characters (`%`, `_`, `\`) (see 
[here](https://github.com/albertlockett/otel-arrow/blob/52aabc38a1cd153513ae171b7919e8c7b619c182/rust/otap-dataflow/crates/query-engine/src/pipeline/filter.rs#L387-L403)).
 This is not ideal when the text we're comparing against contains these special 
characters because, in that case, the comparison is done using a regex match 
(which is slower) instead of simply using `eq_ignore_ascii_case`:
   
https://github.com/apache/arrow-rs/blob/7ad2299e8cc1be2af4648fa7df412c3338fa3b3c/arrow-string/src/predicate.rs#L68-L81
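   
For context, the escaping workaround described above can be sketched roughly as follows. This is a standalone illustration using only the Rust standard library, not the code from the linked PR; `escape_like_pattern` is a hypothetical name, and the sketch assumes backslash is the LIKE escape character.

```rust
/// Hypothetical helper: escape the LIKE wildcards (`%`, `_`) and the escape
/// character itself (`\`) so that a pattern is matched literally.
/// Assumption: backslash is the escape character used by `ilike`.
fn escape_like_pattern(literal: &str) -> String {
    let mut escaped = String::with_capacity(literal.len());
    for ch in literal.chars() {
        // Prefix each special character with a backslash.
        if matches!(ch, '%' | '_' | '\\') {
            escaped.push('\\');
        }
        escaped.push(ch);
    }
    escaped
}
```

A key such as `http_status` becomes `http\_status`. The escaped pattern now contains a backslash, so it no longer looks like a plain literal, which is why the comparison takes the slower regex path rather than `eq_ignore_ascii_case`.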
   
   I'm thinking that if arrow-string simply exposed a way to evaluate 
`Predicate::IEqAscii` on my arrays, it would be simple for me to write a 
ScalarUDF to achieve what I need in my query-engine.
   
   **Describe the solution you'd like**
   
   I'd like us to expose a `like::eq_ignore_ascii_case` function from the 
arrow-string crate that does an equality comparison on two string `Datum`s using 
a case-insensitive ASCII match.
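   
To make the requested semantics concrete, here is a minimal sketch using only the Rust standard library. It is not the arrow-string API: the function name and signature are hypothetical, a slice of `Option<&str>` stands in for a nullable string array, and a plain `&str` stands in for a scalar `Datum`. Null propagation is an assumption, chosen to mirror how comparison kernels usually behave.

```rust
/// Hypothetical stand-in for the proposed kernel: compare every value in a
/// nullable string column against a scalar, ignoring ASCII case only.
/// Assumption: a null input yields a null output.
fn eq_ignore_ascii_case_column(values: &[Option<&str>], needle: &str) -> Vec<Option<bool>> {
    values
        .iter()
        // `str::eq_ignore_ascii_case` folds only A-Z/a-z; every other byte
        // must match exactly, so no pattern parsing or escaping is needed.
        .map(|value| value.map(|s| s.eq_ignore_ascii_case(needle)))
        .collect()
}
```

With this behavior, `Http.Method` and `HTTP.METHOD` both equal `http.method`, while `http_method` does not, because `_` and `.` are compared literally. Non-ASCII letters such as `É` and `é` are not treated as equal.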
   
   **Describe alternatives you've considered**
   
   - Use `ilike` and escape the pattern (can have performance overhead when 
there are special characters)
   - Duplicate the predicate code into my query-engine (fixing in arrow-string 
seemed like less work)
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
