stuartcarnie opened a new issue, #5154:
URL: https://github.com/apache/arrow-datafusion/issues/5154

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   IOx uses the `Dictionary(Int32, Utf8)` data type to represent tag columns. 
Filtering tag columns with regular expressions confuses users: some queries 
succeed and others fail, for reasons that are not obvious. Using `EXPLAIN` 
reveals why regular expressions such as the following succeed:
   
   ```sql
   SELECT usage_idle FROM cpu WHERE cpu ~ '9'
   ```
   
   In this case, the optimiser changes the filter plan from a regex conditional 
to `LIKE '%9%'`, and the `LIKE` operator has additional code to coerce 
dictionary types:
   
   
https://github.com/apache/arrow-datafusion/blob/031534d94efb305eb26a7c16fd7e06ae3bcd88bb/datafusion/expr/src/type_coercion/binary.rs#L522
   
   However, other cases unexpectedly fail:
   
   ```sql
   SELECT usage_idle FROM cpu WHERE cpu ~ '9$'
   ```
   
   as the optimiser does not rewrite this to a `LIKE` expression[^1], and 
regular expression operators use a string coercion rule:
   
   
https://github.com/apache/arrow-datafusion/blob/b6dbb8d8b896861d23dcc17a8a4b3e0e4276db7e/datafusion/expr/src/type_coercion/binary.rs#L141-L144
   
   [^1]: Incidentally, the optimiser could also rewrite this to `LIKE '%9'`
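
To illustrate the kind of rewrite discussed above, here is a minimal sketch of turning an anchored literal regex into a `LIKE` pattern. The function `regex_to_like` is hypothetical and is not DataFusion's optimiser code; it assumes only optional `^` / `$` anchors around plain literal text are supported, and gives up on anything else.

```rust
/// Hypothetical helper (not part of DataFusion): try to express a simple
/// regex as an equivalent LIKE pattern. Returns `None` for any pattern that
/// is more than optional `^` / `$` anchors around plain literal text.
fn regex_to_like(pattern: &str) -> Option<String> {
    // Strip an optional leading `^` anchor.
    let (anchored_start, rest) = match pattern.strip_prefix('^') {
        Some(r) => (true, r),
        None => (false, pattern),
    };
    // Strip an optional trailing `$` anchor.
    let (anchored_end, literal) = match rest.strip_suffix('$') {
        Some(r) => (true, r),
        None => (false, rest),
    };
    // Give up on regex metacharacters and on LIKE wildcards (`%`, `_`),
    // since either would change the meaning of the rewritten pattern.
    const SPECIAL: &str = r".*+?()[]{}|\^$%_";
    if literal.chars().any(|c| SPECIAL.contains(c)) {
        return None;
    }
    // An unanchored side becomes a `%` wildcard.
    let mut like = String::new();
    if !anchored_start {
        like.push('%');
    }
    like.push_str(literal);
    if !anchored_end {
        like.push('%');
    }
    Some(like)
}
```

Under these assumptions, `regex_to_like("9")` yields `%9%` and `regex_to_like("9$")` yields `%9`, while a pattern such as `9.*` is left as a regex.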
   
   
   **Describe the solution you'd like**
   
   Teach DataFusion how to coerce compatible `Dictionary(_, _)` types to a 
string type, such that the regular expression condition can succeed.
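
As a rough sketch of what such a rule might look like, the following uses a simplified stand-in for Arrow's `DataType` enum. The names `string_value_type` and `regex_coercion` are hypothetical and are not DataFusion's actual API; the idea is only that a dictionary whose value type is a string is treated as that string type before the regex operands are compared.

```rust
/// Simplified stand-in for Arrow's `DataType`, with only the variants
/// needed for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    LargeUtf8,
    Dictionary(Box<DataType>, Box<DataType>),
}

/// Hypothetical helper: return the string type behind `t`, looking through
/// a dictionary to its value type. Returns `None` for non-string types.
fn string_value_type(t: &DataType) -> Option<DataType> {
    match t {
        DataType::Utf8 | DataType::LargeUtf8 => Some(t.clone()),
        DataType::Dictionary(_, value) => string_value_type(value),
        _ => None,
    }
}

/// Hypothetical coercion rule for regex operators: both sides must resolve
/// to a string type; the wider of the two string types is the result.
fn regex_coercion(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (string_value_type(lhs)?, string_value_type(rhs)?) {
        (DataType::LargeUtf8, _) | (_, DataType::LargeUtf8) => Some(DataType::LargeUtf8),
        _ => Some(DataType::Utf8),
    }
}
```

With this sketch, a `Dictionary(Int32, Utf8)` column compared against a `Utf8` pattern coerces to `Utf8`, whereas an `Int32` column still yields no common type.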
   
   **Describe alternatives you've considered**
   
   No alternatives considered, given there is prior art for the `LIKE` 
operator, which also coerces `Dictionary(_, _)` types to strings:
   
   
https://github.com/apache/arrow-datafusion/blob/031534d94efb305eb26a7c16fd7e06ae3bcd88bb/datafusion/expr/src/type_coercion/binary.rs#L522
   
   

