appletreeisyellow opened a new pull request, #7262:
URL: https://github.com/apache/arrow-datafusion/pull/7262

   ## Which issue does this PR close?
   
   Closes #5471.
   
   ## Rationale for this change
   
   Running `upper(col)` where `col` is a dictionary-encoded column fails with an internal error:
   
   ```
   Internal error: The "upper" function can only accept strings.. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
   ```
   
   Other functions, such as `length` and `character_length`, have the same issue. The full list of affected functions:
   
   ```
   $ grep utf8_to_str_type
   
   datafusion/expr/src/built_in_function.rs
   600:                utf8_to_str_type(&input_expr_types[0], "btrim")
   628:                utf8_to_str_type(&input_expr_types[0], "initcap")
   630:            BuiltinScalarFunction::Left => utf8_to_str_type(&input_expr_types[0], "left"),
   632:                utf8_to_str_type(&input_expr_types[0], "lower")
   634:            BuiltinScalarFunction::Lpad => utf8_to_str_type(&input_expr_types[0], "lpad"),
   636:                utf8_to_str_type(&input_expr_types[0], "ltrim")
   638:            BuiltinScalarFunction::MD5 => utf8_to_str_type(&input_expr_types[0], "md5"),
   651:                utf8_to_str_type(&input_expr_types[0], "regex_replace")
   654:                utf8_to_str_type(&input_expr_types[0], "repeat")
   657:                utf8_to_str_type(&input_expr_types[0], "replace")
   660:                utf8_to_str_type(&input_expr_types[0], "reverse")
   663:                utf8_to_str_type(&input_expr_types[0], "right")
   665:            BuiltinScalarFunction::Rpad => utf8_to_str_type(&input_expr_types[0], "rpad"),
   667:                utf8_to_str_type(&input_expr_types[0], "rtrimp")
   711:                utf8_to_str_type(&input_expr_types[0], "split_part")
   718:                utf8_to_str_type(&input_expr_types[0], "substr")
   740:                utf8_to_str_type(&input_expr_types[0], "translate")
   742:            BuiltinScalarFunction::Trim => utf8_to_str_type(&input_expr_types[0], "trim"),
   744:                utf8_to_str_type(&input_expr_types[0], "upper")
   ```
   
   ```
   $ grep utf8_to_int_type
   
   datafusion/expr/src/built_in_function.rs
   597:                utf8_to_int_type(&input_expr_types[0], "bit_length")
   603:                utf8_to_int_type(&input_expr_types[0], "character_length")
   645:                utf8_to_int_type(&input_expr_types[0], "octet_length")
   715:                utf8_to_int_type(&input_expr_types[0], "strpos")
   ```
   
   ## What changes are included in this PR?
   
   Support the `Dictionary` data type as input to the string functions and int functions listed above.
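   
   As a rough illustration of the idea (not the actual patch), the return-type helpers can resolve a dictionary argument by looking at its value type. The sketch below is self-contained and uses a toy `DataType` enum as a hypothetical stand-in for Arrow's; the helper names are borrowed from the grep output above, but their bodies here, including the `Utf8` to `Int32` and `LargeUtf8` to `Int64` mapping, are assumptions made for the example.
   
   ```rust
   // Toy stand-in for Arrow's `DataType` (hypothetical, only the variants needed here).
   #[derive(Debug, Clone, PartialEq)]
   enum DataType {
       Int32,
       Int64,
       Utf8,
       LargeUtf8,
       Null,
       // Dictionary(key type, value type)
       Dictionary(Box<DataType>, Box<DataType>),
   }
   
   // Return type of a string-to-string function such as `upper`.
   // A dictionary argument is resolved through its value type.
   fn utf8_to_str_type(arg_type: &DataType, name: &str) -> Result<DataType, String> {
       match arg_type {
           DataType::Utf8 => Ok(DataType::Utf8),
           DataType::LargeUtf8 => Ok(DataType::LargeUtf8),
           DataType::Null => Ok(DataType::Null),
           DataType::Dictionary(_, value_type) => utf8_to_str_type(value_type, name),
           _ => Err(format!("The \"{}\" function can only accept strings.", name)),
       }
   }
   
   // Return type of a string-to-int function such as `character_length`.
   // The integer widths chosen here are an assumption of this sketch.
   fn utf8_to_int_type(arg_type: &DataType, name: &str) -> Result<DataType, String> {
       match arg_type {
           DataType::Utf8 => Ok(DataType::Int32),
           DataType::LargeUtf8 => Ok(DataType::Int64),
           DataType::Null => Ok(DataType::Null),
           DataType::Dictionary(_, value_type) => utf8_to_int_type(value_type, name),
           _ => Err(format!("The \"{}\" function can only accept strings.", name)),
       }
   }
   
   fn main() {
       // A dictionary column with Int32 keys and Utf8 values.
       let dict = DataType::Dictionary(Box::new(DataType::Int32), Box::new(DataType::Utf8));
       println!("{:?}", utf8_to_str_type(&dict, "upper"));
       println!("{:?}", utf8_to_int_type(&dict, "character_length"));
       // A non-string argument is still rejected.
       println!("{:?}", utf8_to_str_type(&DataType::Int64, "upper"));
   }
   ```
   
   Recursing on the value type means a dictionary whose values are not strings is still rejected with the same error as before.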
   
   ## Are these changes tested?
   
   Yes, tests are added for all the functions listed above.
   
   ## Are there any user-facing changes?
   
   Users will be able to call `upper` and the other string and int functions listed above on dictionary columns without hitting an internal error.

