XiangpengHao commented on code in PR #11662:
URL: https://github.com/apache/datafusion/pull/11662#discussion_r1693060380
##########
datafusion/functions/src/utils.rs:
##########
@@ -41,8 +41,8 @@ macro_rules! get_optimal_return_type {
DataType::LargeUtf8 | DataType::LargeBinary => $largeUtf8Type,
// Binary inputs are automatically coerced to Utf8
DataType::Utf8 | DataType::Binary => $utf8Type,
- // Utf8View inputs will yield Utf8View outputs
- DataType::Utf8View => DataType::Utf8View,
+ // Utf8View max offset size is u32::MAX, the same as UTF8
Review Comment:
This macro is used in string-related `ScalarUDF`s, for example:
https://github.com/XiangpengHao/datafusion/blob/string-view2-local/datafusion/functions/src/unicode/character_length.rs#L70
I added a small unit test to the function.
Note that a number of string-related functions accept only `Utf8` and
`LargeUtf8`. For those we currently rely on coercion rules to cast `Utf8View`
inputs, which won't panic but may be slower than necessary. I think we should
add native `Utf8View` support to them; I'm working on it.
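
To make the mapping in the hunk above concrete, here is a minimal sketch of the return-type choice written as a plain function instead of a macro. The `DataType` enum is a hypothetical stand-in for the Arrow type, `optimal_return_type` is an illustrative name, and the `Utf8View` arm is an assumption inferred from the new comment in the diff (the replacement line itself is not shown above), so treat it as a guess at the intent, not the merged code.

```rust
// Hypothetical stand-in for the few Arrow `DataType` variants discussed here;
// the real enum lives in the `arrow` crate and has many more variants.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    LargeUtf8,
    Utf8View,
    Binary,
    LargeBinary,
    Int32,
    Int64,
    Boolean,
}

/// Picks the output type for a length-like string function.
/// `large_utf8_type` and `utf8_type` play the roles of the macro's
/// `$largeUtf8Type` and `$utf8Type` arguments.
#[allow(dead_code)]
fn optimal_return_type(
    input: &DataType,
    large_utf8_type: DataType,
    utf8_type: DataType,
) -> Result<DataType, String> {
    match input {
        // 64-bit offsets: use the wider result type.
        DataType::LargeUtf8 | DataType::LargeBinary => Ok(large_utf8_type),
        // 32-bit offsets: the narrower result type is enough.
        DataType::Utf8 | DataType::Binary => Ok(utf8_type),
        // ASSUMPTION: a view's length is bounded by u32::MAX, like Utf8,
        // so it is given the same result type as Utf8.
        DataType::Utf8View => Ok(utf8_type),
        // Hypothetical fallback; the real macro's error handling is not
        // shown in the hunk.
        other => Err(format!("unsupported input type: {other:?}")),
    }
}
```

With `Int64` and `Int32` passed as the two result types (as a function like `character_length` might do), a `Utf8View` input would resolve to `Int32` under this assumption, the same as a `Utf8` input.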
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]