HaoYang670 commented on issue #1531:
URL: https://github.com/apache/arrow-rs/issues/1531#issuecomment-1098747941
We can construct a case where `substring by byte checked` is the best
choice:
Suppose we want to truncate every string to the **byte** length of the
shortest string in the array. We could do this with the `length` kernel
and the `aggregation` kernel:
```rust
substring(string_array, start = 0, length = Some(min(length(string_array))))
```
Now take an input `string_array` like this:
```rust
let input = [
    "14 bytes 🗻",
    "This string has 24 bytes",
    "🗻 13 bytes"
];
```
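The shortest string is 13 bytes long, so every string is cut after byte 13.
If that cut falls inside a multi-byte character, such as the 4-byte `🗻` at
the end of the first string, `substring by byte unchecked` returns a string
array that is not valid UTF-8, which is dangerous.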
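The difference can be sketched in plain Rust, without the arrow kernels. The helper names below (`min_byte_len`, `substring_checked`, `substring_bytes`) are hypothetical and are not part of the arrow-rs API; they only illustrate what a checked byte substring would reject and what an unchecked one would let through.

```rust
/// Shortest byte length among the inputs (0 for an empty slice).
/// Hypothetical stand-in for `min(length(string_array))`.
fn min_byte_len(strings: &[&str]) -> usize {
    strings.iter().map(|s| s.len()).min().unwrap_or(0)
}

/// Checked byte substring (hypothetical helper): returns `None` when the
/// range is out of bounds or does not fall on UTF-8 character boundaries.
fn substring_checked(s: &str, start: usize, len: usize) -> Option<&str> {
    let end = start.checked_add(len)?;
    s.get(start..end)
}

/// Raw bytes an unchecked byte substring would emit (hypothetical helper).
/// No UTF-8 validation is done; panics if the range is out of bounds.
fn substring_bytes(s: &str, start: usize, len: usize) -> &[u8] {
    &s.as_bytes()[start..start + len]
}
```

For example, with the illustrative inputs `"ab🗻"` (6 bytes) and `"abcd"` (4 bytes), the minimum byte length is 4. `substring_checked("ab🗻", 0, 4)` returns `None` because byte 4 lies inside the emoji, while `std::str::from_utf8(substring_bytes("ab🗻", 0, 4))` returns an error, which is the invalid data an unchecked kernel would store.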