Mike Seddon created ARROW-11434:
-----------------------------------
Summary: Length kernel returns bytes not character length
Key: ARROW-11434
URL: https://issues.apache.org/jira/browse/ARROW-11434
Project: Apache Arrow
Issue Type: Bug
Components: Rust, Rust - DataFusion
Reporter: Mike Seddon
Assignee: Mike Seddon
The Rust `length` kernel currently counts bytes (octets) rather than
characters. Because Arrow encodes strings as UTF-8, the two counts differ for
any string that contains non-ASCII characters.
This means that the `length` kernel returns 5 (bytes) for a string like `josé`
rather than 4 (characters).
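
The difference can be reproduced with the Rust standard library alone. The sketch below is only an illustration and is not the Arrow kernel itself: `byte_length` and `char_length` are hypothetical helper names, and "character" is assumed here to mean a Unicode scalar value (what `str::chars` yields), not a grapheme cluster.

```rust
/// Hypothetical helper: number of UTF-8 bytes (octets) in the string.
/// This matches the byte-counting behaviour described in this issue.
fn byte_length(s: &str) -> usize {
    // `str::len` returns the length in bytes, not in characters.
    s.len()
}

/// Hypothetical helper: number of Unicode scalar values in the string.
/// Assumption: this is the "character length" the issue asks for.
fn char_length(s: &str) -> usize {
    // `str::chars` iterates over Unicode scalar values (`char`).
    s.chars().count()
}
```

With these helpers, `byte_length("josé")` is 5 because `é` takes two bytes in UTF-8, while `char_length("josé")` is 4. For pure ASCII input the two functions agree.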
--
This message was sent by Atlassian Jira
(v8.3.4#803005)