HaoYang670 commented on issue #1531: URL: https://github.com/apache/arrow-rs/issues/1531#issuecomment-1098632899
Hi @viirya. In general:

Speed: `substring by byte unchecked` >= `substring by byte checked` >> `substring by char`

Safety: `substring by char` > `substring by byte checked` > `substring by byte unchecked`. (If the input array contains an invalid UTF-8 string, `substring by byte checked` might not detect it, but `substring by char` will always detect it.)

User-friendliness: `substring by char` > `substring by byte checked` > `substring by byte unchecked`

That's why we need 3 versions of `substring`.
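To make the trade-off concrete, here is a minimal sketch on a single `&str`, using only the Rust standard library. The three helper functions are hypothetical names for illustration and are not the arrow-rs kernels; they only mirror the idea of slicing by raw bytes, slicing by bytes with a char-boundary check, and slicing by characters.

```rust
// Hypothetical helpers for illustration only; these are NOT the arrow-rs kernels.

/// By byte, unchecked: slices the raw bytes with no char-boundary check,
/// so the result may not be valid UTF-8. Panics if the range is out of bounds.
fn substring_by_byte_unchecked(s: &str, start: usize, len: usize) -> &[u8] {
    &s.as_bytes()[start..start + len]
}

/// By byte, checked: returns `None` if the range is out of bounds or
/// either end falls inside a multi-byte character.
fn substring_by_byte_checked(s: &str, start: usize, len: usize) -> Option<&str> {
    s.get(start..start + len)
}

/// By char: `start` and `len` count characters, so the result is always
/// valid UTF-8, at the cost of walking the string.
fn substring_by_char(s: &str, start: usize, len: usize) -> String {
    s.chars().skip(start).take(len).collect()
}

fn main() {
    // 'é' takes two bytes in UTF-8, so "héllo" is 5 chars but 6 bytes.
    let s = "héllo";

    // Cuts 'é' in half: fast, but the bytes are not valid UTF-8.
    let raw = substring_by_byte_unchecked(s, 0, 2);
    println!("unchecked valid utf8: {}", std::str::from_utf8(raw).is_ok());

    // The same range is rejected by the checked version.
    println!("checked: {:?}", substring_by_byte_checked(s, 0, 2));

    // The char-based version takes the first two characters.
    println!("by char: {:?}", substring_by_char(s, 0, 2));
}
```

The unchecked variant does the least work, the checked variant adds a boundary test at each end of the range, and the char variant has to decode the string from the beginning, which matches the speed ordering above.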
