chaokunyang commented on code in PR #2033:
URL: https://github.com/apache/fury/pull/2033#discussion_r1933263849
##########
cpp/fury/util/string_util.h:
##########
@@ -204,9 +163,15 @@ inline bool isAscii(const std::string &str) {
}
inline bool isLatin1(const std::u16string &str) {
- const std::uint16_t *data =
- reinterpret_cast<const std::uint16_t *>(str.data());
- return isLatin1(data, str.size());
+ // Try the conversion directly, and all characters are considered Latin1 if
+ // they are successfully converted
+ size_t latin1_len = simdutf::latin1_length_from_utf16(str.size());
Review Comment:
The current implementation can return early if it find a non Latin
character. But the simdutf need to iterate the whole string . Maybe we should
continue to optimize the current implementation . Or does simdutf have a
similar function?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]