pitrou commented on a change in pull request #7357:
URL: https://github.com/apache/arrow/pull/7357#discussion_r438707793
##########
File path: cpp/src/arrow/compute/kernels/scalar_string.cc
##########
@@ -48,6 +48,16 @@ struct AsciiUpper {
}
};
+struct AsciiLower {
+ template <typename... Ignored>
+ static std::string Call(KernelContext*, const util::string_view& val) {
+ std::string result = val.to_string();
+ std::transform(result.begin(), result.end(), result.begin(),
+ [](unsigned char c) { return std::tolower(c); });
Review comment:
Also, please fix `AsciiUpper` similarly (it should not rely on the locale-dependent `std::toupper` either).
##########
File path: cpp/src/arrow/compute/kernels/scalar_string.cc
##########
@@ -48,6 +48,16 @@ struct AsciiUpper {
}
};
+struct AsciiLower {
+ template <typename... Ignored>
+ static std::string Call(KernelContext*, const util::string_view& val) {
+ std::string result = val.to_string();
+ std::transform(result.begin(), result.end(), result.begin(),
+ [](unsigned char c) { return std::tolower(c); });
Review comment:
Please don't use `std::tolower`, as its behaviour depends on the current locale.
Instead, just create a hardcoded lookup table.
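For illustration only, here is a minimal sketch of what such a table could look like. The names `MakeAsciiLowerTable`, `AsciiToLower` and `AsciiLowerInPlace` are hypothetical, not existing Arrow helpers, and the table is generated once at first use rather than spelled out literally:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not Arrow code): builds a 256-entry table that maps
// 'A'..'Z' to 'a'..'z' and leaves every other byte value unchanged.
inline std::array<uint8_t, 256> MakeAsciiLowerTable() {
  std::array<uint8_t, 256> table{};
  for (int i = 0; i < 256; ++i) {
    const bool is_upper = (i >= 'A' && i <= 'Z');
    table[static_cast<std::size_t>(i)] =
        static_cast<uint8_t>(is_upper ? i + ('a' - 'A') : i);
  }
  return table;
}

// Locale-independent lowercase of a single byte via table lookup.
inline uint8_t AsciiToLower(uint8_t c) {
  static const std::array<uint8_t, 256> kTable = MakeAsciiLowerTable();
  return kTable[c];
}

// Lowercases ASCII letters in place; bytes >= 0x80 (e.g. UTF-8 sequences)
// are left untouched.
inline void AsciiLowerInPlace(std::string* s) {
  for (char& ch : *s) {
    ch = static_cast<char>(AsciiToLower(static_cast<uint8_t>(ch)));
  }
}
```

The same approach would apply to an upper-casing table, with the mapping reversed.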
##########
File path: cpp/src/arrow/compute/kernels/scalar_string.cc
##########
@@ -48,6 +48,16 @@ struct AsciiUpper {
}
};
+struct AsciiLower {
+ template <typename... Ignored>
+ static std::string Call(KernelContext*, const util::string_view& val) {
+ std::string result = val.to_string();
Review comment:
This is inefficient: every value is copied into a temporary `std::string`
before being transformed. We should write directly into the allocated output
array. The way this is architected needs rethinking.
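As a rough sketch of the idea, assuming a simplified offsets-plus-data layout: `StringColumn`, `AsciiLowerByte` and `AsciiLowerColumn` below are hypothetical stand-ins, not Arrow's actual classes or kernel interface. Since ASCII case conversion never changes the length of a value, the output data buffer can be allocated once at the input's size and filled in a single pass, with the offsets reused as-is:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a string array's storage: one contiguous data
// buffer plus offsets delimiting each value. Not Arrow's actual classes.
struct StringColumn {
  std::vector<int32_t> offsets;  // size == number of values + 1
  std::vector<uint8_t> data;     // concatenated bytes of all values
};

// Locale-independent lowercase of one byte (arithmetic form, for brevity).
inline uint8_t AsciiLowerByte(uint8_t c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<uint8_t>(c + ('a' - 'A')) : c;
}

// Writes the lowercased bytes straight into the preallocated output buffer,
// with no per-value temporary strings.
inline StringColumn AsciiLowerColumn(const StringColumn& input) {
  StringColumn output;
  output.offsets = input.offsets;          // value lengths are unchanged
  output.data.resize(input.data.size());   // single allocation for all values
  for (std::size_t i = 0; i < input.data.size(); ++i) {
    output.data[i] = AsciiLowerByte(input.data[i]);
  }
  return output;
}
```

How this maps onto the kernel's actual output allocation is exactly the part that needs rethinking; the sketch only shows the single-pass, single-allocation shape.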
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]