cyb70289 commented on a change in pull request #11376:
URL: https://github.com/apache/arrow/pull/11376#discussion_r725765311



##########
File path: cpp/src/arrow/util/utf8.h
##########
@@ -210,44 +210,37 @@ inline bool ValidateUTF8(const uint8_t* data, int64_t size) {
   return ARROW_PREDICT_TRUE(state == internal::kUTF8ValidateAccept);
 }
 
-inline bool ValidateUTF8(const util::string_view& str) {
+static inline bool ValidateUTF8(const util::string_view& str) {
   const uint8_t* data = reinterpret_cast<const uint8_t*>(str.data());
   const size_t length = str.size();
 
   return ValidateUTF8(data, length);
 }
 
-inline bool ValidateAsciiSw(const uint8_t* data, int64_t len) {
+static inline bool ValidateAsciiSw(const uint8_t* data, int64_t len) {

Review comment:
   The original code unrolls the loop manually, expecting to make better use of
   the CPU pipeline. But that prevents the compiler from optimizing the loop
   further through auto-vectorization.
   
   In fact, this simple code performs the same as the SIMD code when built with clang.
   With gcc, however, it is much slower than the SIMD code, so I left the SIMD code untouched.
   - _clang_: SIMD is no better than the naive code
   https://quick-bench.com/q/R5S9gDfyCzxs4pyQO3rLszMLCBI
   - _gcc_: SIMD is faster
   https://quick-bench.com/q/Xes5_3-CGjJbYNikY0E0BWdfVTo
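
   For illustration, the kind of simple loop meant above could look like the
   sketch below. This is an assumption for discussion, not necessarily the exact
   code in this PR: the name `ValidateAsciiNaive` is hypothetical. It ORs all
   bytes together and checks the high bit once at the end, so the loop body has
   no branch and no early exit, which is the shape compilers typically find
   easiest to auto-vectorize.

```cpp
#include <cstdint>

// Hypothetical sketch (not taken from the PR): branch-free ASCII check.
// A byte is ASCII iff its high bit is clear, so OR-ing every byte into one
// accumulator and testing the high bit once gives the answer for the buffer.
inline bool ValidateAsciiNaive(const uint8_t* data, int64_t len) {
  uint8_t orall = 0;
  for (int64_t i = 0; i < len; ++i) {
    orall |= data[i];  // no early exit: keeps the loop trivially vectorizable
  }
  return (orall & 0x80) == 0;
}
```

   The trade-off is that the whole buffer is always scanned, even when the
   first byte is already non-ASCII.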



