cyb70289 commented on a change in pull request #11674:
URL: https://github.com/apache/arrow/pull/11674#discussion_r748054816



##########
File path: cpp/src/arrow/util/bitmap_ops.cc
##########
@@ -249,59 +249,92 @@ void AlignedBitmapOp(const uint8_t* left, int64_t 
left_offset, const uint8_t* ri
   left += left_offset / 8;
   right += right_offset / 8;
   out += out_offset / 8;
+  uint64_t outPopCount = 0;
   for (int64_t i = 0; i < nbytes; ++i) {
     out[i] = op(left[i], right[i]);
+    if (ComputeNewValidityCount) {
+      outPopCount += BitUtil::kBytePopcount[out[i]];

Review comment:
       > Here is the godbolt snippet with vectorization flags showing same type 
of vectorization happening with pop-count and without. 
https://godbolt.org/z/bGhdGEjr6 But I will check with gcc/msvc too
   
   In your test code, line 34~43.
   The problem is that `v` is calculated but never used. So compiler will 
simply delete all the code calculating it.
   Change the function to let it return `v`, the generated code will be totally 
different.
   
   ```cpp
   void WithPopcount(const uint8_t* a, const uint8_t* b, const int N) {
     std::vector<uint8_t> c;
     c.reserve(N);
   
     unsigned int v = 0;
     for (int i = 0; i < N; ++i) {
       c[i] = a[i] & b[i];
       v += table[c[i]];
     }
   }
   ```
   
   > Yeah, M/s is my mistake. Those are numbers from debug build I did, I 
forgot to switch to release. With release, the numbers are much higher. I shall 
update the description with new numbers soon and with benchmark labels.
   
   No worry, it happens.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to