cyb70289 commented on a change in pull request #11674:
URL: https://github.com/apache/arrow/pull/11674#discussion_r748054816
##########
File path: cpp/src/arrow/util/bitmap_ops.cc
##########
@@ -249,59 +249,92 @@ void AlignedBitmapOp(const uint8_t* left, int64_t
left_offset, const uint8_t* ri
left += left_offset / 8;
right += right_offset / 8;
out += out_offset / 8;
+ uint64_t outPopCount = 0;
for (int64_t i = 0; i < nbytes; ++i) {
out[i] = op(left[i], right[i]);
+ if (ComputeNewValidityCount) {
+ outPopCount += BitUtil::kBytePopcount[out[i]];
Review comment:
> Here is the godbolt snippet with vectorization flags showing same type
of vectorization happening with pop-count and without.
https://godbolt.org/z/bGhdGEjr6 But I will check with gcc/msvc too
In your test code, line 34~43.
The problem is that `v` is calculated but never used. So compiler will
simply delete all the code calculating it.
Change the function to let it return `v`, the generated code will be totally
different.
```cpp
void WithPopcount(const uint8_t* a, const uint8_t* b, const int N) {
std::vector<uint8_t> c;
c.reserve(N);
unsigned int v = 0;
for (int i = 0; i < N; ++i) {
c[i] = a[i] & b[i];
v += table[c[i]];
}
}
```
> Yeah, M/s is my mistake. Those are numbers from debug build I did, I
forgot to switch to release. With release, the numbers are much higher. I shall
update the description with new numbers soon and with benchmark labels.
No worry, it happens.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]