cyb70289 commented on a change in pull request #11674:
URL: https://github.com/apache/arrow/pull/11674#discussion_r747979858
##########
File path: cpp/src/arrow/util/bitmap_ops.cc
##########
@@ -249,59 +249,92 @@ void AlignedBitmapOp(const uint8_t* left, int64_t left_offset, const uint8_t* ri
   left += left_offset / 8;
   right += right_offset / 8;
   out += out_offset / 8;
+  uint64_t outPopCount = 0;
   for (int64_t i = 0; i < nbytes; ++i) {
     out[i] = op(left[i], right[i]);
+    if (ComputeNewValidityCount) {
+      outPopCount += BitUtil::kBytePopcount[out[i]];
Review comment:
This extra line may prevent the compiler from vectorizing the loop and cause a
big performance regression. [1]
But your benchmark results show a difference of at most 20%.
Would you list the test names in your benchmark result graph? The
BitmapAnd-related benchmarks run at about 9G/s on my test machine (which is no
faster than most desktop PCs), while your results look like about 400M/s.
[1] https://quick-bench.com/q/ko7aKpF3R2N8G2yqN3g5mUKgOPg
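If the extra line does turn out to block vectorization, one possible workaround is to leave the op loop untouched and count bits in a separate pass over the output. This is only a sketch: `AndThenCount` is a hypothetical name, not an Arrow API, it hard-codes `&` in place of the templated `op`, and I have not benchmarked it against the fused version.

```cpp
#include <bitset>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Arrow): AND two byte-aligned bitmaps and
// return the number of set bits in the result.
inline uint64_t AndThenCount(const uint8_t* left, const uint8_t* right,
                             uint8_t* out, int64_t nbytes) {
  // Pass 1: the same body as the original loop, with no accumulator in it,
  // so the compiler is free to vectorize it as before.
  for (int64_t i = 0; i < nbytes; ++i) {
    out[i] = left[i] & right[i];
  }

  // Pass 2: count set bits eight bytes at a time.
  uint64_t count = 0;
  int64_t i = 0;
  for (; i + 8 <= nbytes; i += 8) {
    uint64_t word;
    std::memcpy(&word, out + i, sizeof(word));  // safe unaligned load
    count += std::bitset<64>(word).count();
  }
  // Trailing bytes that do not fill a whole 64-bit word.
  for (; i < nbytes; ++i) {
    count += std::bitset<8>(out[i]).count();
  }
  return count;
}
```

The second pass reads `out` again, so it costs extra memory traffic. Whether that is cheaper than the table lookup inside the loop would need to be measured.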
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]