This is an automated email from the ASF dual-hosted git repository.
yibocai pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 56e6caf07d ARROW-17305: [C++] Avoid spending time in popcount in
BitmapAnd benchmark (#13794)
56e6caf07d is described below
commit 56e6caf07d77a4d4c79a20c558c2618efe7de830
Author: Antoine Pitrou <[email protected]>
AuthorDate: Fri Aug 5 05:00:46 2022 +0200
ARROW-17305: [C++] Avoid spending time in popcount in BitmapAnd benchmark
(#13794)
The popcount (`CountSetBits`) call inside the benchmark loop was artificially limiting the reported performance of BitmapAnd.
Before:
```
--------------------------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
BenchmarkBitmapAnd/32768/0       1708 ns         1708 ns       408579 bytes_per_second=17.8726G/s
BenchmarkBitmapAnd/131072/0      6968 ns         6965 ns       102223 bytes_per_second=17.5262G/s
BenchmarkBitmapAnd/32768/1       3982 ns         3981 ns       175136 bytes_per_second=7.66574G/s
BenchmarkBitmapAnd/131072/1     15574 ns        15569 ns        44988 bytes_per_second=7.8404G/s
BenchmarkBitmapAnd/32768/2       3999 ns         3998 ns       175021 bytes_per_second=7.63248G/s
BenchmarkBitmapAnd/131072/2     15589 ns        15585 ns        44844 bytes_per_second=7.83234G/s
```
After:
```
--------------------------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
BenchmarkBitmapAnd/32768/0        732 ns          732 ns       967465 bytes_per_second=41.6736G/s
BenchmarkBitmapAnd/131072/0      3105 ns         3105 ns       229726 bytes_per_second=39.3198G/s
BenchmarkBitmapAnd/32768/1       2913 ns         2913 ns       240233 bytes_per_second=10.4774G/s
BenchmarkBitmapAnd/131072/1     11528 ns        11526 ns        60865 bytes_per_second=10.5912G/s
BenchmarkBitmapAnd/32768/2       2924 ns         2924 ns       236873 bytes_per_second=10.4378G/s
BenchmarkBitmapAnd/131072/2     11552 ns        11550 ns        60619 bytes_per_second=10.5691G/s
```
(I didn't check, but the compiler here probably auto-vectorizes the aligned
code path)
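To illustrate the aligned versus unaligned distinction, here is a minimal sketch, not Arrow's implementation. It assumes the last benchmark argument is a bit offset into the input bitmaps (the message does not say so explicitly) and that bits are stored least-significant-first within 64-bit words. All function names are hypothetical. The aligned loop is a plain element-wise AND, which compilers can typically vectorize; the offset variant has to stitch each input word together from two neighbours first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical aligned path: a plain element-wise AND over whole words.
inline void AndAligned(const std::uint64_t* a, const std::uint64_t* b,
                       std::uint64_t* out, std::size_t n_words) {
  for (std::size_t i = 0; i < n_words; ++i) {
    out[i] = a[i] & b[i];
  }
}

// Load 64 bits starting `bit_offset` bits into words[i] (LSB-first layout).
// Requires words[i + 1] to be readable when bit_offset != 0.
inline std::uint64_t LoadWordAtBitOffset(const std::uint64_t* words,
                                         std::size_t i, unsigned bit_offset) {
  assert(bit_offset < 64);
  if (bit_offset == 0) return words[i];
  return (words[i] >> bit_offset) | (words[i + 1] << (64 - bit_offset));
}

// Hypothetical unaligned path: both inputs start at the same bit offset,
// the output is written word-aligned. Each input needs n_words + 1 words.
inline void AndWithOffset(const std::uint64_t* a, const std::uint64_t* b,
                          std::uint64_t* out, std::size_t n_words,
                          unsigned bit_offset) {
  for (std::size_t i = 0; i < n_words; ++i) {
    out[i] = LoadWordAtBitOffset(a, i, bit_offset) &
             LoadWordAtBitOffset(b, i, bit_offset);
  }
}
```

The extra shifts and the dependence on two adjacent words per load are one plausible reason the offset cases report lower throughput than the aligned case, though the message itself does not confirm the cause.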
Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Yibo Cai <[email protected]>
---
cpp/src/arrow/util/bit_util_benchmark.cc | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/cpp/src/arrow/util/bit_util_benchmark.cc b/cpp/src/arrow/util/bit_util_benchmark.cc
index 8e95d01462..3bcb4ceea6 100644
--- a/cpp/src/arrow/util/bit_util_benchmark.cc
+++ b/cpp/src/arrow/util/bit_util_benchmark.cc
@@ -150,9 +150,7 @@ static void BenchmarkAndImpl(benchmark::State& state, DoAnd&& do_and) {
for (auto _ : state) {
do_and({bitmap_1, bitmap_2}, &bitmap_3);
- auto total =
- internal::CountSetBits(bitmap_3.data(), bitmap_3.offset(), bitmap_3.length());
- benchmark::DoNotOptimize(total);
+ benchmark::ClobberMemory();
}
state.SetBytesProcessed(state.iterations() * nbytes);
}
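
The change above swaps a result-consuming popcount for a memory barrier. The following standalone sketch shows the two loop-body shapes without depending on Google Benchmark or Arrow. Every name in it is hypothetical, and `ClobberMemorySketch` is only an approximation built on `std::atomic_signal_fence`; the real `benchmark::ClobberMemory()` uses compiler-specific mechanisms.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for benchmark::ClobberMemory(): a compiler-level
// fence that stops the optimizer from reordering memory accesses across it.
inline void ClobberMemorySketch() {
  std::atomic_signal_fence(std::memory_order_acq_rel);
}

// Hypothetical stand-in for the operation under test: word-wise AND.
inline void AndWords(const std::vector<std::uint64_t>& a,
                     const std::vector<std::uint64_t>& b,
                     std::vector<std::uint64_t>* out) {
  assert(a.size() == b.size() && out->size() == a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    (*out)[i] = a[i] & b[i];
  }
}

// Portable popcount over all words (stand-in for internal::CountSetBits).
inline std::uint64_t CountSetBitsSketch(const std::vector<std::uint64_t>& words) {
  std::uint64_t total = 0;
  for (std::uint64_t w : words) {
    while (w != 0) {
      w &= w - 1;  // clear the lowest set bit
      ++total;
    }
  }
  return total;
}

// Old loop-body shape: the AND, then a second full pass over the output
// whose only purpose is to make the result observable.
inline std::uint64_t IterationWithPopcount(const std::vector<std::uint64_t>& a,
                                           const std::vector<std::uint64_t>& b,
                                           std::vector<std::uint64_t>* out) {
  AndWords(a, b, out);
  return CountSetBitsSketch(*out);
}

// New loop-body shape: the AND, then a barrier with no extra pass.
inline void IterationWithClobber(const std::vector<std::uint64_t>& a,
                                 const std::vector<std::uint64_t>& b,
                                 std::vector<std::uint64_t>* out) {
  AndWords(a, b, out);
  ClobberMemorySketch();
}
```

In the old shape every timed iteration pays for a second traversal of the output, so the reported `bytes_per_second` reflects AND plus popcount rather than AND alone. The barrier keeps the writes from being treated as dead without adding measurable work of its own.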