mapleFU commented on PR #41690:
URL: https://github.com/apache/arrow/pull/41690#issuecomment-2116627942

   On my M1 Mac (macOS), built with -O3:
   
   Current (this PR):
   ```
   -------------------------------------------------------------------------------------------------
   Benchmark                                       Time             CPU   Iterations UserCounters...
   -------------------------------------------------------------------------------------------------
   ReferenceNaiveBitmapReader/8192             85979 ns        85441 ns         7947 bytes_per_second=182.874M/s
   BitmapReader/8192                           67869 ns        64637 ns        11012 bytes_per_second=241.734M/s
   BitmapUInt64Reader/8192                       678 ns          670 ns      1051114 bytes_per_second=11.386G/s
   BitRunReader/-1                              9423 ns         9399 ns        74456 bytes_per_second=51.95M/s
   BitRunReader/0                                148 ns          148 ns      4539942 bytes_per_second=3.2326G/s
   BitRunReader/10                              1732 ns         1731 ns       407401 bytes_per_second=282.108M/s
   BitRunReader/25                              3471 ns         3470 ns       202346 bytes_per_second=140.726M/s
   BitRunReader/50                              4864 ns         4857 ns       144606 bytes_per_second=100.526M/s
   BitRunReader/60                              4547 ns         4545 ns       153297 bytes_per_second=107.438M/s
   BitRunReader/75                              3610 ns         3607 ns       189480 bytes_per_second=135.387M/s
   BitRunReader/99                               390 ns          390 ns      1796424 bytes_per_second=1.22331G/s
   BitRunReaderLinear/-1                        5802 ns         5796 ns       121120 bytes_per_second=84.2382M/s
   BitRunReaderLinear/0                         2625 ns         2624 ns       265712 bytes_per_second=186.104M/s
   BitRunReaderLinear/10                        3153 ns         3152 ns       221187 bytes_per_second=154.922M/s
   BitRunReaderLinear/25                        3916 ns         3914 ns       167117 bytes_per_second=124.75M/s
   BitRunReaderLinear/50                        4402 ns         4401 ns       153699 bytes_per_second=110.958M/s
   BitRunReaderLinear/60                        4421 ns         4416 ns       158283 bytes_per_second=110.568M/s
   BitRunReaderLinear/75                        3962 ns         3961 ns       176550 bytes_per_second=123.271M/s
   BitRunReaderLinear/99                        3112 ns         3068 ns       234100 bytes_per_second=159.176M/s
   SetBitRunReader/-1                           9961 ns         9956 ns        70916 bytes_per_second=49.0416M/s
   SetBitRunReader/0                            36.0 ns         35.9 ns     19570128 bytes_per_second=13.2686G/s
   SetBitRunReader/10                           1744 ns         1743 ns       398788 bytes_per_second=280.208M/s
   SetBitRunReader/25                           4255 ns         4205 ns       166620 bytes_per_second=116.132M/s
   SetBitRunReader/50                           5598 ns         5575 ns       124327 bytes_per_second=87.5788M/s
   SetBitRunReader/60                           5371 ns         5348 ns       131510 bytes_per_second=91.2951M/s
   SetBitRunReader/75                           4145 ns         4143 ns       168118 bytes_per_second=117.851M/s
   SetBitRunReader/99                            354 ns          353 ns      1978715 bytes_per_second=1.34923G/s
   ReverseSetBitRunReader/-1                    8421 ns         8417 ns        83486 bytes_per_second=58.0115M/s
   ReverseSetBitRunReader/0                     33.4 ns         33.4 ns     21037889 bytes_per_second=14.2938G/s
   ReverseSetBitRunReader/10                    1513 ns         1512 ns       467855 bytes_per_second=322.861M/s
   ReverseSetBitRunReader/25                    3451 ns         3448 ns       202580 bytes_per_second=141.593M/s
   ReverseSetBitRunReader/50                    4649 ns         4617 ns       149863 bytes_per_second=105.76M/s
   ReverseSetBitRunReader/60                    4513 ns         4501 ns       157672 bytes_per_second=108.489M/s
   ReverseSetBitRunReader/75                    3580 ns         3573 ns       194630 bytes_per_second=136.648M/s
   ReverseSetBitRunReader/99                     332 ns          323 ns      2185390 bytes_per_second=1.4759G/s
   VisitBits/8192                              62529 ns        62421 ns        11277 bytes_per_second=250.316M/s
   VisitBitsUnrolled/8192                      10295 ns        10290 ns        66433 bytes_per_second=1.48293G/s
   SetBitsTo/2                                  3.44 ns         3.44 ns    200494362 bytes_per_second=554.859M/s
   SetBitsTo/16                                 6.87 ns         6.87 ns    100227660 bytes_per_second=2.16862G/s
   SetBitsTo/1024                               12.2 ns         12.2 ns     56052465 bytes_per_second=78.161G/s
   SetBitsTo/131072                             2998 ns         2997 ns       274108 bytes_per_second=40.7337G/s
   ReferenceNaiveBitmapWriter/8192            162284 ns       162221 ns         4306 bytes_per_second=48.1595M/s
   BitmapWriter/8192                           63899 ns        63868 ns        10919 bytes_per_second=122.322M/s
   FirstTimeBitmapWriter/8192                  66944 ns        66918 ns        10509 bytes_per_second=116.747M/s
   GenerateBits/8192                           66600 ns        66527 ns        10655 bytes_per_second=117.433M/s
   GenerateBitsUnrolled/8192                   63659 ns        63637 ns        11028 bytes_per_second=122.767M/s
   CopyBitmapWithoutOffset/8192                  113 ns          113 ns      6564880 bytes_per_second=67.325G/s
   CopyBitmapWithOffset/8192                     590 ns          590 ns      1185055 bytes_per_second=12.9281G/s
   CopyBitmapWithOffsetBoth/8192                1427 ns         1426 ns       496930 bytes_per_second=5.35158G/s
   BitmapEqualsWithoutOffset/8192                225 ns          225 ns      2946649 bytes_per_second=33.9253G/s
   BitmapEqualsWithOffset/8192                   927 ns          926 ns       755532 bytes_per_second=8.23875G/s
   BenchmarkBitmapAnd/32768/0                    507 ns          506 ns      1000000 bytes_per_second=60.3348G/s
   BenchmarkBitmapAnd/131072/0                  3777 ns         3775 ns       206341 bytes_per_second=32.3397G/s
   BenchmarkBitmapAnd/32768/1                   4190 ns         4189 ns       167617 bytes_per_second=7.28547G/s
   BenchmarkBitmapAnd/131072/1                 15810 ns        15809 ns        44006 bytes_per_second=7.72164G/s
   BenchmarkBitmapAnd/32768/2                   4186 ns         4184 ns       168228 bytes_per_second=7.29366G/s
   BenchmarkBitmapAnd/131072/2                 16027 ns        16024 ns        44225 bytes_per_second=7.61801G/s
   BenchmarkBitmapVisitBitsetAnd/32768/0      635335 ns       634998 ns         1108 bytes_per_second=49.2127M/s
   BenchmarkBitmapVisitBitsetAnd/131072/0    2509032 ns      2507096 ns          271 bytes_per_second=49.8585M/s
   BenchmarkBitmapVisitBitsetAnd/32768/1      637952 ns       628201 ns         1133 bytes_per_second=49.7452M/s
   BenchmarkBitmapVisitBitsetAnd/131072/1    2481080 ns      2480039 ns          281 bytes_per_second=50.4024M/s
   BenchmarkBitmapVisitBitsetAnd/32768/2      626262 ns       625913 ns         1130 bytes_per_second=49.927M/s
   BenchmarkBitmapVisitBitsetAnd/131072/2    2501350 ns      2500163 ns          282 bytes_per_second=49.9967M/s
   BenchmarkBitmapVisitUInt8And/32768/0        19113 ns        19104 ns        36615 bytes_per_second=1.59748G/s
   BenchmarkBitmapVisitUInt8And/131072/0       77321 ns        77252 ns         8394 bytes_per_second=1.58016G/s
   BenchmarkBitmapVisitUInt8And/32768/1        25639 ns        25628 ns        27275 bytes_per_second=1.19078G/s
   BenchmarkBitmapVisitUInt8And/131072/1      106034 ns       105291 ns         6568 bytes_per_second=1.15936G/s
   BenchmarkBitmapVisitUInt8And/32768/2        25685 ns        25676 ns        27225 bytes_per_second=1.18855G/s
   BenchmarkBitmapVisitUInt8And/131072/2      102330 ns       102292 ns         6874 bytes_per_second=1.19335G/s
   BenchmarkBitmapVisitUInt64And/32768/0        1985 ns         1983 ns       356961 bytes_per_second=15.3931G/s
   BenchmarkBitmapVisitUInt64And/131072/0       7849 ns         7845 ns        87993 bytes_per_second=15.5596G/s
   BenchmarkBitmapVisitUInt64And/32768/1        4072 ns         4066 ns       172934 bytes_per_second=7.50486G/s
   BenchmarkBitmapVisitUInt64And/131072/1      14624 ns        14617 ns        47750 bytes_per_second=8.3513G/s
   BenchmarkBitmapVisitUInt64And/32768/2        4046 ns         4044 ns       173114 bytes_per_second=7.54624G/s
   BenchmarkBitmapVisitUInt64And/131072/2      14608 ns        14603 ns        47745 bytes_per_second=8.35953G/s
   ```
   
   Before:
   
   ```
   BitRunReader/-1                              9851 ns         9778 ns        72564 bytes_per_second=49.9364M/s
   BitRunReader/0                                151 ns          150 ns      4713773 bytes_per_second=3.18241G/s
   BitRunReader/10                              1862 ns         1810 ns       399247 bytes_per_second=269.696M/s
   BitRunReader/25                              3558 ns         3552 ns       197438 bytes_per_second=137.484M/s
   BitRunReader/50                              5051 ns         5028 ns       140890 bytes_per_second=97.1205M/s
   BitRunReader/60                              4731 ns         4695 ns       147650 bytes_per_second=104.005M/s
   BitRunReader/75                              3779 ns         3761 ns       188246 bytes_per_second=129.834M/s
   BitRunReader/99                               400 ns          399 ns      1750416 bytes_per_second=1.19611G/s
   BitRunReaderLinear/-1                        5959 ns         5921 ns       118993 bytes_per_second=82.4684M/s
   BitRunReaderLinear/0                         2868 ns         2817 ns       251639 bytes_per_second=173.309M/s
   BitRunReaderLinear/10                        3318 ns         3299 ns       210512 bytes_per_second=148.011M/s
   BitRunReaderLinear/25                        4238 ns         4160 ns       171548 bytes_per_second=117.381M/s
   BitRunReaderLinear/50                        4799 ns         4765 ns       146327 bytes_per_second=102.473M/s
   BitRunReaderLinear/60                        6237 ns         5137 ns       148266 bytes_per_second=95.0566M/s
   BitRunReaderLinear/75                        4166 ns         4124 ns       163251 bytes_per_second=118.391M/s
   BitRunReaderLinear/99                        3194 ns         3117 ns       225191 bytes_per_second=156.63M/s
   SetBitRunReader/-1                          10720 ns        10406 ns        67369 bytes_per_second=46.9242M/s
   SetBitRunReader/0                            38.7 ns         37.2 ns     18975949 bytes_per_second=12.8012G/s
   SetBitRunReader/10                           1864 ns         1801 ns       388658 bytes_per_second=271.055M/s
   SetBitRunReader/25                           4253 ns         4179 ns       168497 bytes_per_second=116.832M/s
   SetBitRunReader/50                           5745 ns         5646 ns       123732 bytes_per_second=86.49M/s
   SetBitRunReader/60                           5588 ns         5401 ns       128778 bytes_per_second=90.401M/s
   SetBitRunReader/75                           4576 ns         4372 ns       161457 bytes_per_second=111.672M/s
   SetBitRunReader/99                            384 ns          372 ns      1874535 bytes_per_second=1.28167G/s
   ReverseSetBitRunReader/-1                    9195 ns         8838 ns        78374 bytes_per_second=55.2473M/s
   ReverseSetBitRunReader/0                     38.0 ns         35.7 ns     20435510 bytes_per_second=13.3507G/s
   ReverseSetBitRunReader/10                    1715 ns         1609 ns       415295 bytes_per_second=303.425M/s
   ReverseSetBitRunReader/25                    4044 ns         3728 ns       188133 bytes_per_second=130.966M/s
   ReverseSetBitRunReader/50                    5226 ns         4945 ns       140588 bytes_per_second=98.7392M/s
   ReverseSetBitRunReader/60                    4861 ns         4712 ns       148026 bytes_per_second=103.635M/s
   ReverseSetBitRunReader/75                    3791 ns         3745 ns       180838 bytes_per_second=130.373M/s
   ReverseSetBitRunReader/99                     336 ns          332 ns      2092769 bytes_per_second=1.43647G/s
   VisitBits/8192                              68772 ns        65844 ns        10642 bytes_per_second=237.304M/s
   VisitBitsUnrolled/8192                      10995 ns        10838 ns        64390 bytes_per_second=1.40794G/s
   SetBitsTo/2                                  3.71 ns         3.64 ns    194128446 bytes_per_second=523.67M/s
   SetBitsTo/16                                 7.23 ns         7.18 ns     95434157 bytes_per_second=2.07555G/s
   SetBitsTo/1024                               13.3 ns         12.9 ns     53696217 bytes_per_second=74.0268G/s
   SetBitsTo/131072                             2284 ns         2254 ns       311704 bytes_per_second=54.1618G/s
   ReferenceNaiveBitmapWriter/8192            179032 ns       170889 ns         4023 bytes_per_second=45.7167M/s
   BitmapWriter/8192                           66093 ns        65720 ns        10579 bytes_per_second=118.876M/s
   FirstTimeBitmapWriter/8192                  68766 ns        68523 ns        10242 bytes_per_second=114.013M/s
   GenerateBits/8192                           67005 ns        66774 ns        10463 bytes_per_second=116.999M/s
   GenerateBitsUnrolled/8192                   64922 ns        64678 ns        10681 bytes_per_second=120.791M/s
   CopyBitmapWithoutOffset/8192                  123 ns          115 ns      6582783 bytes_per_second=66.6195G/s
   CopyBitmapWithOffset/8192                     611 ns          606 ns      1169747 bytes_per_second=12.5907G/s
   CopyBitmapWithOffsetBoth/8192                1475 ns         1471 ns       465045 bytes_per_second=5.18633G/s
   BitmapEqualsWithoutOffset/8192                232 ns          231 ns      3036244 bytes_per_second=33.0157G/s
   BitmapEqualsWithOffset/8192                   962 ns          960 ns       736214 bytes_per_second=7.94465G/s
   BenchmarkBitmapAnd/32768/0                    506 ns          506 ns      1371554 bytes_per_second=60.3483G/s
   BenchmarkBitmapAnd/131072/0                  3266 ns         3265 ns       203884 bytes_per_second=37.3864G/s
   BenchmarkBitmapAnd/32768/1                   4380 ns         4355 ns       164819 bytes_per_second=7.0071G/s
   BenchmarkBitmapAnd/131072/1                 16415 ns        16362 ns        43104 bytes_per_second=7.46055G/s
   BenchmarkBitmapAnd/32768/2                   4519 ns         4369 ns       163878 bytes_per_second=6.98446G/s
   BenchmarkBitmapAnd/131072/2                 18100 ns        17124 ns        40973 bytes_per_second=7.1284G/s
   BenchmarkBitmapVisitBitsetAnd/32768/0      675592 ns       660185 ns         1015 bytes_per_second=47.3352M/s
   BenchmarkBitmapVisitBitsetAnd/131072/0    3438938 ns      2690124 ns          267 bytes_per_second=46.4663M/s
   BenchmarkBitmapVisitBitsetAnd/32768/1      718345 ns       665442 ns         1010 bytes_per_second=46.9613M/s
   BenchmarkBitmapVisitBitsetAnd/131072/1    2643298 ns      2616118 ns          263 bytes_per_second=47.7807M/s
   BenchmarkBitmapVisitBitsetAnd/32768/2      687543 ns       659680 ns         1075 bytes_per_second=47.3715M/s
   BenchmarkBitmapVisitBitsetAnd/131072/2    2743586 ns      2629706 ns          265 bytes_per_second=47.5338M/s
   BenchmarkBitmapVisitUInt8And/32768/0        23392 ns        21112 ns        34389 bytes_per_second=1.44551G/s
   BenchmarkBitmapVisitUInt8And/131072/0      112039 ns        87872 ns         7955 bytes_per_second=1.38919G/s
   BenchmarkBitmapVisitUInt8And/32768/1        31099 ns        28303 ns        24534 bytes_per_second=1104.11M/s
   BenchmarkBitmapVisitUInt8And/131072/1      119082 ns       110789 ns         6168 bytes_per_second=1.10182G/s
   BenchmarkBitmapVisitUInt8And/32768/2        27608 ns        27186 ns        25909 bytes_per_second=1.12254G/s
   BenchmarkBitmapVisitUInt8And/131072/2      110701 ns       108126 ns         6377 bytes_per_second=1.12896G/s
   BenchmarkBitmapVisitUInt64And/32768/0        2084 ns         2075 ns       332810 bytes_per_second=14.7093G/s
   BenchmarkBitmapVisitUInt64And/131072/0       9174 ns         8683 ns        82603 bytes_per_second=14.0586G/s
   BenchmarkBitmapVisitUInt64And/32768/1        4329 ns         4275 ns       163332 bytes_per_second=7.13794G/s
   BenchmarkBitmapVisitUInt64And/131072/1      15776 ns        15315 ns        46095 bytes_per_second=7.97043G/s
   BenchmarkBitmapVisitUInt64And/32768/2        4325 ns         4256 ns       166594 bytes_per_second=7.17106G/s
   BenchmarkBitmapVisitUInt64And/131072/2      15706 ns        15396 ns        46283 bytes_per_second=7.92854G/s
   ```

