cyb70289 commented on pull request #7135:
URL: https://github.com/apache/arrow/pull/7135#issuecomment-626133175


   Benchmark results of `release/arrow-bit-util-benchmark` on AMD EPYC 7251, gcc-7.5
   
   Before this patch
   ```
   -------------------------------------------------------------------------------------------------
   Benchmark                                       Time             CPU   Iterations UserCounters...
   -------------------------------------------------------------------------------------------------
   BitmapReader/8192                          131771 ns       131691 ns         5350 bytes_per_second=118.649M/s
   VisitBits/8192                             126231 ns       126187 ns         5545 bytes_per_second=123.825M/s
   VisitBitsUnrolled/8192                      46107 ns        46085 ns        15374 bytes_per_second=339.045M/s
   BitmapWriter/8192                           89468 ns        89424 ns         7154 bytes_per_second=87.365M/s
   FirstTimeBitmapWriter/8192                  78184 ns        78144 ns         8994 bytes_per_second=99.9755M/s
   GenerateBits/8192                           84917 ns        84880 ns         8358 bytes_per_second=92.0416M/s
   GenerateBitsUnrolled/8192                   42952 ns        42933 ns        16229 bytes_per_second=181.971M/s
   CopyBitmapWithoutOffset/8192                  191 ns          191 ns      3612121 bytes_per_second=39.9542G/s
   CopyBitmapWithOffset/8192                    8654 ns         8651 ns        81364 bytes_per_second=903.041M/s
   BenchmarkBitmapAnd/32768/0                   4353 ns         4350 ns       161617 bytes_per_second=7.01486G/s
   BenchmarkBitmapAnd/131072/0                 16896 ns        16887 ns        41278 bytes_per_second=7.22851G/s
   BenchmarkBitmapAnd/32768/0                   4268 ns         4267 ns       162600 bytes_per_second=7.1523G/s
   BenchmarkBitmapAnd/131072/0                 17009 ns        16998 ns        41089 bytes_per_second=7.18163G/s

   [XXXXXXXXX: unaligned bitmap operation]
   BenchmarkBitmapAnd/32768/1                 552753 ns       552530 ns         1271 bytes_per_second=56.558M/s
   BenchmarkBitmapAnd/131072/1               2230810 ns      2229574 ns          316 bytes_per_second=56.0645M/s
   BenchmarkBitmapAnd/32768/2                 565404 ns       565094 ns         1266 bytes_per_second=55.3005M/s
   BenchmarkBitmapAnd/131072/2               2215504 ns      2214782 ns          303 bytes_per_second=56.439M/s

   BenchmarkBitmapVisitBitsetAnd/32768/0     1493021 ns      1492271 ns          469 bytes_per_second=20.9412M/s
   BenchmarkBitmapVisitBitsetAnd/131072/0    5974576 ns      5972560 ns          117 bytes_per_second=20.929M/s
   BenchmarkBitmapVisitBitsetAnd/32768/0     1492997 ns      1492431 ns          469 bytes_per_second=20.939M/s
   BenchmarkBitmapVisitBitsetAnd/131072/0    5977490 ns      5975855 ns          117 bytes_per_second=20.9175M/s
   BenchmarkBitmapVisitBitsetAnd/32768/1     1493233 ns      1492718 ns          469 bytes_per_second=20.935M/s
   BenchmarkBitmapVisitBitsetAnd/131072/1    5975924 ns      5973199 ns          117 bytes_per_second=20.9268M/s
   BenchmarkBitmapVisitBitsetAnd/32768/2     1492305 ns      1491810 ns          467 bytes_per_second=20.9477M/s
   BenchmarkBitmapVisitBitsetAnd/131072/2    5987022 ns      5984924 ns          117 bytes_per_second=20.8858M/s
   BenchmarkBitmapVisitUInt8And/32768/0        83420 ns        83384 ns         8454 bytes_per_second=374.771M/s
   BenchmarkBitmapVisitUInt8And/131072/0      330943 ns       330824 ns         2091 bytes_per_second=377.844M/s
   BenchmarkBitmapVisitUInt8And/32768/0        83640 ns        83612 ns         8347 bytes_per_second=373.749M/s
   BenchmarkBitmapVisitUInt8And/131072/0      330971 ns       330857 ns         2115 bytes_per_second=377.807M/s
   BenchmarkBitmapVisitUInt8And/32768/1       104218 ns       104185 ns         6702 bytes_per_second=299.948M/s
   BenchmarkBitmapVisitUInt8And/131072/1      413367 ns       413210 ns         1691 bytes_per_second=302.509M/s
   BenchmarkBitmapVisitUInt8And/32768/2       104633 ns       104591 ns         6796 bytes_per_second=298.783M/s
   BenchmarkBitmapVisitUInt8And/131072/2      409852 ns       409681 ns         1679 bytes_per_second=305.115M/s
   BenchmarkBitmapVisitUInt64And/32768/0        7205 ns         7202 ns        97108 bytes_per_second=4.2372G/s
   BenchmarkBitmapVisitUInt64And/131072/0      28334 ns        28326 ns        24630 bytes_per_second=4.30952G/s
   BenchmarkBitmapVisitUInt64And/32768/0        7205 ns         7202 ns        97100 bytes_per_second=4.23729G/s
   BenchmarkBitmapVisitUInt64And/131072/0      28334 ns        28324 ns        24693 bytes_per_second=4.30981G/s
   BenchmarkBitmapVisitUInt64And/32768/1       10736 ns        10732 ns        65164 bytes_per_second=2.84349G/s
   BenchmarkBitmapVisitUInt64And/131072/1      36150 ns        36137 ns        19364 bytes_per_second=3.378G/s
   BenchmarkBitmapVisitUInt64And/32768/2       10739 ns        10734 ns        65218 bytes_per_second=2.84299G/s
   BenchmarkBitmapVisitUInt64And/131072/2      36137 ns        36127 ns        19342 bytes_per_second=3.37896G/s
   ```
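
The unaligned `BenchmarkBitmapAnd` rows (offsets 1 and 2) are where the two runs in this comment differ most. As a rough sketch, the speedup can be computed from the reported CPU times; the "before" values come from the table above and the "after" values from the "After this patch" table in this comment. The function name `unaligned_and_speedups` and the dictionary layout are illustrative choices, not part of the Arrow benchmark suite.

```python
# CPU times in nanoseconds, copied from the benchmark output in this comment.
# Keys are "<size>/<offset>" as printed after "BenchmarkBitmapAnd/".
BEFORE_CPU_NS = {
    "32768/1": 552530,
    "131072/1": 2229574,
    "32768/2": 565094,
    "131072/2": 2214782,
}
AFTER_CPU_NS = {
    "32768/1": 7301,
    "131072/1": 28332,
    "32768/2": 7307,
    "131072/2": 28274,
}


def unaligned_and_speedups():
    """Return before/after CPU-time ratios for the unaligned BitmapAnd cases."""
    return {
        case: BEFORE_CPU_NS[case] / AFTER_CPU_NS[case]
        for case in BEFORE_CPU_NS
    }


if __name__ == "__main__":
    for case, ratio in unaligned_and_speedups().items():
        print(f"BenchmarkBitmapAnd/{case}: {ratio:.1f}x faster")
```

On these numbers every unaligned case comes out between 75x and 79x faster. The ratios are single-run figures from one machine, so they indicate the order of magnitude rather than a precise measurement.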
   
   After this patch
   ```
   -------------------------------------------------------------------------------------------------
   Benchmark                                       Time             CPU   Iterations UserCounters...
   -------------------------------------------------------------------------------------------------
   BitmapReader/8192                          130861 ns       130803 ns         5349 bytes_per_second=119.454M/s
   VisitBits/8192                             125543 ns       125498 ns         5621 bytes_per_second=124.504M/s
   VisitBitsUnrolled/8192                      45482 ns        45463 ns        15377 bytes_per_second=343.69M/s
   BitmapWriter/8192                           89930 ns        89889 ns         7383 bytes_per_second=86.9129M/s
   FirstTimeBitmapWriter/8192                  77582 ns        77547 ns         9009 bytes_per_second=100.745M/s
   GenerateBits/8192                           83244 ns        83210 ns         8393 bytes_per_second=93.8886M/s
   GenerateBitsUnrolled/8192                   42691 ns        42673 ns        15958 bytes_per_second=183.079M/s
   CopyBitmapWithoutOffset/8192                  191 ns          191 ns      3661358 bytes_per_second=39.9303G/s
   CopyBitmapWithOffset/8192                    8582 ns         8578 ns        81828 bytes_per_second=910.725M/s
   BenchmarkBitmapAnd/32768/0                   4078 ns         4077 ns       171965 bytes_per_second=7.48507G/s
   BenchmarkBitmapAnd/131072/0                 17269 ns        17262 ns        41975 bytes_per_second=7.07161G/s
   BenchmarkBitmapAnd/32768/0                   4084 ns         4082 ns       171407 bytes_per_second=7.47578G/s
   BenchmarkBitmapAnd/131072/0                 16135 ns        16129 ns        43454 bytes_per_second=7.5683G/s

   [XXXXXXXXX: unaligned bitmap operation]
   BenchmarkBitmapAnd/32768/1                   7304 ns         7301 ns        95839 bytes_per_second=4.17989G/s
   BenchmarkBitmapAnd/131072/1                 28344 ns        28332 ns        24730 bytes_per_second=4.3085G/s
   BenchmarkBitmapAnd/32768/2                   7310 ns         7307 ns        95861 bytes_per_second=4.17663G/s
   BenchmarkBitmapAnd/131072/2                 28289 ns        28274 ns        24733 bytes_per_second=4.31734G/s

   BenchmarkBitmapVisitBitsetAnd/32768/0     1495320 ns      1494883 ns          468 bytes_per_second=20.9047M/s
   BenchmarkBitmapVisitBitsetAnd/131072/0    5986673 ns      5984492 ns          117 bytes_per_second=20.8873M/s
   BenchmarkBitmapVisitBitsetAnd/32768/0     1495659 ns      1495153 ns          468 bytes_per_second=20.9009M/s
   BenchmarkBitmapVisitBitsetAnd/131072/0    5985930 ns      5983460 ns          117 bytes_per_second=20.8909M/s
   BenchmarkBitmapVisitBitsetAnd/32768/1     1495922 ns      1495411 ns          468 bytes_per_second=20.8973M/s
   BenchmarkBitmapVisitBitsetAnd/131072/1    5981635 ns      5979222 ns          117 bytes_per_second=20.9057M/s
   BenchmarkBitmapVisitBitsetAnd/32768/2     1494916 ns      1494250 ns          468 bytes_per_second=20.9135M/s
   BenchmarkBitmapVisitBitsetAnd/131072/2    5986896 ns      5984091 ns          117 bytes_per_second=20.8887M/s
   BenchmarkBitmapVisitUInt8And/32768/0        83954 ns        83920 ns         8442 bytes_per_second=372.377M/s
   BenchmarkBitmapVisitUInt8And/131072/0      333061 ns       332921 ns         2081 bytes_per_second=375.465M/s
   BenchmarkBitmapVisitUInt8And/32768/0        84111 ns        84071 ns         8339 bytes_per_second=371.709M/s
   BenchmarkBitmapVisitUInt8And/131072/0      331619 ns       331460 ns         2104 bytes_per_second=377.119M/s
   BenchmarkBitmapVisitUInt8And/32768/1       104625 ns       104585 ns         6699 bytes_per_second=298.801M/s
   BenchmarkBitmapVisitUInt8And/131072/1      416265 ns       416111 ns         1680 bytes_per_second=300.401M/s
   BenchmarkBitmapVisitUInt8And/32768/2       104083 ns       104044 ns         6794 bytes_per_second=300.353M/s
   BenchmarkBitmapVisitUInt8And/131072/2      410391 ns       410236 ns         1679 bytes_per_second=304.703M/s
   BenchmarkBitmapVisitUInt64And/32768/0        7215 ns         7213 ns        97053 bytes_per_second=4.2309G/s
   BenchmarkBitmapVisitUInt64And/131072/0      28384 ns        28371 ns        24663 bytes_per_second=4.30262G/s
   BenchmarkBitmapVisitUInt64And/32768/0        7213 ns         7210 ns        97340 bytes_per_second=4.23244G/s
   BenchmarkBitmapVisitUInt64And/131072/0      28412 ns        28401 ns        24618 bytes_per_second=4.29805G/s
   BenchmarkBitmapVisitUInt64And/32768/1       10703 ns        10698 ns        65420 bytes_per_second=2.85252G/s
   BenchmarkBitmapVisitUInt64And/131072/1      36249 ns        36235 ns        19236 bytes_per_second=3.36889G/s
   BenchmarkBitmapVisitUInt64And/32768/2       10704 ns        10700 ns        65459 bytes_per_second=2.85207G/s
   BenchmarkBitmapVisitUInt64And/131072/2      36243 ns        36228 ns        19315 bytes_per_second=3.36946G/s
   ```

