cyb70289 commented on pull request #11458:
URL: https://github.com/apache/arrow/pull/11458#issuecomment-946408775


   A bit surprised that gcc is much slower than clang (roughly 0.40x to 0.45x 
the throughput) in the Add256 and Multiply256 benchmarks on a Xeon Gold 5218.
   No obvious difference between gcc and clang is observed on arm64 Neoverse N1.
   
   clang-10, xeon gold 5218
   ```
   ----------------------------------------------------------------------------------
   Benchmark                        Time             CPU   Iterations UserCounters...
   ----------------------------------------------------------------------------------
   FromString                     353 ns          353 ns      1985996 items_per_second=17.011M/s
   ToString                       432 ns          432 ns      1621880 items_per_second=13.8972M/s
   BinaryMathOpAdd128            23.1 ns         23.1 ns     30187024 items_per_second=432.948M/s
   BinaryMathOpMultiply128       38.1 ns         38.1 ns     18345407 items_per_second=262.42M/s
   BinaryMathOpDivide128          471 ns          471 ns      1499617 items_per_second=21.2288M/s
   BinaryMathOpAdd256            36.3 ns         36.3 ns     19154265 items_per_second=275.707M/s
   BinaryMathOpMultiply256        106 ns          106 ns      6580549 items_per_second=93.995M/s
   BinaryMathOpDivide256          826 ns          826 ns       839848 items_per_second=12.1013M/s
   BinaryMathOpAggregate          223 ns          223 ns      3138443 items_per_second=44.8313M/s
   BinaryCompareOp               34.7 ns         34.7 ns     20177291 items_per_second=288.247M/s
   BinaryCompareOpConstant       31.9 ns         31.9 ns     21965275 items_per_second=313.771M/s
   UnaryOp                       24.7 ns         24.7 ns     28309678 items_per_second=404.377M/s
   Constants                     8.18 ns         8.18 ns     85797082 items_per_second=244.593M/s
   BinaryBitOp                   23.0 ns         23.0 ns     30372604 items_per_second=434.095M/s
   ```
   
   gcc-9.3, xeon gold 5218
   ```
   ----------------------------------------------------------------------------------
   Benchmark                        Time             CPU   Iterations UserCounters...
   ----------------------------------------------------------------------------------
   FromString                     306 ns          306 ns      2287193 items_per_second=19.6159M/s
   ToString                       366 ns          366 ns      1911365 items_per_second=16.38M/s
   BinaryMathOpAdd128            25.8 ns         25.8 ns     27128130 items_per_second=387.526M/s
   BinaryMathOpMultiply128       50.9 ns         50.9 ns     13754093 items_per_second=196.502M/s
   BinaryMathOpDivide128          534 ns          534 ns      1311098 items_per_second=18.7322M/s
   BinaryMathOpAdd256            89.9 ns         89.9 ns      7778370 items_per_second=111.177M/s
   BinaryMathOpMultiply256        237 ns          237 ns      2959045 items_per_second=42.2758M/s
   BinaryMathOpDivide256          833 ns          833 ns       840766 items_per_second=12.0099M/s
   BinaryMathOpAggregate          259 ns          259 ns      2707564 items_per_second=38.6741M/s
   BinaryCompareOp               32.2 ns         32.2 ns     21941410 items_per_second=310.521M/s
   BinaryCompareOpConstant       28.5 ns         28.5 ns     24753989 items_per_second=350.707M/s
   UnaryOp                       25.8 ns         25.8 ns     27129232 items_per_second=387.547M/s
   Constants                     7.88 ns         7.88 ns     88784342 items_per_second=253.659M/s
   BinaryBitOp                   26.9 ns         26.9 ns     26023211 items_per_second=371.79M/s
   ```
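
For reference, the gcc-versus-clang gap on the two 256-bit benchmarks can be worked out directly from the Time columns above. This is only a rough sketch: the timings are copied by hand from the two Xeon Gold 5218 tables, and the helper name `relative_throughput` is made up for illustration, not part of Arrow or Google Benchmark.

```python
# Per-iteration times in ns, copied from the two Xeon Gold 5218 tables above.
clang_ns = {"BinaryMathOpAdd256": 36.3, "BinaryMathOpMultiply256": 106.0}
gcc_ns = {"BinaryMathOpAdd256": 89.9, "BinaryMathOpMultiply256": 237.0}


def relative_throughput(baseline_ns, candidate_ns):
    """Candidate throughput as a fraction of the baseline.

    Throughput is inversely proportional to time per iteration, so the
    ratio is baseline time divided by candidate time.
    """
    return {name: baseline_ns[name] / candidate_ns[name] for name in baseline_ns}


# Hypothetical helper applied with clang as the baseline and gcc as the candidate.
ratios = relative_throughput(clang_ns, gcc_ns)
for name, ratio in ratios.items():
    print(f"{name}: gcc runs at {ratio:.2f}x clang throughput")
```

With these inputs the ratios come out near 0.40 for Add256 and 0.45 for Multiply256, which matches the ratio of the reported `items_per_second` counters.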



