nevi-me commented on PR #1830:
URL: https://github.com/apache/arrow-rs/pull/1830#issuecomment-1153235877

   M1 Pro
   
   * `nosimd` = nightly master with `RUSTFLAGS="-Ctarget-cpu=native"`
   * `simd` = nightly master with above and `--features simd`
   * `pr1830` = nightly on this PR with the above
   * `stable` = stable on this PR with `target-cpu=native` (didn't check master 
as there should be no difference)
   * `stablenoflags` = stable on this PR with no flags
   
   ```
   group                                nosimd                                 pr1830                                 simd                                   stable                                 stablenoflags
   -----                                ------                                 ------                                 ----                                   ------                                 -------------
   buffer_binary_ops/and                1.01    153.2±0.97ns    93.4 GB/sec    1.03    157.6±8.65ns    90.8 GB/sec    1.25    189.8±1.68ns    75.4 GB/sec    1.03   156.6±21.14ns    91.4 GB/sec    1.00    152.4±1.84ns    93.9 GB/sec
   buffer_binary_ops/and_with_offset    1.02    536.2±4.21ns    26.7 GB/sec    1.02    533.9±2.03ns    26.8 GB/sec    1.03    541.7±4.97ns    26.4 GB/sec    1.00    524.5±2.23ns    27.3 GB/sec    1.01    527.3±1.98ns    27.1 GB/sec
   buffer_binary_ops/or                 1.04    156.3±7.65ns    91.5 GB/sec    1.02    154.2±0.76ns    92.8 GB/sec    1.31   197.4±47.84ns    72.5 GB/sec    1.00    150.7±2.41ns    94.9 GB/sec    1.01    151.5±1.14ns    94.4 GB/sec
   buffer_binary_ops/or_with_offset     1.02    539.7±4.22ns    26.5 GB/sec    1.02    539.1±2.84ns    26.5 GB/sec    1.04    549.1±4.72ns    26.1 GB/sec    1.00    530.2±3.16ns    27.0 GB/sec    1.01    532.8±6.84ns    26.8 GB/sec
   buffer_unary_ops/not                 1.17   197.0±43.99ns    48.4 GB/sec    1.09    183.2±3.57ns    52.1 GB/sec    1.00    168.7±3.64ns    56.5 GB/sec    1.08    182.1±4.18ns    52.4 GB/sec    1.08    181.7±3.15ns    52.5 GB/sec
   buffer_unary_ops/not_with_offset     1.00    363.8±6.18ns    26.2 GB/sec    1.02    369.1±2.30ns    25.8 GB/sec    1.02    368.6±1.09ns    25.9 GB/sec    1.02   368.8±23.20ns    25.9 GB/sec    1.00    362.6±1.81ns    26.3 GB/sec
   ```
   
   The results are a bit hard to interpret, but what I can see is that stable 
with a CPU target flag isn't much better than without. That could make sense as 
there shouldn't be (m)any differences between ARMv8 CPUs that are supported, 
unlike x64 where there are all sorts of SIMD versions.
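   
   One way to sanity-check that reasoning is to look at which target features 
the compiler already assumes without any flags. The sketch below is not from 
this PR; `compile_time_features` is a hypothetical helper, and the assumption 
(worth verifying on your own toolchain) is that NEON is part of the baseline 
for the common aarch64 targets, while on x86_64 only SSE2 is baseline and AVX2 
needs `target-cpu=native` or an explicit `target-feature`.

```rust
/// Hypothetical helper: lists which of a few well-known SIMD features were
/// enabled when this code was compiled. `cfg!` is evaluated at compile time,
/// so the result reflects the target and RUSTFLAGS used for the build,
/// not the CPU the binary later runs on.
fn compile_time_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(target_feature = "neon") {
        enabled.push("neon");
    }
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    enabled
}
```

   Building this once with and once without `RUSTFLAGS="-Ctarget-cpu=native"` 
and comparing the two lists shows how much the flag actually changes on a 
given machine.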
   
   Of the 6 results, 5 are fastest on stable (with or without the flag), but 
not by a large margin compared to the other options, unlike what we saw with 
x64 in @tustvold's results.
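   
   For anyone reading the table, the first number in each column appears to be 
the mean time divided by the fastest mean in that row, so 1.00 marks the 
winner. A minimal sketch of that arithmetic, using the mean timings from the 
table; the names `CONFIGS`, `relative` and `fastest` are made up for 
illustration and are not part of any benchmark tooling:

```rust
/// Column order of the table above.
const CONFIGS: [&str; 5] = ["nosimd", "pr1830", "simd", "stable", "stablenoflags"];

/// Divides each mean time (in ns) by the smallest one in the row,
/// so the fastest configuration gets a factor of exactly 1.0.
fn relative(times: &[f64; 5]) -> [f64; 5] {
    let best = times.iter().copied().fold(f64::INFINITY, f64::min);
    let mut out = [0.0; 5];
    for (o, t) in out.iter_mut().zip(times.iter()) {
        *o = t / best;
    }
    out
}

/// Returns the name of the configuration with the smallest mean time.
/// On a tie, the first column wins.
fn fastest(times: &[f64; 5]) -> &'static str {
    let mut best = 0;
    for i in 1..times.len() {
        if times[i] < times[best] {
            best = i;
        }
    }
    CONFIGS[best]
}
```

   For the `buffer_binary_ops/and` row, 
`fastest(&[153.2, 157.6, 189.8, 156.6, 152.4])` picks `stablenoflags`, and the 
`simd` factor comes out at roughly 1.25, matching the table. Note that the 
error bars on several rows are larger than the gaps between columns.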
   
   Seeing as there's no regression, I'll merge this. Thanks @jhorstmann!

