pitrou commented on PR #46789:
URL: https://github.com/apache/arrow/pull/46789#issuecomment-2995463508

   Local benchmark results on my AMD Ryzen 9 3900X CPU:
   ```
   
   ------------------------------------------------------------------------------------------------------------------------
   Non-regressions: (40)
   ------------------------------------------------------------------------------------------------------------------------
                                         benchmark       baseline      contender  change %  counters
     BM_ByteStreamSplitDecode_FLBA_Generic<2>/1024  4.020 GiB/sec 59.711 GiB/sec  1385.469  {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_FLBA_Generic<2>/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1463689}
    BM_ByteStreamSplitDecode_FLBA_Generic<2>/65536  4.057 GiB/sec 53.022 GiB/sec  1206.800  {'family_index': 2, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_FLBA_Generic<2>/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 23272}
    BM_ByteStreamSplitEncode_FLBA_Generic<2>/65536  4.690 GiB/sec  7.451 GiB/sec    58.859  {'family_index': 7, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_FLBA_Generic<2>/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 27316}
     BM_ByteStreamSplitEncode_FLBA_Generic<2>/1024  4.777 GiB/sec  7.398 GiB/sec    54.878  {'family_index': 7, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_FLBA_Generic<2>/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1749779}
      BM_ByteStreamSplitEncode_Double_Generic/1024  7.294 GiB/sec  8.597 GiB/sec    17.874  {'family_index': 6, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Double_Generic/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 654439}
          BM_ByteStreamSplitDecode_Float_Sse2/1024  7.816 GiB/sec  8.696 GiB/sec    11.247  {'family_index': 14, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Float_Sse2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1478390}
         BM_ByteStreamSplitEncode_Double_Sse2/1024  7.656 GiB/sec  8.466 GiB/sec    10.575  {'family_index': 17, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Double_Sse2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 685857}
         BM_ByteStreamSplitDecode_Float_Sse2/65536  7.368 GiB/sec  8.091 GiB/sec     9.813  {'family_index': 14, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Float_Sse2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 21078}
        BM_ByteStreamSplitEncode_Double_Sse2/65536  7.303 GiB/sec  7.784 GiB/sec     6.596  {'family_index': 17, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Double_Sse2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 10413}
     BM_ByteStreamSplitEncode_Double_Generic/65536  7.498 GiB/sec  7.918 GiB/sec     5.600  {'family_index': 6, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Double_Generic/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 10468}
        BM_ByteStreamSplitDecode_Double_Sse2/65536  8.489 GiB/sec  8.720 GiB/sec     2.720  {'family_index': 15, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Double_Sse2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 12255}
         BM_ByteStreamSplitDecode_Double_Sse2/1024  9.153 GiB/sec  9.301 GiB/sec     1.619  {'family_index': 15, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Double_Sse2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 850920}
    BM_ByteStreamSplitEncode_FLBA_Generic<16>/1024  5.002 GiB/sec  5.059 GiB/sec     1.142  {'family_index': 9, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_FLBA_Generic<16>/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 228207}
        BM_ByteStreamSplitDecode_Double_Avx2/65536 13.154 GiB/sec 13.270 GiB/sec     0.886  {'family_index': 19, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Double_Avx2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 18728}
          BM_ByteStreamSplitEncode_Float_Avx2/1024 12.957 GiB/sec 13.062 GiB/sec     0.809  {'family_index': 20, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Float_Avx2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2420490}
          BM_ByteStreamSplitDecode_Float_Avx2/1024 19.404 GiB/sec 19.527 GiB/sec     0.634  {'family_index': 18, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Float_Avx2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3547412}
         BM_ByteStreamSplitDecode_Double_Avx2/1024 13.644 GiB/sec 13.690 GiB/sec     0.338  {'family_index': 19, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Double_Avx2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1263795}
      BM_ByteStreamSplitEncode_Float_Generic/65536 13.313 GiB/sec 13.346 GiB/sec     0.244  {'family_index': 5, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Float_Generic/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 38128}
       BM_ByteStreamSplitDecode_Float_Scalar/65536  4.038 GiB/sec  4.046 GiB/sec     0.199  {'family_index': 10, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Float_Scalar/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 11709}
   BM_ByteStreamSplitEncode_FLBA_Generic<16>/65536  4.740 GiB/sec  4.748 GiB/sec     0.164  {'family_index': 9, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_FLBA_Generic<16>/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3418}
      BM_ByteStreamSplitEncode_Double_Scalar/65536  5.002 GiB/sec  4.999 GiB/sec    -0.058  {'family_index': 13, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Double_Scalar/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 7119}
      BM_ByteStreamSplitDecode_Double_Scalar/65536  3.959 GiB/sec  3.956 GiB/sec    -0.078  {'family_index': 11, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Double_Scalar/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 5572}
     BM_ByteStreamSplitEncode_FLBA_Generic<7>/1024  4.996 GiB/sec  4.993 GiB/sec    -0.078  {'family_index': 8, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_FLBA_Generic<7>/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 522897}
         BM_ByteStreamSplitEncode_Float_Avx2/65536 13.052 GiB/sec 13.017 GiB/sec    -0.269  {'family_index': 20, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Float_Avx2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 37065}
         BM_ByteStreamSplitDecode_Float_Avx2/65536 19.584 GiB/sec 19.527 GiB/sec    -0.293  {'family_index': 18, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Float_Avx2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 55415}
       BM_ByteStreamSplitDecode_Double_Scalar/1024  4.056 GiB/sec  4.038 GiB/sec    -0.436  {'family_index': 11, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Double_Scalar/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 373026}
    BM_ByteStreamSplitEncode_FLBA_Generic<7>/65536  4.972 GiB/sec  4.940 GiB/sec    -0.649  {'family_index': 8, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_FLBA_Generic<7>/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 8158}
       BM_ByteStreamSplitDecode_Float_Generic/1024 19.630 GiB/sec 19.501 GiB/sec    -0.657  {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Float_Generic/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3536526}
     BM_ByteStreamSplitDecode_FLBA_Generic<7>/1024  4.030 GiB/sec  4.000 GiB/sec    -0.730  {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_FLBA_Generic<7>/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 422470}
     BM_ByteStreamSplitDecode_Double_Generic/65536 13.394 GiB/sec 13.293 GiB/sec    -0.753  {'family_index': 1, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Double_Generic/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 19177}
        BM_ByteStreamSplitDecode_Float_Scalar/1024  4.044 GiB/sec  4.013 GiB/sec    -0.767  {'family_index': 10, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Float_Scalar/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 728013}
       BM_ByteStreamSplitEncode_Double_Scalar/1024  5.140 GiB/sec  5.098 GiB/sec    -0.813  {'family_index': 13, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Double_Scalar/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 465638}
      BM_ByteStreamSplitDecode_Double_Generic/1024 13.988 GiB/sec 13.866 GiB/sec    -0.868  {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_Double_Generic/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1285102}
        BM_ByteStreamSplitEncode_Float_Scalar/1024  5.112 GiB/sec  5.062 GiB/sec    -0.975  {'family_index': 12, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Float_Scalar/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 938352}
      BM_ByteStreamSplitDecode_Float_Generic/65536 20.086 GiB/sec 19.843 GiB/sec    -1.209  {'family_index': 0, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_Float_Generic/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 57481}
       BM_ByteStreamSplitEncode_Float_Scalar/65536  5.080 GiB/sec  5.014 GiB/sec    -1.289  {'family_index': 12, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Float_Scalar/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 14556}
    BM_ByteStreamSplitDecode_FLBA_Generic<7>/65536  3.992 GiB/sec  3.939 GiB/sec    -1.339  {'family_index': 3, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_FLBA_Generic<7>/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 6540}
       BM_ByteStreamSplitEncode_Float_Generic/1024 13.396 GiB/sec 13.206 GiB/sec    -1.422  {'family_index': 5, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Float_Generic/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2450449}
    BM_ByteStreamSplitDecode_FLBA_Generic<16>/1024  4.034 GiB/sec  3.900 GiB/sec    -3.329  {'family_index': 4, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitDecode_FLBA_Generic<16>/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 185049}
   BM_ByteStreamSplitDecode_FLBA_Generic<16>/65536  3.871 GiB/sec  3.689 GiB/sec    -4.711  {'family_index': 4, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitDecode_FLBA_Generic<16>/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2777}
   
   
   ------------------------------------------------------------------------------------------------------------------------
   Regressions: (2)
   ------------------------------------------------------------------------------------------------------------------------
                                   benchmark       baseline     contender  change %  counters
    BM_ByteStreamSplitEncode_Float_Sse2/1024 11.229 GiB/sec 8.120 GiB/sec   -27.690  {'family_index': 16, 'per_family_instance_index': 0, 'run_name': 'BM_ByteStreamSplitEncode_Float_Sse2/1024', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2049401}
   BM_ByteStreamSplitEncode_Float_Sse2/65536 11.278 GiB/sec 7.951 GiB/sec   -29.498  {'family_index': 16, 'per_family_instance_index': 1, 'run_name': 'BM_ByteStreamSplitEncode_Float_Sse2/65536', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 25535}
   ```
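
For readers checking the `change %` column by hand, here is a minimal Python sketch. It assumes the column is the relative throughput difference between contender and baseline; that is an assumption, not something the comparison tool's output confirms. The helper name `change_percent` is hypothetical. Because the inputs below are the rounded GiB/sec figures printed in the table, the results should only approximately match the reported percentages.

```python
def change_percent(baseline: float, contender: float) -> float:
    """Relative change of contender vs. baseline, in percent.

    Assumption: higher throughput is better, so a positive value
    is an improvement and a negative value is a regression.
    """
    return (contender - baseline) / baseline * 100.0


# Rounded throughputs (GiB/sec) copied from the benchmark table.
rows = {
    "BM_ByteStreamSplitDecode_FLBA_Generic<2>/1024": (4.020, 59.711),
    "BM_ByteStreamSplitEncode_Float_Sse2/65536": (11.278, 7.951),
}

for name, (baseline, contender) in rows.items():
    print(f"{name}: {change_percent(baseline, contender):+.1f}%")
```

Small differences from the reported values (for example against 1385.469 for the first row) would be expected if the tool computes the ratio from unrounded measurements.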


