zanmato1984 commented on PR #43832:
URL: https://github.com/apache/arrow/pull/43832#issuecomment-2322961505

   This is from my other desktop (Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz, 
Coffee Lake). It shows a similar symptom, possibly because it is also Coffee 
Lake, like my MBP.
   
   The scalar version:
   ```
   ARROW_USER_SIMD_LEVEL=NONE ./arrow-acero-hash-join-benchmark --benchmark_filter="BM_RowArray"
   2024-09-01T00:32:49+08:00
   Running ./arrow-acero-hash-join-benchmark
   Run on (8 X 4900 MHz CPU s)
   CPU Caches:
     L1 Data 32 KiB (x8)
     L1 Instruction 32 KiB (x8)
     L2 Unified 256 KiB (x8)
     L3 Unified 12288 KiB (x1)
   Load Average: 0.46, 3.08, 2.34
   ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
   -----------------------------------------------------------------------------------------------------------------------------------------------------------
   Benchmark                                                                                                 Time             CPU   Iterations UserCounters...
   -----------------------------------------------------------------------------------------------------------------------------------------------------------
   BM_RowArray_Decode/"boolean"                                                                         345809 ns       345761 ns         1896 rows/sec=189.538M/s
   BM_RowArray_Decode/"int8"                                                                            267577 ns       267553 ns         2678 rows/sec=244.942M/s
   BM_RowArray_Decode/"int16"                                                                           237106 ns       237094 ns         2872 rows/sec=276.409M/s
   BM_RowArray_Decode/"int32"                                                                           243701 ns       243697 ns         2874 rows/sec=268.92M/s
   BM_RowArray_Decode/"int64"                                                                           239891 ns       239886 ns         2709 rows/sec=273.192M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:3                                                       316511 ns       316471 ns         2260 rows/sec=207.081M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:5                                                       310797 ns       310759 ns         2165 rows/sec=210.887M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:6                                                       324059 ns       324020 ns         2251 rows/sec=202.256M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:7                                                       311799 ns       311753 ns         2244 rows/sec=210.214M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:9                                                       364401 ns       364346 ns         2016 rows/sec=179.87M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:16                                                      349918 ns       349868 ns         1997 rows/sec=187.313M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:42                                                      507058 ns       506962 ns         1427 rows/sec=129.27M/s
   BM_RowArray_DecodeBinary/max_length:32                                                              1261872 ns      1261465 ns          554 rows/sec=51.9515M/s
   BM_RowArray_DecodeBinary/max_length:64                                                              1585243 ns      1584698 ns          462 rows/sec=41.3549M/s
   BM_RowArray_DecodeBinary/max_length:128                                                             1822727 ns      1822343 ns          384 rows/sec=35.962M/s
   BM_RowArray_DecodeOneOfColumns/"fixed_length_row:{boolean,int32,fixed_size_binary(64)}"/column:0     379210 ns       379150 ns         1843 rows/sec=172.847M/s
   BM_RowArray_DecodeOneOfColumns/"fixed_length_row:{boolean,int32,fixed_size_binary(64)}"/column:1     275680 ns       275657 ns         2693 rows/sec=237.741M/s
   BM_RowArray_DecodeOneOfColumns/"fixed_length_row:{boolean,int32,fixed_size_binary(64)}"/column:2     599291 ns       599291 ns         1257 rows/sec=109.354M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:0                   506824 ns       506710 ns         1376 rows/sec=129.334M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:1                   360611 ns       360579 ns         2123 rows/sec=181.75M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:2                  1182248 ns      1181939 ns          603 rows/sec=55.447M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:3                  1395220 ns      1394817 ns          529 rows/sec=46.9847M/s
   ```
   
   The AVX2 version:
   ```
   ./arrow-acero-hash-join-benchmark --benchmark_filter="BM_RowArray"
   2024-09-01T00:33:14+08:00
   Running ./arrow-acero-hash-join-benchmark
   Run on (8 X 4900 MHz CPU s)
   CPU Caches:
     L1 Data 32 KiB (x8)
     L1 Instruction 32 KiB (x8)
     L2 Unified 256 KiB (x8)
     L3 Unified 12288 KiB (x1)
   Load Average: 0.64, 2.91, 2.31
   ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
   -----------------------------------------------------------------------------------------------------------------------------------------------------------
   Benchmark                                                                                                 Time             CPU   Iterations UserCounters...
   -----------------------------------------------------------------------------------------------------------------------------------------------------------
   BM_RowArray_Decode/"boolean"                                                                         262395 ns       262341 ns         2665 rows/sec=249.808M/s
   BM_RowArray_Decode/"int8"                                                                            263405 ns       263397 ns         2716 rows/sec=248.807M/s
   BM_RowArray_Decode/"int16"                                                                           248155 ns       248106 ns         2821 rows/sec=264.141M/s
   BM_RowArray_Decode/"int32"                                                                           257523 ns       257519 ns         2825 rows/sec=254.486M/s
   BM_RowArray_Decode/"int64"                                                                           245070 ns       245020 ns         2824 rows/sec=267.468M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:3                                                       330801 ns       330759 ns         1980 rows/sec=198.135M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:5                                                       327874 ns       327839 ns         2134 rows/sec=199.9M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:6                                                       331278 ns       331242 ns         1947 rows/sec=197.846M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:7                                                       328647 ns       328611 ns         2112 rows/sec=199.43M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:9                                                       335129 ns       335101 ns         1937 rows/sec=195.568M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:16                                                      347641 ns       347601 ns         2097 rows/sec=188.535M/s
   BM_RowArray_DecodeFixedSizeBinary/fixed_size:42                                                      408356 ns       408265 ns         1731 rows/sec=160.521M/s
   BM_RowArray_DecodeBinary/max_length:32                                                               985453 ns       985190 ns          716 rows/sec=66.5202M/s
   BM_RowArray_DecodeBinary/max_length:64                                                              1250078 ns      1249727 ns          560 rows/sec=52.4394M/s
   BM_RowArray_DecodeBinary/max_length:128                                                             1467264 ns      1466902 ns          474 rows/sec=44.6758M/s
   BM_RowArray_DecodeOneOfColumns/"fixed_length_row:{boolean,int32,fixed_size_binary(64)}"/column:0     266468 ns       266456 ns         2365 rows/sec=245.95M/s
   BM_RowArray_DecodeOneOfColumns/"fixed_length_row:{boolean,int32,fixed_size_binary(64)}"/column:1     246552 ns       246557 ns         2803 rows/sec=265.8M/s
   BM_RowArray_DecodeOneOfColumns/"fixed_length_row:{boolean,int32,fixed_size_binary(64)}"/column:2     437251 ns       437236 ns         1504 rows/sec=149.885M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:0                   455065 ns       455005 ns         1603 rows/sec=144.031M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:1                   445927 ns       445798 ns         1560 rows/sec=147.006M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:2                  1033287 ns      1032913 ns          702 rows/sec=63.4468M/s
   BM_RowArray_DecodeOneOfColumns/"var_length_row:{boolean,int32,utf8,utf8}"/column:3                  1193991 ns      1193373 ns          544 rows/sec=54.9158M/s
   ```
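
For anyone who wants to compare the two runs row by row, here is a rough sketch in Python. It is not part of the benchmark or of the PR. The regular expression and the helper names (`parse_cpu_ns`, `speedups`) are my own assumptions about how one might read this console output, and the two sample strings hold only a few rows copied from the logs above. The pattern tolerates rows that are wrapped across several lines.

```python
import re

# One benchmark row: name, wall time (ns), CPU time (ns), iterations.
# \s+ also matches newlines, so rows wrapped over several lines still parse.
ROW = re.compile(r'(BM_\S+)\s+(\d+) ns\s+(\d+) ns\s+(\d+)')


def parse_cpu_ns(text):
    """Map each benchmark name to its CPU time in nanoseconds."""
    return {name: int(cpu) for name, _time, cpu, _iters in ROW.findall(text)}


def speedups(scalar_text, avx2_text):
    """Return scalar CPU time / AVX2 CPU time per benchmark (> 1: AVX2 faster)."""
    scalar = parse_cpu_ns(scalar_text)
    avx2 = parse_cpu_ns(avx2_text)
    return {name: scalar[name] / avx2[name] for name in scalar if name in avx2}


# A few rows copied from the scalar run (ARROW_USER_SIMD_LEVEL=NONE).
SCALAR_SAMPLE = '''
BM_RowArray_Decode/"boolean"
                     345809 ns       345761 ns         1896
rows/sec=189.538M/s
BM_RowArray_Decode/"int32"
                     243701 ns       243697 ns         2874
rows/sec=268.92M/s
BM_RowArray_DecodeBinary/max_length:32
                    1261872 ns      1261465 ns          554
rows/sec=51.9515M/s
'''

# The same rows copied from the AVX2 run.
AVX2_SAMPLE = '''
BM_RowArray_Decode/"boolean"
                     262395 ns       262341 ns         2665
rows/sec=249.808M/s
BM_RowArray_Decode/"int32"
                     257523 ns       257519 ns         2825
rows/sec=254.486M/s
BM_RowArray_DecodeBinary/max_length:32
                     985453 ns       985190 ns          716
rows/sec=66.5202M/s
'''

if __name__ == "__main__":
    for name, ratio in speedups(SCALAR_SAMPLE, AVX2_SAMPLE).items():
        print(f"{name}: {ratio:.2f}x")
```

A ratio above 1 means the AVX2 run used less CPU time for that row, and a ratio below 1 means the scalar run did. Both runs print the CPU scaling warning, so small differences may be noise.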

