AlenkaF commented on PR #40867:
URL: https://github.com/apache/arrow/pull/40867#issuecomment-2025342730
I have run the benchmarks for uniform data types on this branch. The benchmarks have not yet been updated to keep using the column-major layout as before (`row_major=false`); they now use the row-major layout that this PR makes the default. The diff is therefore measuring the difference between column-major (baseline) and row-major (contender) conversion, which is great to see:
```
(pyarrow-dev) alenkafrim@Alenkas-MacBook-Pro arrow % archery --quiet benchmark diff --benchmark-filter=BatchToTensorSimple
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (1)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark                                                     baseline         contender        change %   counters
BatchToTensorSimple<Int64Type>/size:65536/num_columns:300     1.257 GiB/sec    1.206 GiB/sec    -4.048     {'family_index': 3, 'per_family_instance_index': 2, 'run_name': 'BatchToTensorSimple<Int64Type>/size:65536/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 14562}
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (23)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark                                                     baseline         contender        change %   counters
BatchToTensorSimple<Int32Type>/size:65536/num_columns:300     1.251 GiB/sec    1.160 GiB/sec    -7.312     {'family_index': 2, 'per_family_instance_index': 2, 'run_name': 'BatchToTensorSimple<Int32Type>/size:65536/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 14214}
BatchToTensorSimple<Int16Type>/size:65536/num_columns:300     1.219 GiB/sec    1.002 GiB/sec    -17.740    {'family_index': 1, 'per_family_instance_index': 2, 'run_name': 'BatchToTensorSimple<Int16Type>/size:65536/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 14454}
BatchToTensorSimple<Int64Type>/size:65536/num_columns:30      9.126 GiB/sec    7.340 GiB/sec    -19.563    {'family_index': 3, 'per_family_instance_index': 1, 'run_name': 'BatchToTensorSimple<Int64Type>/size:65536/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 102624}
BatchToTensorSimple<Int64Type>/size:4194304/num_columns:300   11.804 GiB/sec   7.882 GiB/sec    -33.223    {'family_index': 3, 'per_family_instance_index': 5, 'run_name': 'BatchToTensorSimple<Int64Type>/size:4194304/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2156}
BatchToTensorSimple<Int64Type>/size:65536/num_columns:3       26.569 GiB/sec   17.379 GiB/sec   -34.590    {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'BatchToTensorSimple<Int64Type>/size:65536/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 369358}
BatchToTensorSimple<Int64Type>/size:4194304/num_columns:3     14.526 GiB/sec   8.555 GiB/sec    -41.104    {'family_index': 3, 'per_family_instance_index': 3, 'run_name': 'BatchToTensorSimple<Int64Type>/size:4194304/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2579}
BatchToTensorSimple<Int32Type>/size:65536/num_columns:30      9.299 GiB/sec    5.168 GiB/sec    -44.419    {'family_index': 2, 'per_family_instance_index': 1, 'run_name': 'BatchToTensorSimple<Int32Type>/size:65536/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 103448}
BatchToTensorSimple<Int8Type>/size:65536/num_columns:300      1.240 GiB/sec    660.038 MiB/sec  -48.011    {'family_index': 0, 'per_family_instance_index': 2, 'run_name': 'BatchToTensorSimple<Int8Type>/size:65536/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 13673}
BatchToTensorSimple<Int32Type>/size:4194304/num_columns:3     14.236 GiB/sec   6.438 GiB/sec    -54.776    {'family_index': 2, 'per_family_instance_index': 3, 'run_name': 'BatchToTensorSimple<Int32Type>/size:4194304/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2583}
BatchToTensorSimple<Int16Type>/size:65536/num_columns:30      9.152 GiB/sec    3.796 GiB/sec    -58.521    {'family_index': 1, 'per_family_instance_index': 1, 'run_name': 'BatchToTensorSimple<Int16Type>/size:65536/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 105352}
BatchToTensorSimple<Int64Type>/size:4194304/num_columns:30    13.652 GiB/sec   5.379 GiB/sec    -60.597    {'family_index': 3, 'per_family_instance_index': 4, 'run_name': 'BatchToTensorSimple<Int64Type>/size:4194304/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2499}
BatchToTensorSimple<Int32Type>/size:65536/num_columns:3       27.147 GiB/sec   8.674 GiB/sec    -68.049    {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'BatchToTensorSimple<Int32Type>/size:65536/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 343999}
BatchToTensorSimple<Int16Type>/size:4194304/num_columns:3     14.370 GiB/sec   4.404 GiB/sec    -69.348    {'family_index': 1, 'per_family_instance_index': 3, 'run_name': 'BatchToTensorSimple<Int16Type>/size:4194304/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2673}
BatchToTensorSimple<Int32Type>/size:4194304/num_columns:300   12.017 GiB/sec   3.332 GiB/sec    -72.269    {'family_index': 2, 'per_family_instance_index': 5, 'run_name': 'BatchToTensorSimple<Int32Type>/size:4194304/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2158}
BatchToTensorSimple<Int16Type>/size:65536/num_columns:3       24.767 GiB/sec   5.370 GiB/sec    -78.317    {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'BatchToTensorSimple<Int16Type>/size:65536/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 332207}
BatchToTensorSimple<Int32Type>/size:4194304/num_columns:30    13.938 GiB/sec   2.928 GiB/sec    -78.994    {'family_index': 2, 'per_family_instance_index': 4, 'run_name': 'BatchToTensorSimple<Int32Type>/size:4194304/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2446}
BatchToTensorSimple<Int16Type>/size:4194304/num_columns:30    12.799 GiB/sec   2.006 GiB/sec    -84.327    {'family_index': 1, 'per_family_instance_index': 4, 'run_name': 'BatchToTensorSimple<Int16Type>/size:4194304/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2448}
BatchToTensorSimple<Int16Type>/size:4194304/num_columns:300   12.092 GiB/sec   1.859 GiB/sec    -84.624    {'family_index': 1, 'per_family_instance_index': 5, 'run_name': 'BatchToTensorSimple<Int16Type>/size:4194304/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2196}
BatchToTensorSimple<Int8Type>/size:65536/num_columns:30       9.130 GiB/sec    1.236 GiB/sec    -86.461    {'family_index': 0, 'per_family_instance_index': 1, 'run_name': 'BatchToTensorSimple<Int8Type>/size:65536/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 103747}
BatchToTensorSimple<Int8Type>/size:4194304/num_columns:3      13.566 GiB/sec   1.263 GiB/sec    -90.691    {'family_index': 0, 'per_family_instance_index': 3, 'run_name': 'BatchToTensorSimple<Int8Type>/size:4194304/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2480}
BatchToTensorSimple<Int8Type>/size:4194304/num_columns:30     13.245 GiB/sec   939.018 MiB/sec  -93.077    {'family_index': 0, 'per_family_instance_index': 4, 'run_name': 'BatchToTensorSimple<Int8Type>/size:4194304/num_columns:30', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2366}
BatchToTensorSimple<Int8Type>/size:4194304/num_columns:300    11.459 GiB/sec   702.520 MiB/sec  -94.013    {'family_index': 0, 'per_family_instance_index': 5, 'run_name': 'BatchToTensorSimple<Int8Type>/size:4194304/num_columns:300', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2029}
BatchToTensorSimple<Int8Type>/size:65536/num_columns:3        29.609 GiB/sec   1.391 GiB/sec    -95.302    {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'BatchToTensorSimple<Int8Type>/size:65536/num_columns:3', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 294453}
```
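For reference, a minimal pyarrow sketch of the two layouts these numbers compare, assuming the `RecordBatch.to_tensor(row_major=...)` keyword this PR adds (the benchmarks themselves exercise the C++ `RecordBatch::ToTensor` path directly; the batch below is just an illustrative fixture):

```python
import pyarrow as pa

# Small uniform-type batch, similar in spirit to what BatchToTensorSimple builds.
batch = pa.record_batch(
    [pa.array(range(6), type=pa.int64()) for _ in range(3)],
    names=["a", "b", "c"],
)

# Row-major conversion: the default proposed in this PR (contender).
row_major = batch.to_tensor()
# Column-major conversion: the previous behaviour (baseline).
col_major = batch.to_tensor(row_major=False)

# The data is identical; only the memory layout (strides) differs.
print(row_major.strides)  # e.g. (24, 8) for a 6 x 3 int64 tensor
print(col_major.strides)  # e.g. (8, 48)
```

So the slowdowns above come purely from the layout change, and the old column-major behaviour stays reachable by passing `row_major=False` explicitly.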