pitrou commented on PR #41700:
URL: https://github.com/apache/arrow/pull/41700#issuecomment-2302064092

   I ran the new benchmarks (those with a small selection factor, 0.05 here)
and the results are more varied; see below:
   ```
   
   ------------------------------------------------------------------------------------------------------------------------
   Non-regressions: (13)
   ------------------------------------------------------------------------------------------------------------------------
   benchmark                                                                baseline           contender  change %  counters
   TakeChunkedFlatInt64FewMonotonicIndices/524288/0                 3.487G items/sec    6.131G items/sec    75.818  {'family_index': 5, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4676, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewMonotonicIndices/524288/0              3.397G items/sec    5.278G items/sec    55.353  {'family_index': 1, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4603, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewMonotonicIndices/524288/1                 3.019G items/sec    4.508G items/sec    49.291  {'family_index': 5, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4000, 'null_percent': 100.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewMonotonicIndices/524288/1              2.802G items/sec    4.135G items/sec    47.589  {'family_index': 1, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3740, 'null_percent': 100.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewMonotonicIndices/524288/1000              2.316G items/sec    2.910G items/sec    25.628  {'family_index': 5, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3104, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewMonotonicIndices/524288/1000           2.254G items/sec    2.684G items/sec    19.075  {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3082, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewMonotonicIndices/524288/10                2.257G items/sec    2.635G items/sec    16.733  {'family_index': 5, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3052, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewMonotonicIndices/524288/10             2.179G items/sec    2.466G items/sec    13.175  {'family_index': 1, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2878, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1       5.541G items/sec    5.568G items/sec     0.490  {'family_index': 2, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 7344, 'null_percent': 100.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewMonotonicIndices/524288/1             2.753G items/sec    2.736G items/sec    -0.625  {'family_index': 3, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3674, 'null_percent': 100.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewMonotonicIndices/524288/2                 1.880G items/sec    1.852G items/sec    -1.442  {'family_index': 5, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2520, 'null_percent': 50.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewMonotonicIndices/524288/2              1.818G items/sec    1.766G items/sec    -2.872  {'family_index': 1, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2403, 'null_percent': 50.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1000  269.164M items/sec  256.848M items/sec    -4.576  {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 358, 'null_percent': 0.1, 'selection_factor': 0.05}
   
   
   ------------------------------------------------------------------------------------------------------------------------
   Regressions: (17)
   ------------------------------------------------------------------------------------------------------------------------
   benchmark                                                                baseline           contender  change %  counters
   TakeChunkedChunkedStringFewMonotonicIndices/524288/2           854.187M items/sec  810.956M items/sec    -5.061  {'family_index': 3, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1124, 'null_percent': 50.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewMonotonicIndices/524288/0           420.243M items/sec  393.347M items/sec    -6.400  {'family_index': 3, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 561, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/2     751.346M items/sec  700.885M items/sec    -6.716  {'family_index': 2, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 982, 'null_percent': 50.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewMonotonicIndices/524288/1000        383.643M items/sec  353.206M items/sec    -7.934  {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 511, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/0     308.601M items/sec  283.361M items/sec    -8.179  {'family_index': 2, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 411, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/10    327.708M items/sec  300.772M items/sec    -8.220  {'family_index': 2, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewMonotonicIndices/524288/10          452.054M items/sec  393.971M items/sec   -12.849  {'family_index': 3, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 602, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/2        1.276G items/sec  509.634M items/sec   -60.060  {'family_index': 0, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1691, 'null_percent': 50.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/2           1.323G items/sec  517.354M items/sec   -60.894  {'family_index': 4, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1750, 'null_percent': 50.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/10       1.615G items/sec  549.705M items/sec   -65.960  {'family_index': 0, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2171, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/10          1.709G items/sec  553.717M items/sec   -67.595  {'family_index': 4, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2274, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1000     2.026G items/sec  587.265M items/sec   -71.021  {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2670, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1000        2.078G items/sec  588.468M items/sec   -71.680  {'family_index': 4, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2771, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/0        3.429G items/sec  717.438M items/sec   -79.076  {'family_index': 0, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4639, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/0           3.582G items/sec  721.296M items/sec   -79.861  {'family_index': 4, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4799, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1        3.873G items/sec  759.915M items/sec   -80.381  {'family_index': 0, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 5316, 'null_percent': 100.0, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1           4.173G items/sec  761.551M items/sec   -81.751  {'family_index': 4, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 5385, 'null_percent': 100.0, 'selection_factor': 0.05}
   ```
   
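   Setting aside the String perf changes, which are minor (between about
+0.5% and -13%) and probably irrelevant, we can see that on monotonic indices
this PR actually increases Int64 performance (by roughly 13% to 76%, except at
50% nulls, where it is essentially flat), while still hurting it on random
indices (roughly 60% to 82% slower).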
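
For anyone reading the table above, here is a minimal sketch of how the
"change %" column can be reproduced from the two throughput columns. It assumes
the column is the relative difference in items/sec between contender and
baseline, which is consistent with the reported figures up to rounding. The
helper name `change_percent` and the row labels are hypothetical and not part
of any Arrow tooling; the numbers are copied from three rows of the table.

```python
# Throughputs in items/sec, copied from three rows of the benchmark table.
# Labels are informal descriptions, not benchmark names.
ROWS = {
    "Int64, monotonic indices, 0% nulls": (3.487e9, 6.131e9),
    "Int64, random indices, 0% nulls": (3.582e9, 721.296e6),
    "String, monotonic indices, 10% nulls": (452.054e6, 393.971e6),
}


def change_percent(baseline: float, contender: float) -> float:
    """Relative throughput change of contender vs. baseline, in percent.

    Assumption: this mirrors how the "change %" column is derived.
    Positive means the contender is faster (more items/sec).
    """
    return (contender - baseline) / baseline * 100.0


for label, (baseline, contender) in ROWS.items():
    print(f"{label}: {change_percent(baseline, contender):+.1f}%")
```

Because the table shows throughputs rounded to four significant digits, the
recomputed values can differ from the reported ones in the last decimal place.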
   

