neilconway commented on PR #20588:
URL: https://github.com/apache/datafusion/pull/20588#issuecomment-3993045129
Here's what I get on a Hetzner cloud box (cax31):
```
group                                        base                             target
-----                                        ----                             ------
array_has_all/all_found_small_needle/10      4.70      6.6±0.23ms  ? ?/sec    1.00  1407.2±12.51µs  ? ?/sec
array_has_all/all_found_small_needle/100     1.46     15.6±0.13ms  ? ?/sec    1.00     10.7±0.08ms  ? ?/sec
array_has_all/all_found_small_needle/500     1.00     55.9±2.89ms  ? ?/sec    1.57     87.6±1.55ms  ? ?/sec
array_has_all/not_all_found/10               5.65      6.3±0.13ms  ? ?/sec    1.00  1108.0±12.92µs  ? ?/sec
array_has_all/not_all_found/100              1.57     14.2±0.15ms  ? ?/sec    1.00      9.0±0.08ms  ? ?/sec
array_has_all/not_all_found/500              1.00     49.6±0.43ms  ? ?/sec    1.63     80.7±1.71ms  ? ?/sec
array_has_all_strings/all_found/10           2.21      4.6±0.04ms  ? ?/sec    1.00      2.1±0.04ms  ? ?/sec
array_has_all_strings/all_found/100          1.29     15.4±0.12ms  ? ?/sec    1.00     11.9±0.15ms  ? ?/sec
array_has_all_strings/all_found/500          1.00     59.2±0.94ms  ? ?/sec    1.56     92.6±1.24ms  ? ?/sec
array_has_all_strings/not_all_found/10       2.82      3.9±0.03ms  ? ?/sec    1.00  1386.6±17.25µs  ? ?/sec
array_has_all_strings/not_all_found/100      1.34     13.7±0.11ms  ? ?/sec    1.00     10.2±0.15ms  ? ?/sec
array_has_all_strings/not_all_found/500      1.00     70.5±0.74ms  ? ?/sec    1.46    102.8±1.37ms  ? ?/sec
array_has_any/no_match/10                    3.17      7.3±0.04ms  ? ?/sec    1.00      2.3±0.03ms  ? ?/sec
array_has_any/no_match/100                   1.19     23.2±0.36ms  ? ?/sec    1.00     19.5±0.25ms  ? ?/sec
array_has_any/no_match/500                   1.00     93.4±0.86ms  ? ?/sec    1.41    131.6±1.76ms  ? ?/sec
array_has_any/scalar_no_match/10             1.00      2.2±0.02ms  ? ?/sec    1.01      2.2±0.02ms  ? ?/sec
array_has_any/scalar_no_match/100            1.00     20.9±0.33ms  ? ?/sec    1.01     21.1±0.17ms  ? ?/sec
array_has_any/scalar_no_match/500            1.00    138.1±1.94ms  ? ?/sec    1.03    142.3±1.72ms  ? ?/sec
array_has_any/scalar_some_match/10           1.00  1070.3±16.72µs  ? ?/sec    1.00  1069.4±13.97µs  ? ?/sec
array_has_any/scalar_some_match/100          1.00     11.1±0.11ms  ? ?/sec    1.03     11.4±0.18ms  ? ?/sec
array_has_any/scalar_some_match/500          1.00     85.8±1.13ms  ? ?/sec    1.02     87.5±1.15ms  ? ?/sec
array_has_any/some_match/10                  4.94      6.4±0.11ms  ? ?/sec    1.00  1298.0±19.57µs  ? ?/sec
array_has_any/some_match/100                 1.35     14.8±0.11ms  ? ?/sec    1.00     10.9±0.11ms  ? ?/sec
array_has_any/some_match/500                 1.00     51.2±0.61ms  ? ?/sec    1.74     89.1±1.53ms  ? ?/sec
array_has_any_scalar/i64_no_match/1          1.00    375.8±4.58µs  ? ?/sec    1.02    383.9±5.82µs  ? ?/sec
array_has_any_scalar/i64_no_match/10         1.07   486.9±59.32µs  ? ?/sec    1.00   453.7±12.16µs  ? ?/sec
array_has_any_scalar/i64_no_match/100        1.00   639.7±22.48µs  ? ?/sec    1.00   637.8±26.00µs  ? ?/sec
array_has_any_scalar/i64_no_match/1000       1.01   556.6±21.28µs  ? ?/sec    1.00   549.5±13.52µs  ? ?/sec
array_has_any_scalar/string_no_match/1       1.00    251.5±2.22µs  ? ?/sec    1.03    258.6±2.65µs  ? ?/sec
array_has_any_scalar/string_no_match/10      1.03   437.6±10.87µs  ? ?/sec    1.00    424.6±7.96µs  ? ?/sec
array_has_any_scalar/string_no_match/100     1.00   552.0±15.64µs  ? ?/sec    1.02   564.1±23.50µs  ? ?/sec
array_has_any_scalar/string_no_match/1000    1.00   465.9±16.90µs  ? ?/sec    1.01   469.5±10.49µs  ? ?/sec
array_has_any_strings/no_match/10            2.09      5.0±0.04ms  ? ?/sec    1.00      2.4±0.03ms  ? ?/sec
array_has_any_strings/no_match/100           1.22     21.5±0.13ms  ? ?/sec    1.00     17.7±0.24ms  ? ?/sec
array_has_any_strings/no_match/500           1.00    131.0±0.73ms  ? ?/sec    1.22    159.6±2.74ms  ? ?/sec
array_has_any_strings/scalar_no_match/10     1.00    876.5±5.30µs  ? ?/sec    1.06   924.9±16.90µs  ? ?/sec
array_has_any_strings/scalar_no_match/100    1.00      7.5±0.07ms  ? ?/sec    1.06      8.0±0.11ms  ? ?/sec
array_has_any_strings/scalar_no_match/500    1.00     86.4±0.53ms  ? ?/sec    1.02     88.5±1.01ms  ? ?/sec
array_has_any_strings/scalar_some_match/10   1.00    761.8±6.51µs  ? ?/sec    1.02    774.4±7.06µs  ? ?/sec
array_has_any_strings/scalar_some_match/100  1.00      5.1±0.14ms  ? ?/sec    1.07      5.5±0.34ms  ? ?/sec
array_has_any_strings/scalar_some_match/500  1.00     17.4±0.15ms  ? ?/sec    1.05     18.3±0.23ms  ? ?/sec
array_has_any_strings/some_match/10          2.43      4.3±0.03ms  ? ?/sec    1.00  1763.5±21.29µs  ? ?/sec
array_has_any_strings/some_match/100         1.30     14.1±0.15ms  ? ?/sec    1.00     10.8±0.19ms  ? ?/sec
array_has_any_strings/some_match/500         1.00     53.2±0.65ms  ? ?/sec    1.61     85.6±1.75ms  ? ?/sec
array_has_i64/found/10                       1.00    149.7±6.18µs  ? ?/sec    1.03    154.2±6.32µs  ? ?/sec
array_has_i64/found/100                      1.00  613.2±101.28µs  ? ?/sec    1.04   639.6±77.43µs  ? ?/sec
array_has_i64/found/500                      1.00      4.4±0.12ms  ? ?/sec    1.04      4.6±0.11ms  ? ?/sec
array_has_i64/not_found/10                   1.04     71.7±1.02µs  ? ?/sec    1.00     68.7±1.93µs  ? ?/sec
array_has_i64/not_found/100                  1.00   426.7±17.72µs  ? ?/sec    1.03   440.9±24.04µs  ? ?/sec
array_has_i64/not_found/500                  1.00      4.4±0.11ms  ? ?/sec    1.02      4.5±0.15ms  ? ?/sec
array_has_strings/found/10                   1.00    685.4±6.84µs  ? ?/sec    1.00    688.6±6.93µs  ? ?/sec
array_has_strings/found/100                  1.00      2.6±0.06ms  ? ?/sec    1.03      2.7±0.04ms  ? ?/sec
array_has_strings/found/500                  1.00     15.1±0.18ms  ? ?/sec    1.03     15.6±0.43ms  ? ?/sec
array_has_strings/not_found/10               1.01    152.6±0.82µs  ? ?/sec    1.00    150.7±2.05µs  ? ?/sec
array_has_strings/not_found/100              1.00      5.8±0.04ms  ? ?/sec    1.02      5.9±0.12ms  ? ?/sec
array_has_strings/not_found/500              1.03     16.7±0.26ms  ? ?/sec    1.00     16.2±0.19ms  ? ?/sec
```
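So we do indeed see regressions for large arrays: at 500 elements, the
non-scalar `array_has_all` and `array_has_any` cases are roughly 1.2x to 1.7x
slower on target, even though the same cases are roughly 2.1x to 5.7x faster
at 10 elements. I'm not entirely sure why that would be. I suppose that for
10k rows * 500 elements we end up pushing a lot more data out of L1/L2,
whereas the previous approach uses a smaller working set. I'm surprised that
the effect is that pronounced, though.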
Let me try doing the row conversion in smaller batches and see if that helps.
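
To make the batching idea concrete, here is a rough, std-only sketch of what "convert a chunk of rows, evaluate it, reuse the buffers" could look like. It is not the PR's code and uses no arrow-rs or DataFusion APIs: `ListColumn`, `encode`, `encode_rows`, and `array_has_all_chunked` are hypothetical names, the fixed 8-byte encoding only stands in for the real row conversion, and nulls are ignored.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a list column of i64 values:
/// row `i` owns `values[offsets[i]..offsets[i + 1]]`.
/// Assumes `offsets` has at least one entry and is non-decreasing.
struct ListColumn {
    offsets: Vec<usize>,
    values: Vec<i64>,
}

impl ListColumn {
    fn num_rows(&self) -> usize {
        self.offsets.len() - 1
    }
}

/// Stand-in for row conversion: an order-preserving byte encoding of an i64
/// (flip the sign bit, then store big-endian).
fn encode(v: i64) -> [u8; 8] {
    ((v as u64) ^ (1u64 << 63)).to_be_bytes()
}

/// Encode the elements of rows `start..end` into `out`, reusing its allocation.
fn encode_rows(col: &ListColumn, start: usize, end: usize, out: &mut Vec<[u8; 8]>) {
    out.clear();
    let (lo, hi) = (col.offsets[start], col.offsets[end]);
    out.extend(col.values[lo..hi].iter().map(|&v| encode(v)));
}

/// For each row, report whether every needle element occurs in the haystack.
/// Only `chunk_rows` rows are held in encoded form at any time, which bounds
/// the size of the scratch buffers.
fn array_has_all_chunked(
    haystack: &ListColumn,
    needle: &ListColumn,
    chunk_rows: usize,
) -> Vec<bool> {
    assert!(chunk_rows > 0);
    assert_eq!(haystack.num_rows(), needle.num_rows());
    let rows = haystack.num_rows();
    let mut result = Vec::with_capacity(rows);
    let (mut h_buf, mut n_buf) = (Vec::new(), Vec::new());
    let mut seen: HashSet<[u8; 8]> = HashSet::new();

    let mut start = 0;
    while start < rows {
        let end = (start + chunk_rows).min(rows);
        encode_rows(haystack, start, end, &mut h_buf);
        encode_rows(needle, start, end, &mut n_buf);
        // Offsets are absolute, so rebase them onto the chunk-local buffers.
        let (h_base, n_base) = (haystack.offsets[start], needle.offsets[start]);
        for row in start..end {
            let h = &h_buf[haystack.offsets[row] - h_base..haystack.offsets[row + 1] - h_base];
            let n = &n_buf[needle.offsets[row] - n_base..needle.offsets[row + 1] - n_base];
            seen.clear();
            seen.extend(h.iter().copied());
            result.push(n.iter().all(|x| seen.contains(x)));
        }
        start = end;
    }
    result
}
```

The result does not depend on `chunk_rows`; only the peak size of the scratch buffers does. Whether that actually recovers the 500-element numbers is something the benchmarks would have to show.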