goldmedal commented on issue #11546:
URL: https://github.com/apache/datafusion/issues/11546#issuecomment-2241195188
Ok, I think it's getting worse.
```
Gnuplot not found, using plotters backend
map_1000 time: [9.1289 ms 9.1897 ms 9.2537 ms]
change: [-0.7915% +0.1565% +1.1967%] (p = 0.75 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Benchmarking map_one_1000: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, or reduce sample count to 70.
map_one_1000 time: [62.787 ms 63.239 ms 63.732 ms]
change: [-2.6609% -1.0620% +0.3899%] (p = 0.17 > 0.05)
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
```
I also tried removing
```
let mut args = keys;
args.extend(values);
```
Instead, I passed a single args vector straight to `map_from_array`, but it's
still slower. I pushed this version to a different branch:
https://github.com/goldmedal/datafusion/blob/feature/11546-map-df-api-v4/datafusion/functions/src/core/map.rs
If you're interested, you can check it out.
Actually, I found that `make_scalar_function` uses
`ColumnarValue::values_to_arrays`, so I need to use `make_array_inner` to
aggregate the primitive arrays.
In conclusion, the original design (using `make_array`) is the fastest.
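
For anyone unfamiliar with the `values_to_arrays` step mentioned above, here is a minimal sketch of the idea using only the standard library. `ToyColumnarValue` and `toy_values_to_arrays` are hypothetical stand-ins, not the real DataFusion types or signatures. The sketch assumes scalars are expanded to the length of the array arguments (or to length 1 when every argument is a scalar) and that mismatched array lengths are an error.

```rust
// Toy stand-in for DataFusion's `ColumnarValue`; NOT the real type.
#[derive(Debug, Clone, PartialEq)]
enum ToyColumnarValue {
    Array(Vec<i64>),
    Scalar(i64),
}

/// Hypothetical analogue of `ColumnarValue::values_to_arrays`: every argument
/// is returned as a full array, with scalars repeated to the common length.
fn toy_values_to_arrays(args: &[ToyColumnarValue]) -> Result<Vec<Vec<i64>>, String> {
    // Find the common length of the array arguments, rejecting mismatches.
    let mut len: Option<usize> = None;
    for arg in args {
        if let ToyColumnarValue::Array(a) = arg {
            match len {
                Some(l) if l != a.len() => {
                    return Err(format!("array length mismatch: {} vs {}", l, a.len()));
                }
                _ => len = Some(a.len()),
            }
        }
    }
    // With no array arguments at all, each scalar becomes a length-1 array.
    let len = len.unwrap_or(1);

    // Materialize every argument; scalars are repeated `len` times.
    Ok(args
        .iter()
        .map(|arg| match arg {
            ToyColumnarValue::Array(a) => a.clone(),
            ToyColumnarValue::Scalar(s) => vec![*s; len],
        })
        .collect())
}
```

In this model, a call with only scalar arguments produces one single-element array per argument. This is the situation where the separate primitive arrays have to be combined again before the map can be built.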