goldmedal commented on issue #11546: URL: https://github.com/apache/datafusion/issues/11546#issuecomment-2241056304
> My concern is that it might be slower because of additional `make_array()` computation.

I followed #11526 to create another implementation for `map_func`, called `map_one_func` temporarily. I did some benchmarks and found that there are no obvious differences between the performance of the `map` and `map_one` functions. I also pushed the updated commits in #11560. You could check it.

The benchmark results: (map is the original design and map_one is the new design)

```
Gnuplot not found, using plotters backend
map_1000                time:   [9.5816 ms 9.6505 ms 9.7221 ms]
                        change: [-3.3395% -1.7212% -0.4746%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

map_one_1000            time:   [9.7250 ms 9.7995 ms 9.8764 ms]
                        change: [-3.4189% -0.6339% +1.3546%] (p = 0.68 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
```

I ran the benchmark many times, and each time I got similar results. Referring to the result, I think we can just use the original design here. What do you think?

By the way, I found that the compile time to run the benchmark in the core is very long. It takes about 9 minutes. I'm not sure if that's normal. 😢

```
jax: ~/git/datafusion/datafusion/core (feature/11546-map-df-api) $ cargo bench --bench map_query_sql
   Compiling datafusion-functions v40.0.0 (/Users/jax/git/datafusion/datafusion/functions)
   Compiling datafusion-functions-array v40.0.0 (/Users/jax/git/datafusion/datafusion/functions-array)
   Compiling datafusion v40.0.0 (/Users/jax/git/datafusion/datafusion/core)
    Finished `bench` profile [optimized] target(s) in 8m 53s
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org