goldmedal commented on issue #11546:
URL: https://github.com/apache/datafusion/issues/11546#issuecomment-2241056304

   > My concern is that it might be slower because of additional `make_array()` computation.
   
   Following #11526, I created another implementation of `map_func`, temporarily named `map_one_func`. I ran some benchmarks and found no obvious performance difference between the `map` and `map_one` functions. I also pushed the updated commits to #11560 if you want to check them.
   The benchmark results (`map` is the original design and `map_one` is the new design):
   ```
   Gnuplot not found, using plotters backend
   map_1000                time:   [9.5816 ms 9.6505 ms 9.7221 ms]
                           change: [-3.3395% -1.7212% -0.4746%] (p = 0.02 < 0.05)
                           Change within noise threshold.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   map_one_1000            time:   [9.7250 ms 9.7995 ms 9.8764 ms]
                           change: [-3.4189% -0.6339% +1.3546%] (p = 0.68 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   ```
   I ran the benchmark several times and got similar results each time. Based on these results, I think we can just keep the original design here. What do you think?
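   
   For reference, the size of the gap between the two `time:` lines can be checked with a small sketch like the one below. It is only an illustration: the helper names `relative_diff_percent` and `intervals_overlap` are hypothetical and are not part of DataFusion or Criterion. Note that Criterion's `change:` lines compare each benchmark with its own previous run, so the comparison between `map_1000` and `map_one_1000` has to be made from the `time:` lines.

```rust
/// Relative difference of `new` against `base`, in percent.
/// Positive means `new` is slower (larger time) than `base`.
fn relative_diff_percent(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}

/// Returns true if two closed intervals (low, high) share at least one point.
fn intervals_overlap(a: (f64, f64), b: (f64, f64)) -> bool {
    a.0 <= b.1 && b.0 <= a.1
}
```

   Calling `relative_diff_percent(9.6505, 9.7995)` with the two middle estimates above gives a gap of roughly 1.5%, and `intervals_overlap((9.5816, 9.7221), (9.7250, 9.8764))` shows whether the two reported intervals touch. A gap this small is easy to lose in run-to-run noise.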
   
   By the way, I found that compiling the benchmark in the core crate takes a very long time, about 9 minutes. I'm not sure if that's normal. 😢 
   ```
    jax: ~/git/datafusion/datafusion/core (feature/11546-map-df-api) $ cargo bench --bench map_query_sql
      Compiling datafusion-functions v40.0.0 (/Users/jax/git/datafusion/datafusion/functions)
      Compiling datafusion-functions-array v40.0.0 (/Users/jax/git/datafusion/datafusion/functions-array)
      Compiling datafusion v40.0.0 (/Users/jax/git/datafusion/datafusion/core)
       Finished `bench` profile [optimized] target(s) in 8m 53s
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

