jayzhan211 commented on PR #11361: URL: https://github.com/apache/datafusion/pull/11361#issuecomment-2221710127
> > Thanks @jayzhan211. Sounds great. Curiously, what's the benefit of moving to `ExprPlanner`? I'm not sure, but I think it just moves the cost of aggregating column values to `ExprPlanner`, right? Or is it possible to get better performance?
>
> I agree with @jayzhan211 it seems less than ideal to have two functions rather than just one. However I agree with @goldmedal that it isn't clear that this is all that much better.
>
> It looks to me like duckdb has several functions to create `map`s as well. https://duckdb.org/docs/sql/data_types/map
>
> They support this kind of literal syntax
>
> ```sql
> SELECT MAP {'key1': 10, 'key2': 20, 'key3': 30};
> ```
>
> As well as the parallel lists implementation
>
> ```sql
> SELECT MAP(['key1', 'key2', 'key3'], [10, 20, 30]);
> ```
>
> They also have a function similar to `make_map` like this:
>
> ```sql
> SELECT map_from_entries([('key1', 10), ('key2', 20), ('key3', 30)]);
> ```

I think we can expose several functions on the frontend but keep one single function in the functions crate. With `ExprPlanner`, we can arrange the args for that single `MapFunc`.

Also, the biggest reason `make_map` is slow is that the way we rearrange the args involves a lot of cloning. Rearranging the args in `ExprPlanner` is relatively easier, because that is where we collect them.
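To illustrate the point about cloning, here is a minimal sketch of rearranging interleaved `make_map`-style args (`k1, v1, k2, v2, ...`) into the parallel key and value lists a single map function could take. The `Expr` enum and `split_map_args` are hypothetical stand-ins for illustration only, not DataFusion's actual types or the `ExprPlanner` API. It assumes the planner owns the collected args, so each expression can be moved rather than cloned.

```rust
/// Hypothetical stand-in for a planner expression; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(String),
    Column(String),
}

/// Split interleaved `[k1, v1, k2, v2, ...]` args into `(keys, values)`.
///
/// The vector is taken by value, so every expression is moved into its
/// destination list and nothing is cloned.
fn split_map_args(args: Vec<Expr>) -> Result<(Vec<Expr>, Vec<Expr>), String> {
    if args.len() % 2 != 0 {
        return Err(format!(
            "expected an even number of arguments, got {}",
            args.len()
        ));
    }

    let pairs = args.len() / 2;
    let mut keys = Vec::with_capacity(pairs);
    let mut values = Vec::with_capacity(pairs);

    // Consume the args two at a time: first the key, then its value.
    let mut iter = args.into_iter();
    while let (Some(key), Some(value)) = (iter.next(), iter.next()) {
        keys.push(key);
        values.push(value);
    }

    Ok((keys, values))
}
```

Calling `split_map_args` with the args for `make_map('key1', a, 'key2', b)` would yield the keys `['key1', 'key2']` and the values `[a, b]`, the same shape as the parallel-lists form quoted above. An odd number of args returns an error.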