mustafasrepo commented on PR #6617: URL: https://github.com/apache/arrow-datafusion/pull/6617#issuecomment-1587430709
Thanks @alamb for this proposal. I have examined this PR. Overall it is very well written, and showcases the usage of the new API. However, I have some concerns about flexibility of `WindowUDF` and `AggregateUDF`. Specifically, user can implement multiple different `WindowUDF`s. However, all implementations are initialized from `WindowUDFExpr` (There is a similar pattern in `AggregateUDF`. All `AggregateUDF`s are initialized from `AggregateFunctionExpr`). `BuiltInWindowFunctionExpr` (`WindowUDFExpr` implements this trait) contains methods such as `supports_bounded_execution`, `uses_window_frame`, `include_rank`. These methods effect which methods will be used during evaluator. For instance if user wants to add a evaluator that uses window frame boundaries, user should be able to set flag `uses_window_frame` for the evaluator (Hence when `WindowUDFExpr` is implemented, these fields are fixed, and users have no control over them). Hence I think, we should either support for user to create custom `BuiltInWindowFunctionExpr` for each new `WindowUDF` (I think this is hard and more cumbersome on the user side) or we should move all the parameters that effects the decision (such as `supports_bounded_execution`, `uses_window_frame`, `include_rank`) to the `evaluator and/or aggregator` side. By this way user will have full control on the custom implementation. I am working on a new design to solve this problem. @metesynnada and I will discuss it tomorrow, and let you know about the final result. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
