alamb commented on PR #6617: URL: https://github.com/apache/arrow-datafusion/pull/6617#issuecomment-1587657307
> Hence I think, we should either support for user to create custom BuiltInWindowFunctionExpr for each new WindowUDF (I think this is hard and more cumbersome on the user side) or we should move all the parameters that effects the decision (such as supports_bounded_execution, uses_window_frame, include_rank) to the evaluator and/or aggregator side. By this way user will have full control on the custom implementation. thank you for the response @mustafasrepo -- this makes sense to me. I agree it would be better if the type system could help enforce functions that needed to be implemented Are you thinking something like this? ``` pub trait Accumulator: Send + Sync + Debug { // Required methods fn state(&self) -> Result<Vec<ScalarValue, Global>, DataFusionError>; fn update_batch( &mut self, values: &[Arc<dyn Array + 'static>] ) -> Result<(), DataFusionError>; fn merge_batch( &mut self, states: &[Arc<dyn Array + 'static>] ) -> Result<(), DataFusionError>; fn evaluate(&self) -> Result<ScalarValue, DataFusionError>; fn size(&self) -> usize; } /// A trait for accumualtors that also provide the ability to remove rows from a window incrementally pub trait RetractableAccumulator: Accumulator {. // <---- add this trait // Provided method fn retract_batch( &mut self, _values: &[Arc<dyn Array + 'static>] ) -> Result<(), DataFusionError> { ... } } ``` And then adding a separate function to AggregateUDF: ```rust enum AccumulatorImpl {. // <---- add this enum NonRetractable(Box<dyn Accumulator + 'static> /// returned if the accumulator supports retractable windows Retractable(Box<dyn RetractableAccumulator + 'static>), } pub struct AggregateUDF { pub name: ..., pub signature: ..., pub return_type: ... // change this signature ----+v pub accumulator: Arc<dyn Fn(&DataType) -> Result<AccumulatorImpl>, DataFusionError> + Sync + Send + 'static>, pub state_type: ... } ``` ? Also related is https://github.com/apache/arrow-datafusion/issues/6611 we found (which is one of the features of the `Accumulator` that is not yet exposed by AggregateUDF) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org