Rachelint commented on issue #12335: URL: https://github.com/apache/datafusion/issues/12335#issuecomment-2358310830
> > I guess it may be a bit confusing? Because an aggr operator wraps another aggr operator.
>
> Yeah, I guess it's a bit of a double-edged sword in that it just moves the complexity elsewhere. But I think it could work well together with your suggestion of
>
> > I think at least we should separate partial and the terminals (Final, Single...).
>
> Because without doing something like this the partial would still need all the logic for the terminal. Or at least my understanding is that the stream merging is basically just performing a terminal aggregation.

Maybe we can refactor the code thoroughly after https://github.com/apache/datafusion/pull/11943 (which should clearly improve aggr performance) is merged.

I think it would be more reasonable and maintainable to split the `partial` and the other `terminal`s into two different structs, placed in different files, because some paths clearly belong only to `partial aggr`:
- Skipping
- Emitting early due to memory usage
...

And some paths belong only to the `terminal`s:
- Spilling due to memory usage
...
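
As a rough illustration of the split, here is a minimal sketch. It is not actual DataFusion code: every struct, enum, field, and method name below is hypothetical, and the thresholds are placeholders. The only point is that each struct exposes just the paths that belong to it.

```rust
// Hypothetical sketch only: none of these types exist in DataFusion.

/// Decisions a partial aggregation can take after a batch.
#[derive(Debug, PartialEq)]
enum PartialAction {
    Continue,
    /// Stop aggregating and pass rows through (the "skipping" path).
    SkipAggregation,
    /// Flush the current groups downstream because memory is tight.
    EmitEarly,
}

/// Decisions a terminal (Final, Single, ...) aggregation can take.
#[derive(Debug, PartialEq)]
enum TerminalAction {
    Continue,
    /// Write the current groups to disk because memory is tight.
    Spill,
}

/// Would live in its own file; owns skipping and early emit only.
struct PartialAggStream {
    memory_limit: usize,
    /// Assumed heuristic: skip when groups / input rows exceeds this ratio.
    max_group_ratio: f64,
}

impl PartialAggStream {
    fn on_batch(&self, input_rows: usize, group_count: usize, memory_used: usize) -> PartialAction {
        let ratio = if input_rows == 0 {
            0.0
        } else {
            group_count as f64 / input_rows as f64
        };
        if ratio > self.max_group_ratio {
            PartialAction::SkipAggregation
        } else if memory_used > self.memory_limit {
            PartialAction::EmitEarly
        } else {
            PartialAction::Continue
        }
    }
}

/// Would live in a different file; owns spilling only.
struct TerminalAggStream {
    memory_limit: usize,
}

impl TerminalAggStream {
    fn on_batch(&self, memory_used: usize) -> TerminalAction {
        if memory_used > self.memory_limit {
            TerminalAction::Spill
        } else {
            TerminalAction::Continue
        }
    }
}
```

With separate action types, the compiler itself rules out a partial stream spilling or a terminal stream skipping, so neither struct has to carry the other's branches.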
