JasonLi-cn commented on PR #11758: URL: https://github.com/apache/datafusion/pull/11758#issuecomment-2283566894
> @JasonLi-cn I think the `GroupValues` impls maybe should not care about the `batch size`? We could just do the `split and merge` work in `GroupedHashAggregateStream::poll` if, unfortunately, the `batch size != block size` (usually they will be equal).
>
> Maybe we should implement special block-based `GroupValues` impls like the following?
>
> * We pass the `block size` when initializing it
> * It manages the inner values block by block
> * It returns all blocks with the internal `block size`
>
> We can always make the `block size == batch size`, so we can totally avoid any split operations.
>
> I am trying this out in #11943, and have made some related code changes.

OK. How do we determine the value of block size?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
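For readers following the thread, the three bullet points in the quoted proposal can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in #11943: `BlockedGroupValues`, `push`, and `emit_blocks` are hypothetical names, group values are plain `u64`s rather than Arrow arrays, and hashing/deduplication of groups is left out entirely.

```rust
/// Hypothetical, simplified stand-in for a block-based `GroupValues`.
/// Assumption: values are plain `u64`s; the real trait works on Arrow arrays.
pub struct BlockedGroupValues {
    /// Fixed capacity of every block, passed in at construction time.
    block_size: usize,
    /// Values stored block by block; only the last block may be partly filled.
    blocks: Vec<Vec<u64>>,
}

impl BlockedGroupValues {
    /// The block size is supplied by the caller when initializing.
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block size must be non-zero");
        Self { block_size, blocks: Vec::new() }
    }

    /// Appends a value and returns its (block index, offset within block).
    pub fn push(&mut self, value: u64) -> (usize, usize) {
        // Start a new block when there is none yet or the last one is full.
        let need_new = self
            .blocks
            .last()
            .map_or(true, |b| b.len() == self.block_size);
        if need_new {
            self.blocks.push(Vec::with_capacity(self.block_size));
        }
        let block_idx = self.blocks.len() - 1;
        let block = &mut self.blocks[block_idx];
        block.push(value);
        (block_idx, block.len() - 1)
    }

    /// Total number of stored values across all blocks.
    pub fn len(&self) -> usize {
        self.blocks.iter().map(Vec::len).sum()
    }

    /// Returns all blocks as they are stored, leaving the container empty.
    /// Each block holds at most `block_size` values, so no splitting is needed.
    pub fn emit_blocks(&mut self) -> Vec<Vec<u64>> {
        std::mem::take(&mut self.blocks)
    }
}
```

In this sketch the caller decides the block size; constructing it with the batch size (as the quoted comment suggests) would mean every emitted block except possibly the last already has the size of an output batch.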
