felipecrv commented on issue #37919: URL: https://github.com/apache/arrow/issues/37919#issuecomment-1776318524
> Hmm I don't think any of the current aggregate kernels explicitly destruct the kernel state in `Finalize`. Can you give an example of such cases?

@js8544 Finalizing the `counts_` `BufferBuilder` here: https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/kernels/hash_aggregate.cc#L268

Finalizing a `BufferBuilder` means buffer ownership is passed to the caller. Even if the `shared_ptr` remained shared between caller and builder, the builder wouldn't be allowed to mutate the buffer, because the user of the builder doesn't expect its buffer to be mutated by the builder anymore.

----

@icexelloss the aggregator kernels leverage the fact that their state is mutable, but with only one mutable reference to it (unique mutability), so when `Finish` is called they can pass ownership of internal data to the caller (avoiding the need for expensive copies and re-allocations). So asking for a non-destructive `Finalize` as a general interface that every kernel would have to implement is quite disruptive to the model.

Can you solve your issue by adding a `std::unique_ptr<KernelState> Clone()` function to the kernels that lend themselves well to this optimization (i.e. kernels that have a compact state)?
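To make the `Clone()` suggestion concrete, here is a minimal standalone sketch. It does not use Arrow's actual classes: `ToyKernelState`, `ToyCountState`, and `Peek` are hypothetical names invented for illustration, and a `std::vector` stands in for the `BufferBuilder`. It only shows the shape of the idea, a destructive `Finish` that moves the buffer out, plus an opt-in `Clone` for states small enough to copy cheaply.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a kernel state; NOT Arrow's real KernelState.
struct ToyKernelState {
  virtual ~ToyKernelState() = default;
  // Folds one group id into the state.
  virtual void Consume(std::size_t group) = 0;
  // Destructive: hands the internal buffer to the caller without copying.
  virtual std::vector<int64_t> Finish() = 0;
  // Opt-in deep copy; only sensible for kernels with a compact state.
  virtual std::unique_ptr<ToyKernelState> Clone() const = 0;
};

// Toy per-group counter; the vector plays the role of the counts_ builder.
class ToyCountState : public ToyKernelState {
 public:
  void Consume(std::size_t group) override {
    if (group >= counts_.size()) counts_.resize(group + 1, 0);
    ++counts_[group];
  }

  std::vector<int64_t> Finish() override {
    // Move the buffer out and leave this state empty, so the state can
    // never mutate data the caller now owns.
    return std::exchange(counts_, {});
  }

  std::unique_ptr<ToyKernelState> Clone() const override {
    // Copies the whole buffer: cheap only when the state is small.
    return std::make_unique<ToyCountState>(*this);
  }

 private:
  std::vector<int64_t> counts_;
};

// Reads an intermediate result by finishing a clone, leaving `state` intact.
std::vector<int64_t> Peek(const ToyKernelState& state) {
  return state.Clone()->Finish();
}
```

With this shape, a caller that needs intermediate results calls `Peek(state)` and keeps consuming into `state` afterwards, while kernels with large states can simply not offer a cheap `Clone` and keep the move-out `Finish` unchanged.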
