yjshen commented on issue #4973: URL: https://github.com/apache/arrow-datafusion/issues/4973#issuecomment-1454369204
I'm curious about the proposed new Aggregator API: does it require a hash table in each aggregator? I ask because I'm concerned about memory usage, especially for high-cardinality aggregations. If each aggregator keeps its own hash table, every group key is stored `n` times during execution (where `n` is the number of aggregators in the query). That could significantly increase memory consumption, which might not be acceptable. What do you think?
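
To make the concern concrete, here is a minimal sketch comparing the two layouts. It is not DataFusion's API: `PerAggregatorTables`, `SharedGroupTable`, and their methods are hypothetical names, and every aggregator is assumed to be a simple `i64` sum over string keys.

```rust
use std::collections::HashMap;

/// Layout A (the layout this comment worries about): each aggregator owns
/// its own hash table, so every distinct key is stored once per aggregator.
/// Hypothetical type, for illustration only.
pub struct PerAggregatorTables {
    tables: Vec<HashMap<String, i64>>, // tables[aggregator][key] -> running sum
}

impl PerAggregatorTables {
    pub fn new(num_aggregators: usize) -> Self {
        Self { tables: vec![HashMap::new(); num_aggregators] }
    }

    /// Feed one input row: `values[i]` goes to aggregator `i`.
    pub fn update(&mut self, key: &str, values: &[i64]) {
        for (table, v) in self.tables.iter_mut().zip(values) {
            *table.entry(key.to_string()).or_insert(0) += v;
        }
    }

    /// Total number of key copies held across all aggregators.
    pub fn stored_keys(&self) -> usize {
        self.tables.iter().map(|t| t.len()).sum()
    }
}

/// Layout B: one shared table maps each key to a dense group index, and
/// each aggregator keeps only a vector of states indexed by that group.
/// Hypothetical type, for illustration only.
pub struct SharedGroupTable {
    group_index: HashMap<String, usize>,
    states: Vec<Vec<i64>>, // states[aggregator][group] -> running sum
}

impl SharedGroupTable {
    pub fn new(num_aggregators: usize) -> Self {
        Self {
            group_index: HashMap::new(),
            states: vec![Vec::new(); num_aggregators],
        }
    }

    /// Feed one input row: `values[i]` goes to aggregator `i`.
    pub fn update(&mut self, key: &str, values: &[i64]) {
        // A new key gets the next unused group index.
        let next = self.group_index.len();
        let idx = *self.group_index.entry(key.to_string()).or_insert(next);
        for (state, v) in self.states.iter_mut().zip(values) {
            // Grow the state vector if this group has not been seen yet.
            if state.len() <= idx {
                state.resize(idx + 1, 0);
            }
            state[idx] += v;
        }
    }

    /// Each key is stored exactly once, regardless of aggregator count.
    pub fn stored_keys(&self) -> usize {
        self.group_index.len()
    }

    /// Current sum for `key` in the given aggregator, if both exist.
    pub fn sum(&self, aggregator: usize, key: &str) -> Option<i64> {
        let idx = *self.group_index.get(key)?;
        self.states.get(aggregator)?.get(idx).copied()
    }
}
```

With `g` distinct groups and `n` aggregators, layout A holds `n * g` key copies, while layout B holds `g` keys plus `n * g` fixed-size states. For wide or variable-length keys at high cardinality, that difference is the memory increase described above.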
