bharath-techie commented on issue #19386: URL: https://github.com/apache/datafusion/issues/19386#issuecomment-3696425376
Yes, agreed that compaction has some performance impact. We also tried one more approach: a jemalloc-based memory pool like the one InfluxDB uses for accounting, except that we used it directly as the memory pool for allocations. That also seemed to work well in initial testing and overcame the multiple-counting issue, though we didn't test it extensively.

I'm happy to try out a working solution on top of #19501 for topK as well once it's ready. I was already tracking the associated GitHub issues, and it looks promising :)

I still feel that force compacting in topK instead of throwing an error is a good fallback. At least the query will go through after it has already done a significant amount of work in group-by etc. Plus, with your change, we might not end up force compacting in many scenarios, since we'll be able to account for memory correctly.
