leerho commented on issue #7187: Improve topN algorithm
URL: https://github.com/apache/incubator-druid/issues/7187#issuecomment-470408649

Thanks for your clarifications. I was led to believe that the TopN process was the one I described. A PriorityQueue is a min-heap, so it sounds like it is already doing something close to what I was suggesting.

As for the HLL-Map Sketch, one of its requirements was that it keep every unique key it has seen, along with that key's latest count estimate, so that arbitrary queries on any key are possible. This consumes a huge amount of space, and there is no such requirement here. What I imagine is a FIS-HLL combination that would be considerably smaller. Its size would be O(k).
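
To illustrate the min-heap point above, here is a minimal sketch of bounded top-k selection. It is not Druid's code: the function name `top_k` and the sample data are hypothetical, and Python's `heapq` stands in for Java's `PriorityQueue`. The heap never holds more than k entries, which is where the O(k) space bound comes from.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k(pairs: Iterable[Tuple[str, int]], k: int) -> List[Tuple[str, int]]:
    """Return the k (key, count) pairs with the largest counts, largest first."""
    if k <= 0:
        return []
    # Min-heap of (count, key); heap[0] is the smallest count currently kept.
    heap: List[Tuple[int, str]] = []
    for key, count in pairs:
        if len(heap) < k:
            # Still filling the heap: keep everything.
            heapq.heappush(heap, (count, key))
        elif count > heap[0][0]:
            # New count beats the current minimum: evict the minimum.
            heapq.heapreplace(heap, (count, key))
    # Sort the k survivors in descending order of count.
    return [(key, count) for count, key in sorted(heap, reverse=True)]


if __name__ == "__main__":
    sample = [("a", 5), ("b", 2), ("c", 9), ("d", 7), ("e", 1)]
    print(top_k(sample, 3))
```

This assumes each key appears once with its final count. Handling a stream in which the same key recurs needs a counting structure in front of the heap, which is the role the proposed FIS-HLL combination would play.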
