On Mon, Apr 25, 2011 at 1:03 PM, Dmitriy Lyubimov <[email protected]> wrote:
> There's nothing in Mahout to do this (afaik). > > There's a statistical method for this doing that in linear time, which > is very easy to implement (and btw would make an excellent addition to > Mahout), google up 'countsketch'. > Taking top k items in a list of length n, when n >> k, is already O(n), if you check the bottom of heap first before inserting. Not sure if countsketch is necessary for the usual use case. -jake
