assuming of course that you don't have enough space to put all
distinct items into the heap?

On Mon, Apr 25, 2011 at 2:03 PM, Dmitriy Lyubimov <[email protected]> wrote:
> if you "forget" some of the items you saw along with the counters,
> how'd you add them back to the heap?
>
>
> On Mon, Apr 25, 2011 at 1:08 PM, Jake Mannix <[email protected]> wrote:
>> On Mon, Apr 25, 2011 at 1:03 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>
>>> There's nothing in Mahout to do this (afaik).
>>>
>>> There's a statistical method for this doing that in linear time, which
>>> is very easy to implement (and btw would make an excellent addition to
>>> Mahout), google up 'countsketch'.
>>>
>>
>> Taking top k items in a list of length n, when n >> k, is already O(n), if
>> you check the bottom of heap first before inserting.  Not sure if
>> countsketch
>> is necessary for the usual use case.
>>
>>  -jake
>>
>

Reply via email to