On Mon, Apr 25, 2011 at 2:33 PM, Ted Dunning <[email protected]> wrote:

> There are two meanings of top floating around in this thread.
>
> I think that the OP was asking for "find the indexes of the largest
> elements
> of a vector".  That is definitely what Sean answered.
>

That's definitely what I was talking about as well.

  -jake


>
> I think that Dmitriy mean "find the most common elements of a collection".
>  That is definitely harder and needs slight cleverness.
>
> On Mon, Apr 25, 2011 at 2:03 PM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > if you "forget" some of the items you saw along with the counters,
> > how'd you add them back to the heap?
> >
> >
> > On Mon, Apr 25, 2011 at 1:08 PM, Jake Mannix <[email protected]>
> > wrote:
> > > On Mon, Apr 25, 2011 at 1:03 PM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> > >
> > >> There's nothing in Mahout to do this (afaik).
> > >>
> > >> There's a statistical method for this doing that in linear time, which
> > >> is very easy to implement (and btw would make an excellent addition to
> > >> Mahout), google up 'countsketch'.
> > >>
> > >
> > > Taking top k items in a list of length n, when n >> k, is already O(n),
> > if
> > > you check the bottom of heap first before inserting.  Not sure if
> > > countsketch
> > > is necessary for the usual use case.
> > >
> > >  -jake
> > >
> >
>

Reply via email to