Re: Top items in a vector

Ted Dunning Mon, 25 Apr 2011 16:34:38 -0700

On Mon, Apr 25, 2011 at 4:15 PM, Jake Mannix <[email protected]> wrote:

> On Mon, Apr 25, 2011 at 2:45 PM, Ted Dunning <[email protected]>
> wrote:
>
> > The simplest implementation is O(n log(k)) (inserts every element, then
> > drops at most one element).
> >
> > Caching the k-th best element and only inserting new items has the same
> > worst case behavior in ordered cases.
> >
>
> why on earth would the current situation at hand ever have a fully ordered
> vector?
>

Pedantically speaking, that is.  If we talking worst case bounds, then the
worst case comes into the conversation though it be of little practical
import.

> This is a search problem, and the simple search solution works almost
> completely linearly.
>

Indeed.  I have never implemented anything else.  Note that the worst case
is still linear in n.  It just has the full k log k multiplier.  Normally
the multiplier is much less but it seems (after a full 10 seconds of
reflection) that you still have a total of O(n + log n log k) if you cache
the best element.  I say this because you get O(n) by simply examining each
element and then the probability of the j-th element seen making it into the
heap is k / j when j >= k.  This means that the total cost goes like log k
(sum_{j=1..n} k / j) = O(log n log k)

BUT

As Jake says, this is all really just O(n) and is so fast that none of this
matters to the user.

Re: Top items in a vector

Reply via email to