On Mon, Apr 25, 2011 at 4:15 PM, Jake Mannix <[email protected]> wrote:
> On Mon, Apr 25, 2011 at 2:45 PM, Ted Dunning <[email protected]> > wrote: > > > The simplest implementation is O(n log(k)) (inserts every element, then > > drops at most one element). > > > > Caching the k-th best element and only inserting new items has the same > > worst case behavior in ordered cases. > > > > why on earth would the current situation at hand ever have a fully ordered > vector? > Pedantically speaking, that is. If we talking worst case bounds, then the worst case comes into the conversation though it be of little practical import. > This is a search problem, and the simple search solution works almost > completely linearly. > Indeed. I have never implemented anything else. Note that the worst case is still linear in n. It just has the full k log k multiplier. Normally the multiplier is much less but it seems (after a full 10 seconds of reflection) that you still have a total of O(n + log n log k) if you cache the best element. I say this because you get O(n) by simply examining each element and then the probability of the j-th element seen making it into the heap is k / j when j >= k. This means that the total cost goes like log k (sum_{j=1..n} k / j) = O(log n log k) BUT As Jake says, this is all really just O(n) and is so fast that none of this matters to the user.
