Virag Dhulia wrote:
> What could be the most efficient algo for finding the Nth highst element in
> an M-Element list where N<M always??
There is an algorithm called the "select" algorithm, that executes in
O(M) time. Its a little remarkable. Roughly speaking it works as
follows:
0. If the array is less than 5 elements, just do the thing brute force
by sorting the array, then picking out the Mth element, and return.
Otherwise perform the steps 1 through 5 below.
1. Split the array into four sequential pieces of length ceil(M/5) and
a fifth with the remaining elements. Call these arrays a[0], a[1], ...
a[4]. So the first element of the original array is a[0][0].
2. For each index i, sort the elements { a[0][i], a[1][i], a[2][i],
a[3][i], a[4][i] } in place. Actually you only need to sort them
sufficient such that a[2][i] is definately the median element. If
a[4][i] does not exist (a[4] might be a smaller array) then sort {
a[0][i], a[1][i], a[2][i], a[3][i] }. By "in-place", I mean, the
results should be stored back into the positions where they came from.
3. Recursively call this whole algorithm on that a[2] array to find its
exact median element (so, the recursed N will be ceil(ceil(M/5)/2) ).
Suppose this element is a[2][k].
4. Now perform a partitioning operation over the whole original with
a[2][k] being the pivot value, and take note of the new pivot position
p. (If you don't know what partitioning is, look up the Quicksort
algorithm.)
5. If N < p, then recursively call this algorithm on the first p
elements. If N > p then recursively call this algorithm on the
elements after p, with N set to N-p If N = p, then simply return with
the pivot value.
Ok, so the question is, why does this work? Essentially step 3 we have
found an element in the array that is definately above the first 1/5 of
the array, and below the last 1/5 (rounded down) of the array. That is
to say, the element is in the middle 3/5 of the array (or just 1
position beyond). That means that after each recursive step at least
1/5 of the array being considered can be discarded. So by induction,
this means that the array sizes that are being considered will
eventually be no more than 5 elements in which case they are dealt with
in step 0.
To check the assumption that this runs in O(M) time, notice that the
steps leading up to the recursion in step 5, are no more than O(M). So
taking the recursion into account we see that this executes in:
O(M) + 4/5* (O(M) + 4/5* (O(M) + 4/5 * (O(M) + ...)))
time. Which is 5*O(M), which is still O(M).
--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Algorithm Geeks" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~----------~----~----~----~------~----~------~--~---