Steven D'Aprano wrote:
On Sun, 31 May 2009 12:08:33 +0100, Arnaud Delobelle wrote:

Anyway, it's good to know that quicksort is O(n^2) in the worst case -
and that this worst case can crop up very easily in some situations,
especially if not too much care has been taken implementing it.

The naive quicksort algorithm, where you recursively split the list into two halves, performs badly if the list is already sorted (or nearly so). It's easy to fix that: randomise the list before you sort! You can do that with a single pass of the list. That has the advantage of ensuring that no specific input can cause degraded performance, so an attacker can't DOS your application by passing sorted lists to be sorted.

Another strategy is to randomly exchange the pivot element when sorting. This avoids having to do a separate pass over the list, while still ensuring that no particular input can cause degraded performance.

A third strategy is to use a hybrid insertion/quicksort function. This is probably a good idea regardless, because for small lists, the overhead of quicksort generally makes insertion sort faster anyway. (CPython's sort used to be similar: it was a hybrid insertion/samplesort until, by memory, version 2.2, when it was changed for Timsort.

Which is hybrid binary insertion sort and mergesort, with b.i.s used for sublists less than 64 items.


--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to