Christopher Seiwald writes:
> The FreeBSD qsort implementation has a rather nasty degenerate case.
> If you have data that partitions exactly about the median pivot, yet
> which is unsorted in either partition, you get get treated to an insertion
> sort.  Example:
> 
>       1 2 3 4 5 6 7 8 9 10 19 18 17 16 14 14 13 12 11
> 
> qsort picks 10 as the median, does no swaps on its first pass, and so
> decides to switch to an insertion sort.   The idea is that if no swaps
> occur, the data is likely to be nearly already sorted and a good candidate
> for insertion sort.  This turns out to be a (very) bad idea if you have
> some unsorted data buried in the middle of a (very) large dataset.
> 
> The implementation claims to come from Bentley and McIlroy's paper,
> "Engineering a Sort Function," and this is largely true, but the insertion
> sort optimization(?) isn't in that paper.  The GNU qsort.c only does an
> insertion sort if it is below a certain threshhold in size, and so never
> attempts such a sort on the whole dataset.  (The GNU qsort does a number
> of other things pooh-poohed in the Bentley paper.)
> 
> It's a pretty straightforward change to bypass the insertion sort for
> large subsets of the data.  If no one has a strong love for qsort, I'll
> educate myself on how to make and contribute this change.

Sounds good to me.. let's fix it.

-Archie

___________________________________________________________________________
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message

Reply via email to