On 13 April 2012 18:33, Robert Haas <robertmh...@gmail.com> wrote: > We do use insertion sort for partitions of size < 7. I assume you are > referring to something else.
I stand corrected. My confusion came from the removal of a later *switch* to insertion sort in a3f0b3d68f9a5357a3f72b40a45bcc714a9e0649, which also added the pre-sorted check where you'd expect to see the insertion switch. Of course, the n < 7 insertion sort thing is right beside the added check. So another, redundant copy of insertion sort was removed, and not the one that almost every real-world quicksort implementation has. > Heap-sorting could also benefit from some optimization in this area. > Right now, if you heap-sort data that's already in order, it gets > progressively slower as you crank up work_mem, because the heap gets > deeper and so extraction and siftup get more expensive, for no real > benefit. We could do something like check, every time we use up our > available memory, whether the heap is already entirely in order. If > so, we could dump all but one tuple to the current run and the start > reading tuples again. Or maybe dump some percentage of the array, > though that might require a bit more bookkeeping. Or maybe there's a > smarter algorithm that would also cater to mostly-sorted data. Well, timsort is specifically designed to take advantage of pre-sorted data. It does appear to have a lot of traction, as wikipedia points out: "Timsort has been Python's standard sorting algorithm since version 2.3. It is now also used to sort arrays in Java SE 7,[2] and on the Android platform.[3] " Somebody has produced a BSD-licensed C implementation that draws heavily on the Python/Java one here: http://code.google.com/p/timsort/source/browse/trunk/timSort.c?spec=svn17&r=17 It looks like it has exactly the same interface as a c stdlib qsort. So it'd be fairly easy to produce a timsort_arg() based on this, if anyone cares to. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers