Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run "external sort"

Heikki Linnakangas Fri, 31 Jul 2015 01:00:12 -0700

On 07/31/2015 02:01 AM, Peter Geoghegan wrote:

What prevents the tuple at the top of the in-memory heap at the point
of tuplesort_performsort() (say, one of the ones added to the heap as
our glut of memory was *partially* consumed) being less than the
last/greatest tuple on tape? If the answer is "nothing", a merge step
is clearly required.

Oh, ok, I was confused on how the heap works. You could still abstract
this as "in-memory tails" of the tapes, but it's more complicated than I
thought at first:

When it's time to drain the heap, in performsort, divide the array into
two arrays, based on the run number of each tuple, and then quicksort
the arrays separately. The first array becomes the in-memory tail of the
current tape, and the second array becomes the in-memory tail of the
next tape.

You wouldn't want to actually allocate two arrays and copy SortTuples
around, but keep using the single large array, just logically divided
into two. So the bookkeeping isn't trivial, but seems doable.
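
To make the "single array, logically divided into two" step concrete,
here is a rough standalone sketch. It is not tuplesort.c code: DemoTuple,
cmp_key and split_and_sort are made-up names, a plain int key stands in
for a real SortTuple, and the run field plays the role of tupindex. It
partitions the array in place by run number and then quicksorts each
half separately.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SortTuple: a sort key plus its run number. */
typedef struct
{
    int key;
    int run;    /* plays the role of tupindex while building runs */
} DemoTuple;

/* qsort comparator on the key only. */
static int
cmp_key(const void *a, const void *b)
{
    int ka = ((const DemoTuple *) a)->key;
    int kb = ((const DemoTuple *) b)->key;

    return (ka > kb) - (ka < kb);
}

/*
 * Partition tuples[0..n) in place so that all tuples of currentRun come
 * first, then sort each of the two logical sub-arrays on its own.
 * Returns the number of tuples belonging to currentRun, i.e. the index
 * where the "next tape" tail starts.
 */
static size_t
split_and_sort(DemoTuple *tuples, size_t n, int currentRun)
{
    size_t lo = 0;
    size_t hi = n;

    assert(tuples != NULL || n == 0);

    while (lo < hi)
    {
        if (tuples[lo].run == currentRun)
            lo++;
        else
        {
            /* swap the misplaced tuple to the back of the array */
            DemoTuple tmp = tuples[lo];

            hi--;
            tuples[lo] = tuples[hi];
            tuples[hi] = tmp;
        }
    }

    /* in-memory tail of the current tape */
    qsort(tuples, lo, sizeof(DemoTuple), cmp_key);
    /* in-memory tail of the next tape */
    qsort(tuples + lo, n - lo, sizeof(DemoTuple), cmp_key);

    return lo;
}
```

The only extra bookkeeping in this sketch is the returned boundary
index; no second array is allocated and no tuple is copied more than
once during the partition pass.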

Hmm, I can see another possible optimization here, in the way the heap
is managed in TSS_BUILDRUNS state. Instead of keeping a single heap,
with tupindex as the leading key, it would be more cache efficient to
keep one heap for the currentRun, and an unsorted array of tuples
belonging to currentRun + 1. When the heap becomes empty, and currentRun
is incremented, quicksort the unsorted array to become the new heap.

That's a completely separate idea from your patch, although if you did
it that way, you wouldn't need the extra step to divide the large array
into two, as you'd maintain that division all the time.
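
A rough sketch of the one-heap-plus-unsorted-array idea, again with
made-up names (RunState, run_state_init, replace_top) and plain int keys
rather than anything from tuplesort.c. It assumes a simplified
replacement-selection loop where each call pops the smallest tuple of
the current run and takes in exactly one new tuple. The heap lives at
the front of the array and the next run's tuples at the back, so the
slot freed by the pop is always adjacent to both regions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state: one array, heap at the front, next run at the back. */
typedef struct
{
    int    *slots;
    size_t  cap;
    size_t  heapCount;  /* tuples of currentRun, kept as a binary min-heap */
    size_t  nextCount;  /* tuples of currentRun + 1, unsorted */
    int     currentRun;
} RunState;

static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

static void
sift_down(int *h, size_t n, size_t i)
{
    for (;;)
    {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        int    tmp;

        if (l < n && h[l] < h[m]) m = l;
        if (r < n && h[r] < h[m]) m = r;
        if (m == i) break;
        tmp = h[i]; h[i] = h[m]; h[m] = tmp;
        i = m;
    }
}

static void
sift_up(int *h, size_t i)
{
    while (i > 0)
    {
        size_t p = (i - 1) / 2;
        int    tmp;

        if (h[p] <= h[i]) break;
        tmp = h[i]; h[i] = h[p]; h[p] = tmp;
        i = p;
    }
}

/* Start run 0 with a full array; a sorted array is a valid min-heap. */
static void
run_state_init(RunState *st, int *slots, size_t cap)
{
    qsort(slots, cap, sizeof(int), cmp_int);
    st->slots = slots;
    st->cap = cap;
    st->heapCount = cap;
    st->nextCount = 0;
    st->currentRun = 0;
}

/*
 * Pop the smallest key of the current run (its run number goes to *run)
 * and take in one new key. Requires a full array with a non-empty heap.
 */
static int
replace_top(RunState *st, int incoming, int *run)
{
    int out;

    assert(st->heapCount > 0 && st->heapCount + st->nextCount == st->cap);

    out = st->slots[0];
    *run = st->currentRun;
    st->heapCount--;
    st->slots[0] = st->slots[st->heapCount];
    sift_down(st->slots, st->heapCount, 0);

    /* slots[heapCount] is now free and borders both regions */
    st->slots[st->heapCount] = incoming;
    if (incoming >= out)
    {
        /* still fits in the current run: push onto the heap */
        sift_up(st->slots, st->heapCount);
        st->heapCount++;
    }
    else
        st->nextCount++;    /* belongs to currentRun + 1, left unsorted */

    if (st->heapCount == 0)
    {
        /* heap drained: quicksort the next run's tuples into the new heap */
        qsort(st->slots, st->nextCount, sizeof(int), cmp_int);
        st->heapCount = st->nextCount;
        st->nextCount = 0;
        st->currentRun++;
    }
    return out;
}
```

In this sketch the heap comparisons never look at a run number, which is
where the cache-efficiency gain would be expected to come from, and the
division into two runs is maintained at all times as the boundary at
heapCount.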


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
