Re: [HACKERS] Memory usage during sorting

Tom Lane Sun, 26 Feb 2012 19:38:16 -0800

Robert Haas <[email protected]> writes:
> A quick Google search for external sorting algorithms suggest that the
> typical way of doing an external sort is to read data until you fill
> your in-memory buffer, quicksort it, and dump it out as a run.  Repeat
> until end-of-data; then, merge the runs (either in a single pass, or
> if there are too many, in multiple passes).  I'm not sure whether that
> would be better than what we're doing now, but there seem to be enough
> other people doing it that we might want to try it out.  Our current
> algorithm is to build a heap, bounded in size by work_mem, and dribble
> tuples in and out, but maintaining that heap is pretty expensive;
> there's a reason people use quicksort rather than heapsort for
> in-memory sorting.


Well, the reason the heapsort approach is desirable here is that you end
up with about half as many runs to merge, because the typical run length
is significantly more than what will fit in work_mem.  Don't get too
excited about micro-optimization if you haven't understood the larger
context.

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Memory usage during sorting

Reply via email to