On 06/02/14 18:21, Jeff Janes wrote:
How big of sets were you sorting in each case?
Big enough to go external. The timings and compare-counts given are purely for the heapify stage not the total for the sort, so are constrained by the work_mem not by the sort size per se. I'm limited to working on a small machine, so the work_mem value of 1e+6 is approaching my top limit of sort-time pain unless I rip the heapify out into a test harness. What range of work_mem is it useful to test over?
Was it always just slightly bigger than would fit in work_mem?
I didn't check.
Did you try sorting already-sorted, reverse sorted, or pipe-organ shaped data sets?
Not yet, but I can. What variety of pipe-organ shape is of interest (I can think of several that might be called that)?
We will also need to test it on strings. I usually use md5(random()::text) to generate strings for such purposes, at least for a first pass.
OK.
The name of the attached patch is a bit unfortunate.
Is there a project standard for this?
And I'm not sure what you are doing with tupindex, the change there doesn't seem to be necessary to the rest of your work, so some explanation on that would be helpful.
I'll add commentary.
The project coding style usually has comments in blocks before loops on which they comment, rather than interspersed throughout the clauses of the "for" statement.
I'll adjust that.
Another optimization that is possible is to do only one comparison at each level (between the two children) all the way down the heap, and then compare the parent to the chain of smallest children in reverse order. This can substantially reduce the number of comparisons on average, because most tuples sift most of the way down the heap (because most of the tuples are at the bottom of the heap).
Sounds interesting; I'll see if I can get that going.
(Also, I think you should make your siftdown code look more like the existing siftdown code.)
A function called by the outer loop? Can do; the existing does that because the function is called from multiple places but this will only be used the once. Thanks for the review. -- Jeremy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers