Re: [PATCHES] LIMIT/SORT optimization

Heikki Linnakangas Wed, 28 Mar 2007 01:02:15 -0800

Some comments on the patch below.

Gregory Stark wrote:

+ /* tuplesort_set_bound - External API to set a bound on a tuplesort
+  *
+  * Must be called before inserting any tuples.

++ * Sets a maximum number of tuples the caller is interested in. The first

+  * <bound> tuples are maintained using a simple insertion sort and returned
+  * normally. Any tuples that lie after those in the sorted result set are
+  * simply thrown out
+  */

The "Must be called before inserting any tuples" is in contradictionwith the comment in the header file:

+ /* This can be called at any time before performsort to advise tuplesort that
+  * only this many tuples are interesting. If that many tuples fit in memory 
and
+  * we haven't already overflowed to disk then tuplesort will switch to a 
simple
+  * insertion sort or heap sort and throw away the uninteresting tuples.
+  */


The latter seems to be correct.

! /*
!  * Convert the existing unordered list of sorttuples to a heap in either 
order.
!  * This used to be inline but now there are three separate places we heap sort
!  * (initializing the tapes, if we have a bounded output, and any time the user
!  * says he doesn't want to use glibc's qsort).
!  *
!  * NOTE heapify passes false for checkIndex (and stores a constant tupindex
!  * passed as a parameter) even though we use heaps for multi-run sources
!  * because we only heapify when we're doing in-memory sorts or in inittapes
!  * before there's any point in comparing tupindexes.
!  */

!! static void

! tuplesort_heapify(Tuplesortstate *state, int tupindex, HeapOrder heaporder)
! {

The comment claims that we use heap sort when the user says he doesn'twant to use glibc's qsort. I recall that we always use our own qsortimplementation nowadays. And we never used the heap sort as a qsortreplacement, did we?

In performsort, you convert the in-memory heap to a sorted list in onego. I wonder if it would be better to switch to a new TSS_ALLINHEAPstate that means "all tuples are now in the in-memory heap", and calltuplesort_heap_siftup in gettuple. It probably doesn't make muchdifference in most cases, but if there's another limit node in the planwith a smaller limit or the client only fetches a few top rows with acursor you'd avoid unheapifying tuples that are just thrown away later.

There's a few blocks of code surrounded with "#if 0 - #endif". Are thosejust leftovers that should be removed, or are things that still need tofinished and enabled?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Re: [PATCHES] LIMIT/SORT optimization

Reply via email to