Re: [PATCHES] WIP: further sorting speedup

Michael Glaesemann Mon, 20 Feb 2006 19:05:32 -0800


On Feb 21, 2006, at 3:45, Simon Riggs wrote:

On Sun, 2006-02-19 at 21:40 -0500, Tom Lane wrote:
After applying Simon's recent sort patch, I was doing some profiling and
noticed that sorting spends an unreasonably large fraction of its time
extracting datums from tuples (heap_getattr or index_getattr).  The
attached patch does something about this by pulling out the leading sort
column of a tuple when it is received by the sort code or re-read from a
"tape".


<snip />

The choice to pull out just the leading column, rather than all columns,
is driven by concerns of (a) code complexity and (b) memory space.
Having the extra columns pre-extracted wouldn't buy anything anyway
in the common case where the leading key determines the result of
a comparison.
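
A minimal sketch of the idea described above, not the patch itself: each
sort entry carries a cached copy of the leading sort column next to a
pointer to the full row, and the comparator only looks inside the row
when the leading keys tie. The names Row, SortEntry, compare_entries,
sort_rows and row_fetches are hypothetical, and plain int columns stand
in for datums extracted with heap_getattr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row with two sort columns; stands in for a heap tuple. */
typedef struct Row {
    int a;   /* leading sort column */
    int b;   /* second sort column */
} Row;

/* Sort entry: pointer to the row plus a cached copy of its leading key. */
typedef struct SortEntry {
    int lead_key;
    const Row *row;
} SortEntry;

/* Counts how often the comparator has to look inside the row itself. */
long row_fetches = 0;

/* Compare on the cached leading key; fall back to the row only on a tie. */
int compare_entries(const void *x, const void *y)
{
    const SortEntry *ex = x;
    const SortEntry *ey = y;

    if (ex->lead_key != ey->lead_key)
        return (ex->lead_key < ey->lead_key) ? -1 : 1;

    /* Leading keys are equal: only now pay for access to the row. */
    row_fetches++;
    if (ex->row->b != ey->row->b)
        return (ex->row->b < ey->row->b) ? -1 : 1;
    return 0;
}

/* Extract the leading key once per row, then sort the entries. */
void sort_rows(const Row *rows, SortEntry *entries, size_t n)
{
    assert(n == 0 || (rows != NULL && entries != NULL));
    for (size_t i = 0; i < n; i++) {
        entries[i].lead_key = rows[i].a;
        entries[i].row = &rows[i];
    }
    qsort(entries, n, sizeof(SortEntry), compare_entries);
}
```

In this sketch the extra memory is one int per entry, and row_fetches
stays small whenever the leading column has few duplicates, which is the
common case the paragraph above refers to.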


<snip />

I agree that as long as we are swamped by the cost of heap_getattr, then
it does seem likely that first-key extraction (and keeping it with the
tuple itself) will be a win in most cases over full-key extraction.

Most of this is way above my head, but I'm trying to follow along: when
you say first key and full key, are these related to relation keys
(e.g., primary key) or attributes that are used in sorting (regardless
of whether they're a key or not)? I notice Tom used the term "leading
[sort] column", which I read to mean the first attribute used to sort
the relation (for whichever purpose, e.g., merge joins, order-by
clauses). I'll see if I can't find the Nyberg paper as well to learn a
bit more. (I haven't been sleeping well recently.)


Michael Glaesemann
grzm myrealbox com



