On Feb 21, 2006, at 3:45, Simon Riggs wrote:

On Sun, 2006-02-19 at 21:40 -0500, Tom Lane wrote:
After applying Simon's recent sort patch, I was doing some profiling and noticed that sorting spends an unreasonably large fraction of its time
extracting datums from tuples (heap_getattr or index_getattr).  The
attached patch does something about this by pulling out the leading sort column of a tuple when it is received by the sort code or re-read from a
"tape".

<snip />

The choice to pull out just the leading column, rather than all columns,
is driven by concerns of (a) code complexity and (b) memory space.
Having the extra columns pre-extracted wouldn't buy anything anyway
in the common case where the leading key determines the result of
a comparison.

<snip />

I agree that as long as we are swamped by the cost of heap_getattr, then
it does seem likely that first-key extraction (and keeping it with the
tuple itself) will be a win in most cases over full-key extraction.
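
To make the trade-off above concrete, here is a minimal Python sketch. It is not PostgreSQL's tuplesort code. The names fetch, sort_uncached and sort_leading_cached are hypothetical, and fetch is only a stand-in for heap_getattr that counts how often a column is pulled out of a row. "Key" here means a column listed in the sort order, not a relation key.

```python
from functools import cmp_to_key


def fetch(row, col, stats):
    """Hypothetical stand-in for heap_getattr: count every extraction."""
    stats["extractions"] += 1
    return row[col]


def three_way(a, b):
    """Return -1, 0 or 1, like a C comparison function."""
    return (a > b) - (a < b)


def sort_uncached(rows, sort_cols, stats):
    """Extract every needed sort column again on each comparison."""
    def compare(x, y):
        for col in sort_cols:
            c = three_way(fetch(x, col, stats), fetch(y, col, stats))
            if c:
                return c
        return 0
    return sorted(rows, key=cmp_to_key(compare))


def sort_leading_cached(rows, sort_cols, stats):
    """Extract the leading sort column once per row and keep it with the row."""
    lead, rest = sort_cols[0], sort_cols[1:]
    entries = [(fetch(row, lead, stats), row) for row in rows]

    def compare(x, y):
        # Common case: the cached leading values decide the comparison.
        c = three_way(x[0], y[0])
        if c:
            return c
        # Tie on the leading column: fall back to extracting the others.
        for col in rest:
            c = three_way(fetch(x[1], col, stats), fetch(y[1], col, stats))
            if c:
                return c
        return 0

    entries.sort(key=cmp_to_key(compare))
    return [row for _, row in entries]


if __name__ == "__main__":
    rows = [(3, "c"), (1, "b"), (2, "a"), (1, "a")]
    for sorter in (sort_uncached, sort_leading_cached):
        stats = {"extractions": 0}
        print(sorter.__name__, sorter(rows, [0, 1], stats), stats)
```

When every leading value is distinct, the cached variant extracts exactly one column per row, while the uncached variant extracts two values per comparison. Extra extractions in the cached variant occur only for rows that tie on the leading column.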

Most of this is way above my head, but I'm trying to follow along: when you say first key and full key, are these related to relation keys (e.g., primary key) or attributes that are used in sorting (regardless of whether they're a key or not)? I notice Tom used the term "leading [sort] column", which I read to mean the first attribute used to sort the relation (for whichever purpose, e.g., mergejoins, order-by clauses). I'll see if I can't find the Nyberg paper as well to learn a bit more. (I haven't been sleeping well recently.)

Michael Glaesemann
grzm myrealbox com



