On Sat, Oct 01, 2005 at 11:26:07PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Anyway, to bring some real info I just profiled PostgreSQL 8.1beta
> > doing an index create on a 2960296 row table (3 columns, table size
> > 317MB).
> 
> 3 columns in the index you mean?  What were the column datatypes?
> Any null values?

Nope, three columns in the table, one column in the index, no nulls.
The indexed column was integer. I did it once with around 6500 values
repeated over and over, lots of duplicate kays. And once on a serial
column but it made no descernable difference either way. Although the
comparison function was called less (only 76 million times), presumably
because it was mostly sorted already.

> > The number 1 bottleneck with 41% of user time is comparetup_index.
> > ...
> > The thing is, I can't see anything in comparetup_index() that could
> > take much time.
> 
> The index_getattr and heap_getattr macros can be annoyingly expensive.

And yet they are optimised for the common case. nocache_index_getattr
was only called 7 times, which is about what you expect. I'm getting
annotated output now, to determine which line takes the time...
Actually, my previous profile overstated stuff a bit. Profiling turned
off optimisation so I put it back and you get better results but the
order doesn't change much. By line results are below.

The top two are the index_getattr calls in comparetup_index. Third and
fourth are the HEAPCOMPARES in tuplesort_heap_siftup. Then comes the
inlineApplySortFunction call (which isn't being inlined, despite
suggesting it should be, -Winline warns about this).

Looks to me that there are no real gains to be made in this function.
What is needed is an algorithmic change to call this function less
often...

Have a nice weekend,

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
  9.40     22.56    22.56                             comparetup_index 
(tuplesort.c:2042 @ 8251060)
  5.07     34.73    12.17                             comparetup_index 
(tuplesort.c:2043 @ 82510c0)
  4.73     46.09    11.36                             tuplesort_heap_siftup 
(tuplesort.c:1648 @ 825074d)
  3.48     54.45     8.36                             tuplesort_heap_siftup 
(tuplesort.c:1661 @ 82507a9)
  2.80     61.18     6.73                             comparetup_index 
(tuplesort.c:2102 @ 8251201)
  2.68     67.62     6.44                             comparetup_index 
(tuplesort.c:2048 @ 8251120)
  2.16     72.82     5.20                             tuplesort_heap_siftup 
(tuplesort.c:1652 @ 825076d)
  1.88     77.34     4.52 76025782     0.00     0.00  comparetup_index 
(tuplesort.c:2016 @ 8251010)
  1.82     81.70     4.36 76025782     0.00     0.00  inlineApplySortFunction 
(tuplesort.c:1833 @ 8251800)
  1.73     85.85     4.15                             readtup_heap 
(tuplesort.c:2000 @ 8250fd8)
  1.67     89.86     4.01                             AllocSetAlloc (aset.c:568 
@ 824bec0)
  1.61     93.72     3.86                             comparetup_index 
(tuplesort.c:2025 @ 825102f)
  1.47     97.25     3.53 76025785     0.00     0.00  btint4cmp 
(nbtcompare.c:74 @ 80924a0)
  1.11     99.92     2.67                             readtup_datum 
(tuplesort.c:2224 @ 82517c4)
  1.10    102.55     2.64                             comparetup_index 
(tuplesort.c:2103 @ 82511e7)

  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 28.34     68.01    68.01 76025782     0.00     0.00  comparetup_index
 13.56    100.54    32.53  7148934     0.00     0.00  tuplesort_heap_siftup
  8.66    121.33    20.79 76025782     0.00     0.00  inlineApplySortFunction
  4.43    131.96    10.63 13084567     0.00     0.00  AllocSetAlloc
  3.73    140.90     8.94 76025785     0.00     0.00  btint4cmp
  2.15    146.07     5.17  6095625     0.00     0.00  LWLockAcquire
  2.02    150.92     4.85  2960305     0.00     0.00  heapgettup
  1.98    155.66     4.74  7148934     0.00     0.00  tuplesort_heap_insert
  1.78    159.94     4.28  2960312     0.00     0.00  slot_deform_tuple
  1.73    164.09     4.15                             readtup_heap
  1.67    168.09     4.00  6095642     0.00     0.00  LWLockRelease
  1.53    171.76     3.68  2960308     0.00     0.00  index_form_tuple
  1.44    175.21     3.45 13083442     0.00     0.00  AllocSetFree
  1.28    178.28     3.07  8377285     0.00     0.00  LogicalTapeWrite
  1.25    181.29     3.01  8377285     0.00     0.00  LogicalTapeRead
  1.11    183.96     2.67                             readtup_datum
  1.06    186.51     2.55        1     2.55   123.54  IndexBuildHeapScan

-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment: pgphyErRrkGN7.pgp
Description: PGP signature

Reply via email to