subject:"\\\[HACKERS\\\] Minor performance improvement in transition to external sort"

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-04-17 Thread Robert Haas

On Wed, Apr 16, 2014 at 7:38 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Apr 10, 2014 at 06:03:15PM +0100, Simon Riggs wrote:
 On 6 February 2014 18:21, Jeff Janes jeff.ja...@gmail.com wrote:
  On Tue, Feb 4, 2014 at 2:22 PM, Jeremy Harris j...@wizmail.org wrote:
 
  The attached patch replaces the existing siftup method for heapify with
  a siftdown method. Tested with random integers it does 18% fewer
  compares and takes 10% less time for the heapify, over the work_mem
  range 1024 to 1048576.
 
 
  Thanks for working on this.

 +1

 Your patch isn't linked properly from the CF manager though.

 If you like patches like this then there's a long(er) list of
 optimizations already proposed previously around sorting. It would be
 good to have someone work through them for external sorts. I believe
 Noah is working on parallel internal sort (as an aside).

 There's also an optimization possible for merge joins where we use the
 output of the first sort as an additional filter on the second sort.
 That can help when we're going to join two disjoint tables.

 Where should this be recorded?  TODO?  Commitfest manager?

IIUC, the original patch was withdrawn; any remaining action items
should probably go to TODO.  I'm not sure which specific idea you're
referring to, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-04-16 Thread Bruce Momjian

On Thu, Apr 10, 2014 at 06:03:15PM +0100, Simon Riggs wrote:
 On 6 February 2014 18:21, Jeff Janes jeff.ja...@gmail.com wrote:
  On Tue, Feb 4, 2014 at 2:22 PM, Jeremy Harris j...@wizmail.org wrote:
 
  The attached patch replaces the existing siftup method for heapify with
  a siftdown method. Tested with random integers it does 18% fewer
  compares and takes 10% less time for the heapify, over the work_mem
  range 1024 to 1048576.
 
 
  Thanks for working on this.
 
 +1
 
 Your patch isn't linked properly from the CF manager though.
 
 If you like patches like this then there's a long(er) list of
 optimizations already proposed previously around sorting. It would be
 good to have someone work through them for external sorts. I believe
 Noah is working on parallel internal sort (as an aside).
 
 There's also an optimization possible for merge joins where we use the
 output of the first sort as an additional filter on the second sort.
 That can help when we're going to join two disjoint tables.

Where should this be recorded?  TODO?  Commitfest manager?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-04-10 Thread Simon Riggs

On 6 February 2014 18:21, Jeff Janes jeff.ja...@gmail.com wrote:
 On Tue, Feb 4, 2014 at 2:22 PM, Jeremy Harris j...@wizmail.org wrote:

 The attached patch replaces the existing siftup method for heapify with
 a siftdown method. Tested with random integers it does 18% fewer
 compares and takes 10% less time for the heapify, over the work_mem
 range 1024 to 1048576.


 Thanks for working on this.

+1

Your patch isn't linked properly from the CF manager though.

If you like patches like this then there's a long(er) list of
optimizations already proposed previously around sorting. It would be
good to have someone work through them for external sorts. I believe
Noah is working on parallel internal sort (as an aside).

There's also an optimization possible for merge joins where we use the
output of the first sort as an additional filter on the second sort.
That can help when we're going to join two disjoint tables.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-26 Thread Jeremy Harris


On 26/02/14 03:08, Robert Haas wrote:

On Tue, Feb 25, 2014 at 2:55 PM, Jeremy Harris j...@wizmail.org wrote:

On 24/02/14 17:38, Robert Haas wrote:


On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:


Run under cachegrind, it takes about N/10 last-level cache misses,
all for the new item being introduced to the heap.  The existing
code takes none at all.



Can you explain this further?  This seems like an important clue that
could be useful when trying to optimize this code, but I'm a little
unclear which part of the operation has more cache misses with your
changes and why.



In the patched version, for the heapify operation the outer loop
starts at the last heap-parent tuple and works backward to the
start of the tuples array.  A copy is taken of the SortTuple being
operated on for the inner loop to use.  This copy suffers cache misses.

(The inner loop operates on elements between the copy source and the
end of the array).


In the original, the outer loop runs the array in increasing index
order.  Again a copy is taken of the SortTuple for the inner loop
to use.  This copy does not appear to take significant cache misses.

(The inner loop operates on elements between the copy source and
the start of the array).


Do you have any theory as to why this happens in one case but not the other?


Nope.  Only really wild stuff requiring the cpu memory channel
recognising the forward scan (despite the inner loop) and not
the backward one - and ignoring prefetch instructions, which I
experimented with.
--
Cheers,
   Jeremy




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-25 Thread Jeremy Harris


On 24/02/14 17:38, Robert Haas wrote:

On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:

Run under cachegrind, it takes about N/10 last-level cache misses,
all for the new item being introduced to the heap.  The existing
code takes none at all.


Can you explain this further?  This seems like an important clue that
could be useful when trying to optimize this code, but I'm a little
unclear which part of the operation has more cache misses with your
changes and why.


In the patched version, for the heapify operation the outer loop
starts at the last heap-parent tuple and works backward to the
start of the tuples array.  A copy is taken of the SortTuple being
operated on for the inner loop to use.  This copy suffers cache misses.

(The inner loop operates on elements between the copy source and the
end of the array).


In the original, the outer loop runs the array in increasing index
order.  Again a copy is taken of the SortTuple for the inner loop
to use.  This copy does not appear to take significant cache misses.

(The inner loop operates on elements between the copy source and
the start of the array).

--
Cheers,
   Jeremy



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-25 Thread Robert Haas

On Tue, Feb 25, 2014 at 2:55 PM, Jeremy Harris j...@wizmail.org wrote:
 On 24/02/14 17:38, Robert Haas wrote:

 On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:

 Run under cachegrind, it takes about N/10 last-level cache misses,
 all for the new item being introduced to the heap.  The existing
 code takes none at all.


 Can you explain this further?  This seems like an important clue that
 could be useful when trying to optimize this code, but I'm a little
 unclear which part of the operation has more cache misses with your
 changes and why.


 In the patched version, for the heapify operation the outer loop
 starts at the last heap-parent tuple and works backward to the
 start of the tuples array.  A copy is taken of the SortTuple being
 operated on for the inner loop to use.  This copy suffers cache misses.

 (The inner loop operates on elements between the copy source and the
 end of the array).


 In the original, the outer loop runs the array in increasing index
 order.  Again a copy is taken of the SortTuple for the inner loop
 to use.  This copy does not appear to take significant cache misses.

 (The inner loop operates on elements between the copy source and
 the start of the array).

Do you have any theory as to why this happens in one case but not the other?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-24 Thread Robert Haas

On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:
 On 09/02/14 17:11, Jeremy Harris wrote:
 On 06/02/14 18:21, Jeff Janes wrote:
   Did you try sorting already-sorted, reverse
 sorted, or pipe-organ shaped data sets?  We will also need to test it on
 strings.  I usually use md5(random()::text) to generate strings for such
 purposes, at least for a first pass.


 Attached is version 2 of the patch, which fixes the performance on
 constant-input.

 Having beaten on this some more I'm prepared to abandon it.

 The wallclock time, for random input, drifts up at larger N
 (compared to the existing code) despite the number of comparisons
 being consistently less.

 Run under cachegrind, it takes about N/10 last-level cache misses,
 all for the new item being introduced to the heap.  The existing
 code takes none at all.

Can you explain this further?  This seems like an important clue that
could be useful when trying to optimize this code, but I'm a little
unclear which part of the operation has more cache misses with your
changes and why.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-20 Thread Jeremy Harris


On 09/02/14 17:11, Jeremy Harris wrote:

On 06/02/14 18:21, Jeff Janes wrote:

  Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets?  We will also need to test it on
strings.  I usually use md5(random()::text) to generate strings for such
purposes, at least for a first pass.


Attached is version 2 of the patch, which fixes the performance on
constant-input.


Having beaten on this some more I'm prepared to abandon it.

The wallclock time, for random input, drifts up at larger N
(compared to the existing code) despite the number of comparisons
being consistently less.

Run under cachegrind, it takes about N/10 last-level cache misses,
all for the new item being introduced to the heap.  The existing
code takes none at all.


It might be worthwhile for a seriously expensive comparison function;
say, more than 50 clocks.  For integers and md5-strings it isn't.
--
Cheers,
  Jeremy




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-09 Thread Robert Haas

On Fri, Feb 7, 2014 at 4:28 PM, Jeremy Harris j...@wizmail.org wrote:
 On 06/02/14 22:12, Jeremy Harris wrote:
  Did you try sorting already-sorted, reverse
 sorted, or pipe-organ shaped data sets?

 Summary (low numbers better):

 Random ints: 83% compares, level on time.
 Sorted ints: level compares, 70% time.
 Reverse-sorted ints: 10% compares, 15% time  (!)
 Constant ints:   200% compares, 360% time(ouch, and not O(n))
 Pipe-organ ints: 80% compares, 107% time
 Random text: 83% compares, 106% time

This is kind of what I expected to happen.  The problem is that it's
hard to make some cases better without making other cases worse.  I
suspect (hope?) there's some simple fix for the constant-int case.
But the last two cases are trickier.  It seems intuitively that
reducing comparisons ought to reduce runtime, but if I'm reading the
results, the runtime actually went up even though the number of
comparisons went down.  This is hard to account for, but we probably
need to at least understand why that's happening.  I feel like this
algorithm ought to be a win, but these data don't provide a compelling
case for change.

I wonder if it would be worth trying this test with text data as well.
 Text comparisons are much more expensive than integer comparisons, so
the effect of saving comparisons ought to be more visible there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-07 Thread Jeremy Harris


On 06/02/14 22:12, Jeremy Harris wrote:

 Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets?


Summary (low numbers better):

Random ints: 83% compares, level on time.
Sorted ints: level compares, 70% time.
Reverse-sorted ints: 10% compares, 15% time  (!)
Constant ints:   200% compares, 360% time(ouch, and not O(n))
Pipe-organ ints: 80% compares, 107% time
Random text: 83% compares, 106% time

--
Cheers,
  Jeremy


siftdown_performance.ods
Description: application/vnd.oasis.opendocument.spreadsheet

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-06 Thread Jeremy Harris


On 05/02/14 21:56, Robert Haas wrote:

On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris j...@wizmail.org wrote:

The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.

Both algorithms appear to be O(n) (contradicting Wikipedia's claim
that a siftup heapify is O(n log n)).


I think Wikipedia is right.  Inserting an a tuple into a heap is O(lg
n); doing that n times is therefore O(n lg n).  Your patch likewise
implements an outer loop which runs O(n) times and an inner loop which
runs at most O(lg n) times, so a naive analysis of that algorithm also
comes out to O(n lg n).


The patch implements a siftdown.  Measurements attached.
--
Cheers,
  Jeremy


work_mem	baseline		Sift-down			baseline K	siftdown K		K ratio		baseline K	siftdown K		K ratio
	cmps	time	cmps	time		cmps	cmps		cmps		time	time		time
1024	27271	0.001	22426	0.001		26.6	21.9		82%		6.76E-07	4.93E-07		73%
2048	52153	0.001	44557	0.001		25.5	21.8		85%		6.28E-07	4.47E-07		71%
4096	108660	0.003	89591	0.002		26.5	21.9		82%		6.47E-07	4.63E-07		72%
8192	217341	0.005	179191	0.004		26.5	21.9		82%		6.57E-07	4.88E-07		74%
16384	434101	0.011	358680	0.009		26.5	21.9		83%		6.55E-07	5.42E-07		83%
32768	871110	0.022	717542	0.019		26.6	21.9		82%		6.60E-07	5.74E-07		87%
65536	1740355	0.043	1434967	0.039		26.6	21.9		82%		6.59E-07	5.96E-07		90%
131072	3478497	0.087	2869190	0.078		26.5	21.9		82%		6.62E-07	5.98E-07		90%
262144	6962693	0.173	5740518	0.154		26.6	21.9		82%		6.58E-07	5.89E-07		89%
524288	13912236	0.345	11476660	0.311		26.5	21.9		82%		6.59E-07	5.93E-07		90%
1048576	27839260	0.690	22957368	0.622		26.5	21.9		82%		6.58E-07	5.93E-07		90%

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-06 Thread Robert Haas

On Thu, Feb 6, 2014 at 4:22 AM, Jeremy Harris j...@wizmail.org wrote:
 On 05/02/14 21:56, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris j...@wizmail.org wrote:
 The attached patch replaces the existing siftup method for heapify with
 a siftdown method. Tested with random integers it does 18% fewer
 compares and takes 10% less time for the heapify, over the work_mem
 range 1024 to 1048576.

 Both algorithms appear to be O(n) (contradicting Wikipedia's claim
 that a siftup heapify is O(n log n)).


 I think Wikipedia is right.  Inserting an a tuple into a heap is O(lg
 n); doing that n times is therefore O(n lg n).  Your patch likewise
 implements an outer loop which runs O(n) times and an inner loop which
 runs at most O(lg n) times, so a naive analysis of that algorithm also
 comes out to O(n lg n).

 The patch implements a siftdown.  Measurements attached.

Uh, this spreadsheet is useless.  You haven't explained how these
numbers were generated, or what the columns in the spreadsheet
actually mean.  Ideally, you should include enough information that
someone else can try to reproduce your results.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-06 Thread Jeff Janes

On Tue, Feb 4, 2014 at 2:22 PM, Jeremy Harris j...@wizmail.org wrote:

 The attached patch replaces the existing siftup method for heapify with
 a siftdown method. Tested with random integers it does 18% fewer
 compares and takes 10% less time for the heapify, over the work_mem
 range 1024 to 1048576.


Thanks for working on this.  Several times I've wondered why we didn't use
siftdown for heapifying, as it is a well known optimization.  How big of
sets were you sorting in each case?  Was it always just slightly bigger
than would fit in work_mem?  Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets?  We will also need to test it on
strings.  I usually use md5(random()::text) to generate strings for such
purposes, at least for a first pass.

The name of the attached patch is a bit unfortunate.  And I'm not sure what
you are doing with tupindex, the change there doesn't seem to be necessary
to the rest of your work, so some explanation on that would be helpful.

The project coding style usually has comments in blocks before loops on
which they comment, rather than interspersed throughout the clauses of the
for statement.



 Both algorithms appear to be O(n) (contradicting Wikipedia's claim
 that a siftup heapify is O(n log n)).


Siftdown is linear in all cases.  Siftup may be linear in the typical case,
but is n log n in the worst case.

Another optimization that is possible is to do only one comparison at each
level (between the two children) all the way down the heap, and then
compare the parent to the chain of smallest children in reverse order.
 This can substantially reduce the number of comparisons on average,
because most tuples sift most of the way down the heap (because most of the
tuples are at the bottom of the heap).  Robert submitted a patch to do this
in the main siftdown routine (which for some reason is
named tuplesort_heap_siftup, rather than siftdown), and I found it a
substantial improvement but it never got pushed forward.  I think Robert
was concerned it might make some obscure cases worse rather than better,
and no one put it through enough serious testing to overcome that concern.

(Also, I think you should make your siftdown code look more like the
existing siftdown code.)

Cheers,

Jeff

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-06 Thread Jeremy Harris


On 06/02/14 18:21, Jeff Janes wrote:

  How big of
sets were you sorting in each case?


Big enough to go external.  The timings and compare-counts given are
purely for the heapify stage not the total for the sort, so are
constrained by the work_mem not by the sort size per se.

I'm limited to working on a small machine, so the work_mem value
of 1e+6 is approaching my top limit of sort-time pain unless I rip the
heapify out into a test harness.  What range of work_mem is it useful
to test over?


 Was it always just slightly bigger
than would fit in work_mem?


I didn't check.


 Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets?


Not yet, but I can.  What variety of pipe-organ shape is of
interest (I can think of several that might be called that)?


 We will also need to test it on
strings.  I usually use md5(random()::text) to generate strings for such
purposes, at least for a first pass.


OK.



The name of the attached patch is a bit unfortunate.


Is there a project standard for this?


 And I'm not sure what
you are doing with tupindex, the change there doesn't seem to be necessary
to the rest of your work, so some explanation on that would be helpful.


I'll add commentary.


The project coding style usually has comments in blocks before loops on
which they comment, rather than interspersed throughout the clauses of the
for statement.


I'll adjust that.


Another optimization that is possible is to do only one comparison at each
level (between the two children) all the way down the heap, and then
compare the parent to the chain of smallest children in reverse order.
  This can substantially reduce the number of comparisons on average,
because most tuples sift most of the way down the heap (because most of the
tuples are at the bottom of the heap).


Sounds interesting; I'll see if I can get that going.


(Also, I think you should make your siftdown code look more like the
existing siftdown code.)


A function called by the outer loop?  Can do; the existing does that
because the function is called from multiple places but this will only
be used the once.

Thanks for the review.
--
Jeremy



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-06 Thread Jeremy Harris


On 06/02/14 14:22, Robert Haas wrote:

On Thu, Feb 6, 2014 at 4:22 AM, Jeremy Harris j...@wizmail.org wrote:

On 05/02/14 21:56, Robert Haas wrote:

On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris j...@wizmail.org wrote:

The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.

Both algorithms appear to be O(n) (contradicting Wikipedia's claim
that a siftup heapify is O(n log n)).



I think Wikipedia is right.  Inserting an a tuple into a heap is O(lg
n); doing that n times is therefore O(n lg n).  Your patch likewise
implements an outer loop which runs O(n) times and an inner loop which
runs at most O(lg n) times, so a naive analysis of that algorithm also
comes out to O(n lg n).


The patch implements a siftdown.  Measurements attached.


Uh, this spreadsheet is useless.  You haven't explained how these
numbers were generated, or what the columns in the spreadsheet
actually mean.  Ideally, you should include enough information that
someone else can try to reproduce your results.



I'm sorry I wasn't clear.

Test code was groups of the form:

set work_mem to 1024;
CREATE TABLE jgh (i integer);
INSERT INTO jgh (i) SELECT 2 * random() FROM generate_series(1, 2);
VACUUM ANALYZE jgh;
explain analyze SELECT * FROM jgh ORDER BY i;
set enable_siftdown_heapify to off;
explain analyze SELECT * FROM jgh ORDER BY i;
set enable_siftdown_heapify to on;
DROP TABLE jgh;

.. for the tabulated work_mem, and scaled values for the INSERT.

Compares were counted for the heapify stage (only), and timings
taken using pg_rusage_init() before and pg_rusage_show() after
it.  Times are in seconds.

Baseline value for compares and time used 858ec11858.
Sift-down compares and time are for the patch.

Baseline K cmps is compares divided by work_mem (which should
be a scaled version of compares-per-tuple being heapified) pre-patch.
Siftdown K cmps is likewise with the patch.

K ratio cmps is the ratio  (patched compares)/(unpatched compares).

Repeat for time measurements.

--
Cheers,
   Jeremy


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-05 Thread Robert Haas

On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris j...@wizmail.org wrote:
 The attached patch replaces the existing siftup method for heapify with
 a siftdown method. Tested with random integers it does 18% fewer
 compares and takes 10% less time for the heapify, over the work_mem
 range 1024 to 1048576.

 Both algorithms appear to be O(n) (contradicting Wikipedia's claim
 that a siftup heapify is O(n log n)).

I think Wikipedia is right.  Inserting an a tuple into a heap is O(lg
n); doing that n times is therefore O(n lg n).  Your patch likewise
implements an outer loop which runs O(n) times and an inner loop which
runs at most O(lg n) times, so a naive analysis of that algorithm also
comes out to O(n lg n).  Wikipedia's article on
http://en.wikipedia.org/wiki/Heapsort explains why a tighter bound is
possible for the siftdown case.

I think I tested something like this at some point and concluded that
it didn't really help much, because building the initial heap was a
pretty small part of the runtime of a large sort.  It may still be
worth doing, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Minor performance improvement in transition to external sort

2014-02-04 Thread Jeremy Harris


The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.

Both algorithms appear to be O(n) (contradicting Wikipedia's claim
that a siftup heapify is O(n log n)).

--
Cheers,
   Jeremy
diff --git a/src/backend/utils/sort/tuplesort.c 
b/src/backend/utils/sort/tuplesort.c
index 8b520c1..2ea7b2d 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1211,7 +1211,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
(void) grow_memtuples(state);
Assert(state-memtupcount  state-memtupsize);
}
-   state-memtuples[state-memtupcount++] = *tuple;
+   state-memtuples[state-memtupcount] = *tuple;
+   state-memtuples[state-memtupcount++].tupindex = 0;
 
/*
 * Check if it's time to switch over to a bounded 
heapsort. We do
@@ -1884,23 +1885,47 @@ inittapes(Tuplesortstate *state)
state-tp_tapenum = (int *) palloc0(maxTapes * sizeof(int));
 
/*
-* Convert the unsorted contents of memtuples[] into a heap. Each tuple 
is
-* marked as belonging to run number zero.
-*
-* NOTE: we pass false for checkIndex since there's no point in 
comparing
-* indexes in this step, even though we do intend the indexes to be part
-* of the sort key...
+* Convert the unsorted contents of memtuples[] into a heap. Each tuple 
was
+* marked as belonging to run number zero on input; we don't compare 
that.
 */
ntuples = state-memtupcount;
-   state-memtupcount = 0; /* make the heap empty */
-   for (j = 0; j  ntuples; j++)
{
-   /* Must copy source tuple to avoid possible overwrite */
-   SortTuple   stup = state-memtuples[j];
+   SortTuple * m = state-memtuples;
 
-   tuplesort_heap_insert(state, stup, 0, false);
+   for (j = (ntuples-2)/2; /* comb the array from the last heap 
parent */
+j = 0;/* to the start */
+j--)
+   {
+   int root = j;
+   int child, swap = 0;
+   SortTuple ptup = m[root];
+
+   while ((child = root*2 + 1) = ntuples-1)
+   {   /* root has at least one child. Check 
left-child */
+   if (COMPARETUP(state, ptup, m[child])  0)
+   {   
/* [root]  [lchild] */
+   if (  ++child = ntuples-1  
/* check right-child */
+   COMPARETUP(state, ptup, 
m[child])  0)
+   swap = child;
+   else
+   break;  
/* [root] smallest */
+   }
+   else
+   {
+   swap = child;
+   if (  ++child = ntuples-1  
/* check right-child */
+   COMPARETUP(state, m[swap], 
m[child])  0)
+   swap = child;
+   }
+   /* [swap] is smallest of three; move into the 
parent slot */
+   m[root] = m[swap];
+   root = swap;/* and repeat 
with the child subtree */
+   }
+   if (swap)
+   m[swap] = ptup;
+   /* This and all heap nodes after are now 
well-positioned */
+   }
}
-   Assert(state-memtupcount == ntuples);
 
state-currentRun = 0;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-04 Thread Michael Paquier

On Wed, Feb 5, 2014 at 7:22 AM, Jeremy Harris j...@wizmail.org wrote:
 The attached patch replaces the existing siftup method for heapify with
 a siftdown method. Tested with random integers it does 18% fewer
 compares and takes 10% less time for the heapify, over the work_mem
 range 1024 to 1048576.

 Both algorithms appear to be O(n) (contradicting Wikipedia's claim
 that a siftup heapify is O(n log n)).
It looks interesting but it is too late to have that in 9.4... You
should definitely add this patch to the next commit fest so as it is
not lost in the void:
https://commitfest.postgresql.org/action/commitfest_view?id=22
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

[HACKERS] Minor performance improvement in transition to external sort

Re: [HACKERS] Minor performance improvement in transition to external sort

18 matches

Site Navigation

Mail list logo

Footer information