Re: [HACKERS] [PATCH] Incremental sort

Heikki Linnakangas Mon, 20 Mar 2017 07:20:35 -0700

On 03/20/2017 11:33 AM, Alexander Korotkov wrote:

Please, find rebased patch in the attachment.


I had a quick look at this.

* I'd love to have an explanation of what an Incremental Sort is, in thefile header comment for nodeIncrementalSort.c.

* I didn't understand the maxMem stuff in tuplesort.c. The commentsthere use the phrase "on-disk memory", which seems like an oxymoron.Also, "maximum status" seems weird, as it assumes that there's a naturalorder to the states.

* In the below example, the incremental sort is significantly slowerthan the Seq Scan + Sort you get otherwise:


create table foo (a int4, b int4, c int4);
insert into sorttest select g, g, g from generate_series(1, 1000000) g;
vacuum foo;
create index i_sorttest on sorttest (a, b, c);
set work_mem='100MB';

postgres=# explain select count(*) from (select * from sorttest order bya, c) as t;QUERY PLAN

-------------------------------------------------------------------------------------------------------
 Aggregate  (cost=138655.68..138655.69 rows=1 width=8)
   ->  Incremental Sort  (cost=610.99..124870.38 rows=1102824 width=12)
         Sort Key: sorttest.a, sorttest.c
         Presorted Key: sorttest.a

-> Index Only Scan using i_sorttest on sorttest(cost=0.43..53578.79 rows=1102824 width=12)

(5 rows)

Time: 0.409 ms
postgres=# select count(*) from (select * from sorttest order by a, c) as t;
  count
---------
 1000000
(1 row)

Time: 387.091 ms

postgres=# explain select count(*) from (select * from sorttest order bya, c) as t;QUERY PLAN

-------------------------------------------------------------------------------
 Aggregate  (cost=130063.84..130063.85 rows=1 width=8)
   ->  Sort  (cost=115063.84..117563.84 rows=1000000 width=12)
         Sort Key: sorttest.a, sorttest.c

-> Seq Scan on sorttest (cost=0.00..15406.00 rows=1000000width=12)

(4 rows)

Time: 0.345 ms
postgres=# select count(*) from (select * from sorttest order by a, c) as t;
  count
---------
 1000000
(1 row)

Time: 231.668 ms

According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). Toalleviate that, it might be worthwhile to add a special case for whenthe group contains exactly one group, and not put the tuple to thetuplesort in that case. Or if we cannot ensure that the Incremental Sortis actually faster, the cost model should probably be smarter, to avoidpicking an incremental sort when it's not a win.


- Heikki



--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Incremental sort

Reply via email to