Hi,

There's another small regression with this patch when used with expensive
comparison functions, such as long text fields.

If we go through all this trouble in cmpSortSkipCols to prove that the
first N sortkeys are equal, it would be nice if Tuplesort could skip their
comparisons entirely; that's another nice optimization this patch can
provide.

I've implemented that in the attached patch, which applies on top of your
partial-sort-5.patch

Should the "Sort Key" field in EXPLAIN output be changed as well? I'd say
no, I think that makes the partial sort steps harder to read.

Generate test data:
create table longtext as select (select repeat('a', 1000*100)) a,
generate_series(1,1000) i;
create index on longtext(a);

Unpatched (using your original partial-sort-5.patch):
=# explain analyze select * from longtext order by a, i limit 10;
 Limit  (cost=2.34..19.26 rows=10 width=1160) (actual time=13477.739..13477.756
rows=10 loops=1)
   ->  Partial sort  (cost=2.34..1694.15 rows=1000 width=1160) (actual time=
13477.737..13477.742 rows=10 loops=1)
         Sort Key: a, i
         Presorted Key: a
         Sort Method: top-N heapsort  Memory: 45kB
         ->  Index Scan using longtext_a_idx on longtext  (cost=0.65..1691.65
rows=1000 width=1160) (actual time=0.015..2.364 rows=1000 loops=1)
 Total runtime: 13478.158 ms
(7 rows)

=# set enable_indexscan=off;
=# explain analyze select * from longtext order by a, i limit 10;
 Limit  (cost=198.61..198.63 rows=10 width=1160) (actual
time=6970.439..6970.458 rows=10 loops=1)
   ->  Sort  (cost=198.61..201.11 rows=1000 width=1160) (actual
time=6970.438..6970.444 rows=10 loops=1)
         Sort Key: a, i
         Sort Method: top-N heapsort  Memory: 45kB
         ->  Seq Scan on longtext  (cost=0.00..177.00 rows=1000 width=1160)
(actual time=0.007..1.763 rows=1000 loops=1)
 Total runtime: 6970.491 ms

Patched:
=# explain analyze select * from longtext order by a, i ;
 Partial sort  (cost=2.34..1694.15 rows=1000 width=1160) (actual
time=0.024..4.603 rows=1000 loops=1)
   Sort Key: a, i
   Presorted Key: a
   Sort Method: quicksort  Memory: 27kB
   ->  Index Scan using longtext_a_idx on longtext  (cost=0.65..1691.65
rows=1000 width=1160) (actual time=0.013..2.094 rows=1000 loops=1)
 Total runtime: 5.418 ms

Regards,
Marti
From fbc6c31528018bca64dc54f65e1cd292f8de482a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <ma...@juffo.org>
Date: Sat, 18 Jan 2014 19:16:15 +0200
Subject: [PATCH] Batched sort: skip comparisons for known-equal columns

---
 src/backend/executor/nodeSort.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index cf1f79e..5abda1d 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -125,10 +125,10 @@ ExecSort(SortState *node)
 	{
 		tuplesortstate = tuplesort_begin_heap(tupDesc,
 											  plannode->numCols - skipCols,
-											  &(plannode->sortColIdx)[skipCols],
-											  plannode->sortOperators,
-											  plannode->collations,
-											  plannode->nullsFirst,
+											  &(plannode->sortColIdx[skipCols]),
+											  &(plannode->sortOperators[skipCols]),
+											  &(plannode->collations[skipCols]),
+											  &(plannode->nullsFirst[skipCols]),
 											  work_mem,
 											  node->randomAccess);
 		if (node->bounded)
-- 
1.8.5.3

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to