On Wed, Apr 08, 2020 at 04:08:39PM +0200, Tomas Vondra wrote:
On Wed, Apr 08, 2020 at 09:54:42AM -0400, James Coleman wrote:
On Wed, Apr 8, 2020 at 9:43 AM Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
On Wed, Apr 08, 2020 at 12:51:05PM +0200, Tomas Vondra wrote:
On Tue, Apr 07, 2020 at 11:54:23PM -0400, Tom Lane wrote:
hyrax is not too happy with this test:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-04-07%2004%3A55%3A15
It's not too clear to me why CLOBBER_CACHE_ALWAYS would be breaking
EXPLAIN output, but it evidently is.
Thanks, I'll investigate. It's not clear to me either what might be
causing this, but I guess something must have gone wrong in
estimation/planning.
OK, I know what's going on - it's a rather embarrassing issue in the
regression test. There's no analyze on the test tables, so the planner
uses default estimates for the number of groups etc. But with
CLOBBER_CACHE_ALWAYS the test runs long enough for autoanalyze to kick
in and collect stats, so we get better estimates, which changes the plan.
I'll get this fixed - explicit analyze and tweaking the data a bit
should do the trick.
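
FWIW the instability should be reproducible by hand - this is just a
sketch against the existing t(a, b) test table, with an explicit
analyze standing in for the autoanalyze run that happens mid-test
under CLOBBER_CACHE_ALWAYS:

    -- original data set: a single group of 100 rows, no stats collected yet
    insert into t(a, b) select 1, i from generate_series(1, 100) n(i);
    explain (costs off)
      select * from (select * from t order by a) s order by a, b limit 31;
    -- collect stats, as autoanalyze eventually does, and plan again
    analyze t;
    explain (costs off)
      select * from (select * from t order by a) s order by a, b limit 31;

Without stats the planner works from the default estimates the expected
output assumes; once the stats are in, the costing changes and the plan
can come out different.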
Looking at the tests that failed, I think we should consider just adding:
set enable_sort = off;
because several of those tests have very specific amounts of data to
ensure we test the transition points around the different modes in the
incremental sort node.
Maybe, but I'd much rather tweak the data so that we test both the
costing and the execution parts.
I think the attached patch does the trick: increasing the number of
rows a bit (from 100 to 1000) makes the plain Sort more expensive than
the Incremental Sort, while still testing the transition points.
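
One easy way to double-check the costing side (a rough sketch, assuming
one of the new 1000-row data sets from the attached patch is loaded) is
to compare the chosen plan against the full sort it has to beat, e.g. by
temporarily disabling incremental sort:

    analyze t;
    explain select * from (select * from t order by a) s order by a, b limit 31;
    set enable_incremental_sort = off;  -- force the plain Sort alternative
    explain select * from (select * from t order by a) s order by a, b limit 31;
    reset enable_incremental_sort;

With the larger data set the plain Sort plan should come out with a
higher total cost, so the incremental sort keeps winning even once
accurate stats are available.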
James, can you verify that's still true?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 3072d95643..fb4ab95922 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -141,7 +141,8 @@ begin
end;
$$;
-- A single large group tested around each mode transition point.
-insert into t(a, b) select 1, i from generate_series(1, 100) n(i);
+insert into t(a, b) select i/100 + 1, i + 1 from generate_series(0, 999) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 31;
QUERY PLAN
---------------------------------
@@ -456,7 +457,8 @@ select * from (select * from t order by a) s order by a, b limit 66;
delete from t;
-- An initial large group followed by a small group.
-insert into t(a, b) select (case when i < 50 then 1 else 2 end), i from generate_series(1, 100) n(i);
+insert into t(a, b) select i/50 + 1, i + 1 from generate_series(0, 999) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 55;
QUERY PLAN
---------------------------------
@@ -521,7 +523,7 @@ select * from (select * from t order by a) s order by a, b limit 55;
1 | 47
1 | 48
1 | 49
- 2 | 50
+ 1 | 50
2 | 51
2 | 52
2 | 53
@@ -538,10 +540,10 @@ select explain_analyze_without_memory('select * from (select * from t order by a
Sort Key: t.a, t.b
Presorted Key: t.a
Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Memory: avg=NNkB peak=NNkB
- -> Sort (actual rows=100 loops=1)
+ -> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
- -> Seq Scan on t (actual rows=100 loops=1)
+ -> Seq Scan on t (actual rows=1000 loops=1)
(9 rows)
select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from (select * from t order by a) s order by a, b limit 55'));
@@ -584,7 +586,8 @@ select explain_analyze_inc_sort_nodes_verify_invariants('select * from (select *
delete from t;
-- An initial small group followed by a large group.
-insert into t(a, b) select (case when i < 5 then i else 9 end), i from generate_series(1, 100) n(i);
+insert into t(a, b) select (case when i < 5 then i else 9 end), i from generate_series(1, 1000) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 70;
QUERY PLAN
---------------------------------
@@ -705,17 +708,17 @@ select * from t left join (select * from (select * from t order by a) v order by
rollback;
-- Test EXPLAIN ANALYZE with both fullsort and presorted groups.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 70');
-                            explain_analyze_without_memory
------------------------------------------------------------------------------------------------------------------------------------------------------
+                                explain_analyze_without_memory
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (actual rows=70 loops=1)
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Memory: avg=NNkB peak=NNkB Presorted Groups: 5 Sort Method: quicksort Memory: avg=NNkB peak=NNkB
- -> Sort (actual rows=100 loops=1)
+ Full-sort Groups: 1 Sort Method: quicksort Memory: avg=NNkB peak=NNkB Presorted Groups: 5 Sort Methods: top-N heapsort, quicksort Memory: avg=NNkB peak=NNkB
+ -> Sort (actual rows=1000 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
- -> Seq Scan on t (actual rows=100 loops=1)
+ -> Seq Scan on t (actual rows=1000 loops=1)
(9 rows)
select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from (select * from t order by a) s order by a, b limit 70'));
@@ -747,6 +750,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Presorted Groups": { +
"Group Count": 5, +
"Sort Methods Used": [ +
+ "top-N heapsort", +
"quicksort" +
], +
"Sort Space Memory": { +
@@ -767,7 +771,8 @@ select explain_analyze_inc_sort_nodes_verify_invariants('select * from (select *
delete from t;
-- Small groups of 10 tuples each tested around each mode transition point.
-insert into t(a, b) select i / 10, i from generate_series(1, 70) n(i);
+insert into t(a, b) select i / 10, i from generate_series(1, 1000) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 31;
QUERY PLAN
---------------------------------
@@ -1082,7 +1087,8 @@ select * from (select * from t order by a) s order by a, b limit 66;
delete from t;
-- Small groups of only 1 tuple each tested around each mode transition point.
-insert into t(a, b) select i, i from generate_series(1, 70) n(i);
+insert into t(a, b) select i, i from generate_series(1, 1000) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 31;
QUERY PLAN
---------------------------------
diff --git a/src/test/regress/sql/incremental_sort.sql b/src/test/regress/sql/incremental_sort.sql
index e78a96d5bf..cf304a3441 100644
--- a/src/test/regress/sql/incremental_sort.sql
+++ b/src/test/regress/sql/incremental_sort.sql
@@ -119,7 +119,8 @@ end;
$$;
-- A single large group tested around each mode transition point.
-insert into t(a, b) select 1, i from generate_series(1, 100) n(i);
+insert into t(a, b) select i/100 + 1, i + 1 from generate_series(0, 999) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 31;
select * from (select * from t order by a) s order by a, b limit 31;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 32;
@@ -133,7 +134,8 @@ select * from (select * from t order by a) s order by a, b limit 66;
delete from t;
-- An initial large group followed by a small group.
-insert into t(a, b) select (case when i < 50 then 1 else 2 end), i from generate_series(1, 100) n(i);
+insert into t(a, b) select i/50 + 1, i + 1 from generate_series(0, 999) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 55;
select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
@@ -143,7 +145,8 @@ select explain_analyze_inc_sort_nodes_verify_invariants('select * from (select *
delete from t;
-- An initial small group followed by a large group.
-insert into t(a, b) select (case when i < 5 then i else 9 end), i from generate_series(1, 100) n(i);
+insert into t(a, b) select (case when i < 5 then i else 9 end), i from generate_series(1, 1000) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 70;
select * from (select * from t order by a) s order by a, b limit 70;
-- Test rescan.
@@ -164,7 +167,8 @@ select explain_analyze_inc_sort_nodes_verify_invariants('select * from (select *
delete from t;
-- Small groups of 10 tuples each tested around each mode transition point.
-insert into t(a, b) select i / 10, i from generate_series(1, 70) n(i);
+insert into t(a, b) select i / 10, i from generate_series(1, 1000) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 31;
select * from (select * from t order by a) s order by a, b limit 31;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 32;
@@ -178,7 +182,8 @@ select * from (select * from t order by a) s order by a, b limit 66;
delete from t;
-- Small groups of only 1 tuple each tested around each mode transition point.
-insert into t(a, b) select i, i from generate_series(1, 70) n(i);
+insert into t(a, b) select i, i from generate_series(1, 1000) n(i);
+analyze;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 31;
select * from (select * from t order by a) s order by a, b limit 31;
explain (costs off) select * from (select * from t order by a) s order by a, b limit 32;