On 9/23/24 05:03, Richard Guo wrote:
> On Sun, Sep 22, 2024 at 1:38 PM David Rowley <dgrowle...@gmail.com> wrote:
>> Just looking at the commit message:
>>
>>> The rationale is based on the assumption that incremental sort is
>>> always faster than full sort when there are presorted keys, a premise
>>> that has been applied in various parts of the code. This assumption
>>> does not always hold, particularly in cases with a large skew in the
>>> number of rows within the presorted groups.
>>
>> My understanding is that the worst case for incremental sort is the
>> same as sort, i.e. only 1 presorted group, which is the same effort to
>> sort. Is there something I'm missing?
>
> I was thinking that when there’s a large skew in the number of rows
> per pre-sorted group, incremental sort might underperform full sort.
> As mentioned in the comments for cost_incremental_sort, it has to
> detect the sort groups, plus it needs to perform tuplesort_reset after
> each group. However, I doubt these factors would have a substantial
> impact on the performance of incremental sort. So maybe in the worst
> skewed groups case, incremental sort may still perform similarly to
> full sort.
>
> Another less obvious reason is that in cases of skewed groups, we may
> significantly underestimate the cost of incremental sort. This could
> result in choosing a more expensive subpath under the sort. Such as
> the example in [1], we end up with IndexScan->IncrementalSort rather
> than SeqScan->FullSort, while the former plan is more expensive to
> execute. However, this point does not affect this patch, because for
> a mergejoin, we only consider outerrel's cheapest-total-cost when we
> assume that an explicit sort atop is needed, i.e., we do not have a
> chance to select which subpath to use in this case.
>
> [1] https://postgr.es/m/CAMbWs49+CXsrgbq0LD1at-5jR=ahhn0ytdy9yvgxasmfndz...@mail.gmail.com
You don't need any skew. Consider this perfectly uniform dataset:

create table t (a int, b int);
insert into t select 100000 * random(), 100 * random()
  from generate_series(1,1000000) s(i);
create index on t (a);
vacuum analyze;
checkpoint;

explain analyze select * from t order by a, b;

                           QUERY PLAN
-----------------------------------------------------------------
 Sort  (cost=127757.34..130257.34 rows=1000000 width=8) (actual time=186.288..275.813 rows=1000000 loops=1)
   Sort Key: a, b
   Sort Method: external merge  Disk: 17704kB
   ->  Seq Scan on t  (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.005..35.989 rows=1000000 loops=1)
 Planning Time: 0.075 ms
 Execution Time: 301.143 ms
(6 rows)

set enable_incremental_sort = on;
explain analyze select * from t order by a, b;

                           QUERY PLAN
-----------------------------------------------------------------
 Incremental Sort  (cost=1.03..68908.13 rows=1000000 width=8) (actual time=0.102..497.143 rows=1000000 loops=1)
   Sort Key: a, b
   Presorted Key: a
   Full-sort Groups: 27039  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
   ->  Index Scan using t_a_idx on t  (cost=0.42..37244.25 rows=1000000 width=8) (actual time=0.050..376.403 rows=1000000 loops=1)
 Planning Time: 0.057 ms
 Execution Time: 519.301 ms
(7 rows)

Sure, the table is very small, but the example is not crazy. In fact,
this is the "nicest" example for estimation - it's perfectly random, no
correlations, no skew.


regards

-- 
Tomas Vondra
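
PS: For reference, the skewed-groups case Richard describes above could
be reproduced with something along these lines (a minimal sketch; the
table name t_skew and the 90/10 split are illustrative, not taken from
the thread):

create table t_skew (a int, b int);
-- put ~90% of the rows into a single pre-sorted group (a = 0),
-- spread the rest over many tiny groups
insert into t_skew
  select case when i <= 900000 then 0 else i end, 100 * random()
  from generate_series(1,1000000) s(i);
create index on t_skew (a);
vacuum analyze t_skew;

-- compare the plans for
--   select * from t_skew order by a, b;
-- with enable_incremental_sort off and on

With this distribution most rows fall into one sort group, which is
essentially the single-group worst case David mentions, so incremental
sort ends up doing roughly the same amount of sorting work as a full
sort, plus the per-group detection and tuplesort_reset overhead.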