Re: Even when the data is already ordered, MergeAppend still adds a Sort node

feichanghong Sun, 20 Jul 2025 18:59:02 -0700


> On Jul 21, 2025, at 00:03, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 
> feichanghong <feichangh...@qq.com> writes:
>> Currently, I have not found a better way to rewrite this, except by 
>> optimizing
>> this scenario from the pg kernel side.
> 
> If you're willing to modify your query, you could fake it out by
> spelling the subquery's "a = 1" condition in a way that won't produce an
> EquivalenceClass.  For example,
> 
> regression=# explain analyze select a, b from (
>    (select a, b from t t1 where a > 19000 order by a, b)
>    union all
>    (select a, b from t t2 where a >= 1 and a <= 1 and b > 1 order by a, b)
> ) t order by a, b limit 1;
>                                                                QUERY PLAN     
>                                                            
> ------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=0.58..0.63 rows=1 width=8) (actual time=0.070..0.070 rows=1.00 
> loops=1)
>   Buffers: shared hit=8 read=1
>   ->  Merge Append  (cost=0.58..481.10 rows=11000 width=8) (actual 
> time=0.069..0.069 rows=1.00 loops=1)
>         Sort Key: t1.a, t1.b
>         Buffers: shared hit=8 read=1
>         ->  Index Only Scan using t_a_b_idx on t t1  (cost=0.29..29.79 
> rows=1000 width=8) (actual time=0.027..0.027 rows=1.00 loops=1)
>               Index Cond: (a > 19000)
>               Heap Fetches: 0
>               Index Searches: 1
>               Buffers: shared hit=2 read=1
>         ->  Index Only Scan using t_a_b_idx on t t2  (cost=0.29..341.30 
> rows=10000 width=8) (actual time=0.041..0.041 rows=1.00 loops=1)
>               Index Cond: ((a >= 1) AND (a <= 1) AND (b > 1))
>               Heap Fetches: 0
>               Index Searches: 1
>               Buffers: shared hit=6
> Planning:
>   Buffers: shared hit=6
> Planning Time: 0.174 ms
> Execution Time: 0.089 ms
> (19 rows)


Thank you very much for your suggestion, this method is effective for my
scenario. I will modify my SQL accordingly.

> I'd be the first to agree that that's a hack not a nice solution.
> But I think getting to something that's not a hack is going to
> involve a lot more work than this edge case seems worth.  We're
> not likely to accept a patch that pessimizes planning within
> subqueries on the small chance that that will result in a path
> whose apparent sort order matches the needs of the outer query
> better.  Maybe something could be done inside
> convert_subquery_pathkeys, but I suspect we don't really have
> enough information at that point to decide what to do.

Yes, if we retain the EC_MUST_BE_REDUNDANT pathkey in the subquery, it may lead
to a less optimal plan. For example, consider the following case: the subquery
is sorted by "a,b", and the top-level query is sorted by "b". In this
situation, the subquery's pathkey "b" matches the top-level query. However, if
we retain the subquery's pathkey as "a,b", it does not match the top-level
query, resulting in the need for an additional sort operation.
```sql
-- subquery pathkey is "b"
postgres=# explain select * from (select * from t where a = 1 and b > 1 order 
by a, b) t order by b limit 1;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Limit  (cost=0.29..0.33 rows=1 width=8)
   ->  Index Only Scan using t_a_b_idx on t  (cost=0.29..316.27 rows=9999 
width=8)
         Index Cond: ((a = 1) AND (b > 1))
(3 rows)

-- subquery pathkey is "a,b"
postgres=# explain select * from (select * from t where a = 1 and b > 1 order 
by a, b) t order by b limit 1;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Limit  (cost=1155.56..1155.56 rows=1 width=8)
   ->  Sort  (cost=1155.56..1180.56 rows=9999 width=8)
         Sort Key: t.b
         ->  Sort  (cost=980.58..1005.58 rows=9999 width=8)
               Sort Key: t.a, t.b
               ->  Index Only Scan using t_a_b_idx on t  (cost=0.29..316.27 
rows=9999 width=8)
                     Index Cond: ((a = 1) AND (b > 1))
(7 rows)
```

A possible solution might be: we retain the EC_MUST_BE_REDUNDANT pathkey within
the subquery, but when executing create_sort_plan, pathkeys_contained_in, and
convert_subquery_pathkeys, we need to take into account pathkeys with
ec_has_const. Naturally, the actual situation could be much more complex.

Best Regards,
Fei Changhong

Re: Even when the data is already ordered, MergeAppend still adds a Sort node

Reply via email to