> On Jul 21, 2025, at 00:03, Tom Lane <t...@sss.pgh.pa.us> wrote: > > feichanghong <feichangh...@qq.com> writes: >> Currently, I have not found a better way to rewrite this, except by >> optimizing >> this scenario from the pg kernel side. > > If you're willing to modify your query, you could fake it out by > spelling the subquery's "a = 1" condition in a way that won't produce an > EquivalenceClass. For example, > > regression=# explain analyze select a, b from ( > (select a, b from t t1 where a > 19000 order by a, b) > union all > (select a, b from t t2 where a >= 1 and a <= 1 and b > 1 order by a, b) > ) t order by a, b limit 1; > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------ > Limit (cost=0.58..0.63 rows=1 width=8) (actual time=0.070..0.070 rows=1.00 > loops=1) > Buffers: shared hit=8 read=1 > -> Merge Append (cost=0.58..481.10 rows=11000 width=8) (actual > time=0.069..0.069 rows=1.00 loops=1) > Sort Key: t1.a, t1.b > Buffers: shared hit=8 read=1 > -> Index Only Scan using t_a_b_idx on t t1 (cost=0.29..29.79 > rows=1000 width=8) (actual time=0.027..0.027 rows=1.00 loops=1) > Index Cond: (a > 19000) > Heap Fetches: 0 > Index Searches: 1 > Buffers: shared hit=2 read=1 > -> Index Only Scan using t_a_b_idx on t t2 (cost=0.29..341.30 > rows=10000 width=8) (actual time=0.041..0.041 rows=1.00 loops=1) > Index Cond: ((a >= 1) AND (a <= 1) AND (b > 1)) > Heap Fetches: 0 > Index Searches: 1 > Buffers: shared hit=6 > Planning: > Buffers: shared hit=6 > Planning Time: 0.174 ms > Execution Time: 0.089 ms > (19 rows)
Thank you very much for your suggestion, this method is effective for my scenario. I will modify my SQL accordingly. > I'd be the first to agree that that's a hack not a nice solution. > But I think getting to something that's not a hack is going to > involve a lot more work than this edge case seems worth. We're > not likely to accept a patch that pessimizes planning within > subqueries on the small chance that that will result in a path > whose apparent sort order matches the needs of the outer query > better. Maybe something could be done inside > convert_subquery_pathkeys, but I suspect we don't really have > enough information at that point to decide what to do. Yes, if we retain the EC_MUST_BE_REDUNDANT pathkey in the subquery, it may lead to a less optimal plan. For example, consider the following case: the subquery is sorted by "a,b", and the top-level query is sorted by "b". In this situation, the subquery's pathkey "b" matches the top-level query. However, if we retain the subquery's pathkey as "a,b", it does not match the top-level query, resulting in the need for an additional sort operation. ```sql -- subquery pathkey is "b" postgres=# explain select * from (select * from t where a = 1 and b > 1 order by a, b) t order by b limit 1; QUERY PLAN ----------------------------------------------------------------------------------- Limit (cost=0.29..0.33 rows=1 width=8) -> Index Only Scan using t_a_b_idx on t (cost=0.29..316.27 rows=9999 width=8) Index Cond: ((a = 1) AND (b > 1)) (3 rows) -- subquery pathkey is "a,b" postgres=# explain select * from (select * from t where a = 1 and b > 1 order by a, b) t order by b limit 1; QUERY PLAN ----------------------------------------------------------------------------------------------- Limit (cost=1155.56..1155.56 rows=1 width=8) -> Sort (cost=1155.56..1180.56 rows=9999 width=8) Sort Key: t.b -> Sort (cost=980.58..1005.58 rows=9999 width=8) Sort Key: t.a, t.b -> Index Only Scan using t_a_b_idx on t (cost=0.29..316.27 rows=9999 width=8) Index Cond: ((a = 1) AND (b > 1)) (7 rows) ``` A possible solution might be: we retain the EC_MUST_BE_REDUNDANT pathkey within the subquery, but when executing create_sort_plan, pathkeys_contained_in, and convert_subquery_pathkeys, we need to take into account pathkeys with ec_has_const. Naturally, the actual situation could be much more complex. Best Regards, Fei Changhong