Richard Guo <guofengli...@gmail.com> writes: > On Wed, Dec 28, 2022 at 6:18 AM Tom Lane <t...@sss.pgh.pa.us> wrote: >> This patch is aimed at being smarter about cases where we have >> redundant GROUP BY entries, for example >> SELECT ... WHERE a.x = b.y GROUP BY a.x, b.y;
> While we are here, I wonder if we can do the same trick for > distinctClause, to cope with cases like > select distinct a.x, b.y from a, b where a.x = b.y; We do that already, no? regression=# create table foo (x int, y int); CREATE TABLE regression=# explain select distinct * from foo where x = 1; QUERY PLAN ----------------------------------------------------------------- Unique (cost=38.44..38.50 rows=11 width=8) -> Sort (cost=38.44..38.47 rows=11 width=8) Sort Key: y -> Seq Scan on foo (cost=0.00..38.25 rows=11 width=8) Filter: (x = 1) (5 rows) regression=# explain select distinct * from foo where x = y; QUERY PLAN ----------------------------------------------------------------- Unique (cost=38.44..38.50 rows=11 width=8) -> Sort (cost=38.44..38.47 rows=11 width=8) Sort Key: x -> Seq Scan on foo (cost=0.00..38.25 rows=11 width=8) Filter: (x = y) (5 rows) But if you do regression=# explain select * from foo where x = y group by x, y; QUERY PLAN ----------------------------------------------------------------- Group (cost=38.44..38.52 rows=11 width=8) Group Key: x, y -> Sort (cost=38.44..38.47 rows=11 width=8) Sort Key: x -> Seq Scan on foo (cost=0.00..38.25 rows=11 width=8) Filter: (x = y) (6 rows) then you can see that the Sort step knows it need only consider one column even though the Group step considers both. regards, tom lane