Hi Ilia, Zsolt, +1 to Zsolt's hasTargetSRFs point -- I built v1 and reproduced the wrong results:
CREATE TABLE t (a int, b int, c int); INSERT INTO t VALUES (1,1,10),(1,2,20),(2,1,30),(2,2,40),(1,1,50); SELECT DISTINCT a, unnest(ARRAY[1,1]) AS u FROM t GROUP BY a ORDER BY a, u; on master: (1,1),(2,1) # DISTINCT collapses the unnest dups with v1: (1,1),(1,1),(2,1),(2,1) # DISTINCT removed, dups leak The SRF in the tlist is expanded after grouping but before DISTINCT, so the DISTINCT is not redundant. Window functions are safe (they don't change cardinality), so hasTargetSRFs is the right and sufficient guard. On Zsolt's suggetion to reuse query_is_distinct_for(): I think that's the right direction, and it fixes more than the SRF case. The GROUP BY branch there also checks equality_ops_are_compatible(opid, sgc->eqop) and collations_agree_on_equality(...), neither of which the new distinct_redundant_by_groupby() checks (it matches on tleSortGroupRef alone). So reusing it closes the SRF, opclass, and collation gaps at once. One wrinkle if you go that way: query_is_distinct_for() can't be called directly for this, because its first branch inspects query->distinctClause and would return true trivially (the query's own DISTINCT "proves" distinctness on its own columns). The reusable part is specifically the GROUP-BY branch (plus the hasTargetSRFs precheck). Extracting that into a helper that takes the distinct columns and is called from both sites seems like the cleanest shape. FWIW there's a complementary case this structure could host later: DISTINCT made redundant by a unique index/constraint (SELECT DISTINCT pk FROM t), proven via relation_has_unique_index_for(). That one needs a NULL gate that the GROUP BY case doesn't, so it's separate logic -- but a shared "is this DISTINCT redundant?" entry point with pluggable proofs would fit both. Happy to follow up with that as a separate patch once this lands. I tried signing up as a reviewer, but I need to wait for my one week cool-off period first. I sign up ASAP. Thanks for working on this! On Thu, Jun 11, 2026 at 6:01 PM Zsolt Parragi <[email protected]> wrote: > > Hello! > > +static bool > +distinct_redundant_by_groupby(Query *parse) > > I think this also should check hasTargetSRFs: > > CREATE TABLE t (a int, b int, c int); > INSERT INTO t VALUES (1,1,10),(1,2,20),(2,1,30),(2,2,40),(1,1,50); > SELECT DISTINCT a, unnest(ARRAY[1,1]) AS u FROM t GROUP BY a ORDER BY a, u; > > Also, query_is_distinct_for already does something similar (and > already handles hasTargetSRFs) - would it be possible to extract the > common part instead of duplicating the logic?
