PostgreSQL version: 16.1
Operating system: CentOS 7
Description:

Let me first show the EXPLAIN results for the same query in PG 9.4 and PG 16.1.

### Behavior in PG 9.4

``` SQL
gpadmin=# create table t1 (c1 int, c2 text);
CREATE TABLE
gpadmin=# explain (costs off, verbose) select distinct c1 from t1;
         QUERY PLAN
-----------------------------
 HashAggregate
   Output: c1
   Group Key: t1.c1
   ->  Seq Scan on public.t1
         Output: c1         <---- pay attention <---- !!!
(5 rows)
```

### Behavior in PG 16.1

``` SQL
gpadmin=# create table t1 (c1 int, c2 text);
CREATE TABLE
gpadmin=# explain (costs off, verbose) select distinct c1 from t1;
         QUERY PLAN
-----------------------------
 HashAggregate
   Output: c1
   Group Key: t1.c1
   ->  Seq Scan on public.t1
         Output: c1, c2     <---- pay attention <---- !!!
(5 rows)
```

My question is: why does the Seq Scan output all columns in PG 16.1? For `select distinct c1`, scanning only column `c1` should be enough, as in PG 9.4.

Related GPDB issue link: https://github.com/greenplum-db/gpdb/issues/15266

Reporter: David Kimura and Yongtao Huang