avamingli commented on PR #685:
URL: https://github.com/apache/cloudberry/pull/685#issuecomment-2490325900

> I took a look at orca, it has already optimized `distinct` function.
>
> ```
> explain select distinct(count(a)) from foo;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=0.00..526.96 rows=1 width=8)
>    ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..526.96 rows=1 width=8)
>          ->  Partial Aggregate  (cost=0.00..526.96 rows=1 width=8)
>                ->  Seq Scan on foo  (cost=0.00..500.67 rows=3333334 width=4)
>  Optimizer: Pivotal Optimizer (GPORCA)
> (5 rows)
> ```
>
> Even if with `group by`, the `distinct` also can be removed
>
> ```
> explain select distinct(count(a)) from foo group by a ;
>                                                        QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..1395.69 rows=1000 width=8)
>    ->  HashAggregate  (cost=0.00..1395.66 rows=334 width=8)
>          Group Key: (count(a))
>          ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=0.00..1395.62 rows=334 width=8)
>                Hash Key: (count(a))
>                ->  Streaming HashAggregate  (cost=0.00..1395.61 rows=334 width=8)
>                      Group Key: count(a)
>                      ->  HashAggregate  (cost=0.00..985.15 rows=3333334 width=8)
>                            Group Key: a
>                            Planned Partitions: 16
>                            ->  Redistribute Motion 3:3  (slice3; segments: 3)  (cost=0.00..567.20 rows=3333334 width=4)
>                                  Hash Key: a
>                                  ->  Seq Scan on foo  (cost=0.00..500.67 rows=3333334 width=4)
>  Optimizer: Pivotal Optimizer (GPORCA)
> (14 rows)
> ```
>
> as distinct is a function which only works in a group.
>
> The function called `PexprRemoveSuperfluousDistinctInDQA` in orca.

Yeah, see https://github.com/apache/cloudberry/discussions/677#discussioncomment-10966471

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
