Re: [HACKERS] Status of DISTINCT-by-hashing work

Gregory Stark Tue, 05 Aug 2008 08:51:12 -0700

"Tom Lane" <[EMAIL PROTECTED]> writes:

> I've pretty much finished the project I got a bee in my bonnet about
> last week, which is to teach SELECT DISTINCT how to (optionally) use
> hashing for grouping in the same way that GROUP BY has been able to do
> for awhile.
>
> There are still two places in the system that hard-wire the use of
> sorting for duplicate elimination:
>
> * Set operations (UNION/INTERSECT/EXCEPT)


Egads. Are you thinking to reimplement them more in line with the way other
nodes work? Or just have them choose between hashing and sorting themselves?

> * Aggregate functions with DISTINCT
>
> I'm thinking of trying to fix set operations before I leave this topic,
> but I'm not sure it's worth the trouble to change DISTINCT aggregates.
> They'd be a lot more work (since there's no executor infrastructure
> in place that could be used) and the return on investment seems low.
>
> Comments?

I recall being quite mystified by how distinct aggregates work when the sort
didn't appear anywhere in EXPLAIN output. If we could manage to expose that
info in the plan somehow it would be a great improvement even if we didn't
actually improve the plans available.

Any idea what would the needed executor infrastructure look like? Would it
have anything in common with the OLAP window functions infrastructure?

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Status of DISTINCT-by-hashing work

Reply via email to