Hi, > -----Original Message----- > From: Robert Haas [mailto:robertmh...@gmail.com] > Sent: Tuesday, June 19, 2012 3:12 AM > To: Ants Aasma > Cc: Etsuro Fujita; Jay Levitt; Tom Lane; PostgreSQL-development; Francois > Deliege > Subject: Re: [HACKERS] [PATCH] Lazy hashaggregate when no aggregation is > needed > > On Fri, Jun 15, 2012 at 9:22 AM, Ants Aasma <a...@cybertec.at> wrote: > > Exactly. I think the first question for this patch should be whether > > this use-case is worth the complexity of the patch. I can't imagine > > any really compelling use cases that need an arbitrary distinct subset > > of results. > > Me neither. > > > The original complaint on -performance [1], didn't specify a real > > world use case, but it seemed to be a case of an ORM generating > > suboptimal queries. On the other hand, the patch itself is in my > > opinion rather simple, so it might be worth it. > > Yeah. > > > It has one outstanding issue, query_planner chooses the cheapest path > > based on total cost. This can be suboptimal when that path happens to > > have high startup cost. It seems to me that enabling the query_planner > > to find the cheapest unsorted path returning a limited amount of > > tuples would require some major surgery to the planner. To be clear, > > this is only a case of missed optimization, not a regression. > > I'm confused by this remark, because surely the query planner does it this > way only if there's no LIMIT. When there is a LIMIT, we choose based on > the startup cost plus the estimated fraction of the total cost we expect > to pay based on dividing the LIMIT by the overall row count estimate. Or > is this not what you're talking about?
I think that Ants is pointing the way of estimating costs in choose_hashed_grouping()/choose_hashed_distinct(), ie cost_agg() for cheapest_path + hashagg, where the costs are calculated based on the total cost only of cheapest_path. I think that it might be good to do cost_agg() for the discussed case with the AGG_SORTED strategy, not the AGG_HASHED strategy. > > It won't help set returning functions because the tuplestore for those > > is fully materialized when the first row is fetched. > > That's surely a problem for another day. OK Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers