Re: [PERFORM] queries with DISTINCT / GROUP BY giving different plans

Tom Lane Tue, 20 Aug 2013 16:33:41 -0700

Tomas Vondra <[email protected]> writes:
> Not quite sure how to parse this (not a native speaker here, sorry).
> Does that mean we want to keep it as it is now (because fixing it would
> cause even worse errors with low estimates)? Or do we want to fix
> hashed_distinct so that it behaves like hashed_grouping?


We need to fix hashed_distinct like this:

diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index bcc0d45..99284cb 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** choose_hashed_distinct(PlannerInfo *root
*** 2848,2854 ****
--- 2848,2858 ----
         * Don't do it if it doesn't look like the hashtable will fit into
         * work_mem.
         */
+ 
+       /* Estimate per-hash-entry space at tuple width... */
        hashentrysize = MAXALIGN(path_width) + 
MAXALIGN(sizeof(MinimalTupleData));
+       /* plus the per-hash-entry overhead */
+       hashentrysize += hash_agg_entry_size(0);
  
        if (hashentrysize * dNumDistinctRows > work_mem * 1024L)
                return false;

I've started a thread over in -hackers about whether it's prudent to
back-patch this change or not.

                        regards, tom lane


-- 
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] queries with DISTINCT / GROUP BY giving different plans

Reply via email to