Turns out we had not 9.6 but 9.5.

And query plan from 9.5 is:

 Sort  (cost=319008.18..319008.19 rows=1 width=556) (actual
time=0.028..0.028 rows=0 loops=1)
   Sort Key: (sum(st.shows)) DESC
   Sort Method: quicksort  Memory: 25kB
   CTE a
     ->  Index Scan using adroom_active_idx on adroom  (cost=0.13..5.21
rows=1 width=584) (actual time=0.004..0.004 rows=0 loops=1)
           Index Cond: ((now() >= start_ts) AND (now() <= stop_ts))
           Filter: (((groups -> 0) ->> 'provider'::text) ~
'^target_mail_ru'::text)
   CTE b
     ->  HashAggregate  (cost=1.27..1.77 rows=100 width=68) (actual
time=0.005..0.005 rows=0 loops=1)
           Group Key: a.provider, a.id, unnest(a.domain_ids)
           ->  CTE Scan on a  (cost=0.00..0.52 rows=100 width=68) (actual
time=0.004..0.004 rows=0 loops=1)
   ->  HashAggregate  (cost=319001.17..319001.18 rows=1 width=556) (actual
time=0.013..0.013 rows=0 loops=1)
         Group Key: b.provider, d.domain
         ->  Hash Join  (cost=16.55..319001.16 rows=1 width=556) (actual
time=0.013..0.013 rows=0 loops=1)
               Hash Cond: ((st.adroom_id = b.id) AND (st.domain_id =
b.domain_id))
               ->  Hash Join  (cost=13.05..318633.29 rows=48581 width=536)
(never executed)
                     Hash Cond: (st.domain_id = d.id)
                     ->  Index Scan using
adroom_stat_day_adroom_id_domain_id_url_id_is_wlabp_idx on adroom_stat st
(cost=0.58..313307.30 rows=1287388 width=16) (never executed)
                           Index Cond: ((day >= date_trunc('day'::text,
(now() - '7 days'::interval))) AND (day <= date_trunc('day'::text, now())))
                     ->  Hash  (cost=11.10..11.10 rows=110 width=520)
(never executed)
                           ->  Seq Scan on domains d  (cost=0.00..11.10
rows=110 width=520) (never executed)
               ->  Hash  (cost=2.00..2.00 rows=100 width=40) (actual
time=0.007..0.007 rows=0 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 8kB
                     ->  CTE Scan on b  (cost=0.00..2.00 rows=100 width=40)
(actual time=0.007..0.007 rows=0 loops=1)
 Planning time: 6.641 ms
 Execution time: 0.203 ms


Also I prepared test case for Tom and sent it to him.


Dmitry Shalashov, relap.io & surfingbird.ru

2017-11-22 18:19 GMT+03:00 Tom Lane <t...@sss.pgh.pa.us>:

> Dmitry Shalashov <skau...@gmail.com> writes:
> > BUT if I'll add to 3rd query one additional condition, which is basically
> > 2nd query, it will ran same 12 minutes:
> > SELECT adroom_id, domain_id, shows, clicks FROM adroom_stat WHERE day
> > between date_trunc('day', current_timestamp - interval '1 week') and
> > date_trunc('day', current_timestamp) AND domain_id IN (select distinct
> > unnest(domain_ids) FROM (select title, id, groups->0->>'provider'
> provider,
> > domain_ids from adroom where groups->0->>'provider' ~ '^target_mail_ru'
> and
> > not is_paused and current_timestamp between start_ts and stop_ts) t1)
>
> > Plan of last query:
> >  Nested Loop  (cost=88.63..25617.31 rows=491 width=16) (actual
> > time=3.512..733248.271 rows=1442797 loops=1)
> >    ->  HashAggregate  (cost=88.06..88.07 rows=1 width=4) (actual
> > time=3.380..13.561 rows=3043 loops=1)
> >          Group Key: (unnest(adroom.domain_ids))
> >          ->  HashAggregate  (cost=88.03..88.04 rows=1 width=4) (actual
> > time=2.199..2.607 rows=3043 loops=1)
> >                Group Key: unnest(adroom.domain_ids)
> >                ->  ProjectSet  (cost=0.28..87.78 rows=100 width=4)
> (actual
> > time=0.701..1.339 rows=3173 loops=1)
>
> Hm, seems like the problem is that that lower HashAggregate is estimated
> as having only one row out, which is way off and doesn't sound like a
> particularly bright default estimate anyway.  (And then we're doing an
> additional HashAggregate on top of that, which is useless --- implies
> that something isn't realizing that the output of the SELECT DISTINCT
> is already distinct.)
>
> I'm suspicious that this is breakage from the work that was done on
> targetlist SRFs in v10, but that's just a guess at this point.
>
> Trying simple test queries involving WHERE x IN (SELECT DISTINCT
> unnest(foo) FROM ...), I do not see a behavior like this, so there is some
> not-very-obvious contributing factor in your situation.  Can you put
> together a self-contained test case that produces a bogus one-row
> estimate?  Extra points if it produces duplicate HashAgg steps.
>
>                         regards, tom lane
>

Reply via email to