Hello Felix
Thanks indeed the new query is much faster…The query itself is complicated to 
explain basically you can view it as graph and want to make sure that there is 
no dependencies if I remove a set of points….

explain analyze with filtered_s as ( select s.attvalue  from 
functionalvarattributes s, tags t, variableattributetypes vat where 
t.id=s.tag_id and t.status!='Internal' and vat.fieldtype in 
('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')  and vat.id=s.atttype_id and 
split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name from 
functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id 
and ut.usertempl_id=15) ) select s.attvalue from filtered_s s except select 
s.attvalue from filtered_s s , usertemplvarattribute utva, usertemplatevariable 
utv where utv.id=utva.usertempvariable_fk and  utv.usertempl_id=15;
                                                                                
                QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------
HashSetOp Except  (cost=904251.31..2013436.93 rows=200 width=516) (actual 
time=40007.482..40007.482 rows=0 loops=1)
   CTE filtered_s
     ->  Hash Join  (cost=171506.51..904251.31 rows=310110 width=8) (actual 
time=13986.554..40005.687 rows=2 loops=1)
           Hash Cond: (split_part(split_part((s_2.attvalue)::text, ' '::text, 
1), '.'::text, 1) = (e.name)::text)
           ->  Hash Join  (cost=193.91..726311.49 rows=310110 width=8) (actual 
time=2.675..30633.916 rows=308287 loops=1)
                 Hash Cond: (s_2.tag_id = t.id)
                 ->  Hash Join  (cost=188.03..716937.71 rows=1671149 width=16) 
(actual time=2.518..30249.987 rows=651155 loops=1)
                       Hash Cond: (s_2.atttype_id = vat.id)
                       ->  Seq Scan on functionalvarattributes s_2  
(cost=0.00..604679.32 rows=25429032 width=24) (actual time=0.005..1
9229.473 rows=25429808 loops=1)
                       ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual 
time=2.433..2.433 rows=388 loops=1)
                             Buckets: 1024  Batches: 1  Memory Usage: 16kB
                             ->  Seq Scan on variableattributetypes vat  
(cost=0.00..183.18 rows=388 width=8) (actual time=0.010..2.171
rows=388 loops=1)
                                   Filter: ((fieldtype)::text = ANY 
('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                   Rows Removed by Filter: 5516
                 ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual 
time=0.147..0.147 rows=36 loops=1)
                       Buckets: 1024  Batches: 1  Memory Usage: 2kB
                       ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 
width=8) (actual time=0.015..0.119 rows=36 loops=1)
                             Filter: ((status)::text <> 'Internal'::text)
                             Rows Removed by Filter: 158
           ->  Hash  (cost=171251.03..171251.03 rows=4926 width=24) (actual 
time=8939.073..8939.073 rows=16 loops=1)
                 Buckets: 1024  Batches: 1  Memory Usage: 1kB
                 ->  HashAggregate  (cost=171201.77..171251.03 rows=4926 
width=24) (actual time=8939.039..8939.058 rows=16 loops=1)
                       ->  Hash Join  (cost=8.95..171189.45 rows=4926 width=24) 
(actual time=3188.453..8938.943 rows=48 loops=1)
                             Hash Cond: (e.usertemplatevar_id = ut.id)
                             ->  Seq Scan on functionalvariables e  
(cost=0.00..155513.72 rows=4164672 width=32) (actual time=0.004..65
54.351 rows=4164350 loops=1)
                             ->  Hash  (cost=8.75..8.75 rows=16 width=8) 
(actual time=0.042..0.042 rows=16 loops=1)
                                   Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                   ->  Index Scan using usertemp_utv_idx on 
usertemplatevariable ut  (cost=0.29..8.75 rows=16 width=8)
(actual time=0.015..0.029 rows=16 loops=1)
                                         Index Cond: (usertempl_id = 15)
   ->  Append  (cost=0.00..999159.97 rows=44010259 width=516) (actual 
time=13986.564..40007.199 rows=320 loops=1)
         ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..9303.30 rows=310110 
width=516) (actual time=13986.563..40005.703 rows=2 loops=1
)
               ->  CTE Scan on filtered_s s  (cost=0.00..6202.20 rows=310110 
width=516) (actual time=13986.561..40005.699 rows=2 loops=
1)
         ->  Subquery Scan on "*SELECT* 2"  (cost=0.70..989856.67 rows=43700149 
width=516) (actual time=0.071..1.242 rows=318 loops=1)
               ->  Nested Loop  (cost=0.70..552855.18 rows=43700149 width=516) 
(actual time=0.069..0.941 rows=318 loops=1)
                     ->  CTE Scan on filtered_s s_1  (cost=0.00..6202.20 
rows=310110 width=516) (actual time=0.003..0.005 rows=2 loops=
1)
                     ->  Materialize  (cost=0.70..84.46 rows=141 width=0) 
(actual time=0.032..0.331 rows=159 loops=2)
                           ->  Nested Loop  (cost=0.70..83.75 rows=141 width=0) 
(actual time=0.053..0.426 rows=159 loops=1)
                                 ->  Index Scan using usertemp_utv_idx on 
usertemplatevariable utv  (cost=0.29..8.75 rows=16 width=8) (
actual time=0.030..0.052 rows=16 loops=1)
                                       Index Cond: (usertempl_id = 15)
                                ->  Index Only Scan using 
usertemplvarattribute_atttypeid_key on usertemplvarattribute utva  (cost=0.4
2..4.60 rows=9 width=8) (actual time=0.005..0.011 rows=10 loops=16)
                                       Index Cond: (usertempvariable_fk = 
utv.id)
                                       Heap Fetches: 0
Total runtime: 40007.716 ms


Lana
From: Félix GERZAGUET [mailto:felix.gerzag...@gmail.com]
Sent: 27 July 2016 11:16
To: Abadie Lana
Cc: Martín Marqués; Tom Lane; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow query (3-4mn) on a table with 25millions rows

Hello Lana,

On Wed, Jul 27, 2016 at 8:03 AM, Abadie Lana 
<lana.aba...@iter.org<mailto:lana.aba...@iter.org>> wrote:
Here the result of explain (analyse, buffer). Thanks for your help and let me 
know if you need more information.

I noticed 3 things in your query:
1. In the second part (after the except), the 2 tables utva and utv are not 
joined against the others table. Is there a missing join somewhere ?

Let that snipset:

select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id<http://t.id>=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id<http://vat.id>=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select 
e.name<http://e.name>
                                                             from 
functionalvariables e
                                                                , 
usertemplatevariable ut
                                                            where 
e.usertemplatevar_id=ut.id<http://ut.id>
                                                              and 
ut.usertempl_id=15
                                                           )
be called A
Let that snipset:

select *
  from usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id<http://utv.id>=utva.usertempvariable_fk
    and utv.usertempl_id=15
be called B
Then you query is:
A
except
A CROSS JOIN B
If B is not the empty set, than the above query is guaranteed to always have 0 
row.

2. Assuming your query is right (even if I failed to understand its point), we 
could only do the A snipset once instead of twice using a with clause as in:

with filtered_s as (
select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id<http://t.id>=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id<http://vat.id>=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select 
e.name<http://e.name>
                                                             from 
functionalvariables e
                                                                , 
usertemplatevariable ut
                                                            where 
e.usertemplatevar_id=ut.id<http://ut.id>
                                                              and 
ut.usertempl_id=15
                                                           )
)
select s.attvalue
  from filtered_s s
except
select s.attvalue
  from filtered_s s
     , usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id<http://utv.id>=utva.usertempvariable_fk
    and utv.usertempl_id=15
;
This rewritten query should run about 2x. faster.
3. The planner believe that the e.name<http://e.name> subselect will give 4926 
rows (instead of 16 in reality), due to this wrong estimate it will consider 
the vat_funcvaratt_multi_idx index as not usefull. I don't know how to give the 
planner more accurate info ...

--
Félix

Reply via email to