[PERFORM] multi-layered view join performance oddities

2005-10-30 Thread Svenne Krap

Hi there.

I have tried to implement the layered views as suggested earlier on one 
of the simplest queries (just to get a feel for it). And there seems to 
be something odd going on.


Attached are all the statements needed to see how the database is made, 
the contents of postgresql.conf, and two explain analyzes:


The machine is a single-CPU Xeon with 2 GB of memory and two SCSI drives 
in a mirror (to be extended to six within a few weeks), running 
8.1beta3.  The whole database was vacuum analyzed just before the 
explain analyzes.


I have spent a few hours fiddling around with its performance, but seem 
to be getting nowhere - I might have become snow-blind and missed 
something obvious, though.


There are a few things that strike me:
- the base view (ord_result_pct) is reasonably fast (41 ms) - it does a 
lot of seq scans, but right now there is not enough data to do otherwise
- the pretty version (for output) is 17.5 times slower (722 ms), even 
though it just joins against three tiny tables (< 100 rows each), and 
the plan looks very different
- the slow query (the _pretty one) has a lower expected cost than the 
other (338 vs. 487 planner cost units); this looks like some cost 
parameters need tweaking, but I cannot figure out which - see the 
sketch after this list
- the top nested loop seems to eat most of the time; I have a little 
trouble seeing what that nested loop is doing there anyway
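
In case it really is the cost parameters, these are the session-level 
knobs one would typically experiment with first - the values below are 
purely illustrative, not recommendations, and as the replies show the 
real culprit turned out to be row-count misestimation rather than costs:

set random_page_cost = 2.0;        -- default is 4.0; lower favours index scans
set effective_cache_size = 65536;  -- unit is 8kB pages; 65536 pages = 512MB of the 2GB RAM
explain analyze select * from ord_result_pct_pretty;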


Thanks in advance

Svenne
create table nb_property_type(  
id integer not null,
description_dk varchar not null,
description_us varchar not null,
primary key(id)
);

-- 8 rows in nb_property_type, not growing


create table groups (
id int4 not null default nextval('role_id_seq'),
groupname varchar not null,
is_home_group bool not null default 'f'::bool,
valid bool not null default 't'::bool,
created_at timestamp not null default current_timestamp,
changed_at timestamp,
stopped_at timestamp,
primary key(id));

-- at the moment approx. 20 rows, expecting a few hundred when going online

create table ord_dataset(
id serial,
first_observation date not null,
last_observation date,
is_mainline bool not null default 't',
is_visible bool not null default 'f',
description_dk varchar,
description_us varchar,
created_by int4 not null references users,
created_at timestamp not null default current_timestamp,
primary key(id)
);

create unique index ord_dataset_fo_idx on ord_dataset(first_observation) where 
is_mainline = 't';

-- approx. 35 rows, growing 4 rows each year


create table ord_entrydata_current(
dataset_id integer not null references ord_dataset,
institut integer not null references groups,
nb_property_type_id int4 not null references nb_property_type,
amount int8 not null
);

create index ord_ed_cur_dataset_id on ord_entrydata_current(dataset_id);
create index ord_ed_cur_institut on ord_entrydata_current(institut);
create index ord_ed_cur_propertytype on 
ord_entrydata_current(nb_property_type_id);


-- filled by a trigger, approx. 3,000 rows, grows approx. 250 rows each year

create view ord_property_type_sums as
 SELECT ord_entrydata_current.dataset_id, 0 AS nb_property_type_id, 
ord_entrydata_current.institut, sum(ord_entrydata_current.amount) AS amount
   FROM ord_entrydata_current
  GROUP BY ord_entrydata_current.dataset_id, ord_entrydata_current.institut;

create view ord_property_type_all as
 SELECT ord_property_type_sums.dataset_id, 
ord_property_type_sums.nb_property_type_id, ord_property_type_sums.institut, 
ord_property_type_sums.amount
   FROM ord_property_type_sums
UNION ALL
 SELECT ord_entrydata_current.dataset_id, 
ord_entrydata_current.nb_property_type_id, ord_entrydata_current.institut, 
ord_entrydata_current.amount
   FROM ord_entrydata_current;

create view ord_institutes_sum as
 SELECT ord_property_type_all.dataset_id, 
ord_property_type_all.nb_property_type_id, 0 AS institut, 
sum(ord_property_type_all.amount) AS amount
   FROM ord_property_type_all
 GROUP BY ord_property_type_all.dataset_id, 
ord_property_type_all.nb_property_type_id;

create view ord_result_pct as
 SELECT t1.dataset_id, t1.nb_property_type_id, t1.institut, t1.amount / 
t2.amount * 100::numeric AS pct
   FROM ord_property_type_all t1, ord_institutes_sum t2
  WHERE t1.dataset_id = t2.dataset_id AND t1.nb_property_type_id = 
t2.nb_property_type_id;

create view ord_result_pct_pretty as
select od.id, od.first_observation, od.description_dk as dsd_dk, 
od.description_us as dsd_us ,g.groupname,orp.institut, orp.nb_property_type_id, 
npt.description_dk as pd_dk, npt.description_us as pd_us, pct   from 
ord_result_pct orp, ord_dataset od, nb_property_type npt, groups g
where orp.dataset_id = od.id and orp.institut = g.id and 
orp.nb_property_type_id = npt.id and od.is_visible = 't'::bool;
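
-- For reference, these are presumably the two statements behind the
-- attached explain analyzes (the second appears verbatim in a follow-up
-- below; the first is an assumption):

explain analyze select * from ord_result_pct;
explain analyze select * from ord_result_pct_pretty;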


-- contents of postgresql.conf 


listen_addresses = 'localhost'
port = 5432

Re: [PERFORM] multi-layered view join performance oddities

2005-10-30 Thread Steinar H. Gunderson
On Sun, Oct 30, 2005 at 06:16:04PM +0100, Svenne Krap wrote:
 Nested Loop  (cost=223.09..338.61 rows=1 width=174) (actual time=20.213..721.361 rows=2250 loops=1)
   Join Filter: (("outer".dataset_id = "inner".dataset_id) AND ("outer".nb_property_type_id = "inner".nb_property_type_id))
   ->  Hash Join  (cost=58.04..164.26 rows=1 width=150) (actual time=5.510..22.088 rows=2250 loops=1)

There's horrible misestimation here. It expects one row and thus starts a
nested loop, but gets 2250. No wonder it's slow :-)

The misestimation can be traced all the way down here:

   Hash Cond: ("outer".institut = "inner".id)
   ->  Hash Join  (cost=56.88..163.00 rows=16 width=137) (actual time=5.473..19.165 rows=2250 loops=1)
         Hash Cond: ("outer".dataset_id = "inner".id)
         ->  Hash Join  (cost=55.48..160.95 rows=99 width=101) (actual time=5.412..16.264 rows=2250 loops=1)

where the planner misestimates the selectivity of your join (it estimates 99
rows, and there are 2250).

I've had problems joining with Append nodes in the past, and solved the
problem by moving the UNION ALL a bit out, but I'm not sure if it's a very
good general solution, or a solution to your problems here.
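
A guess at the shape of that rewrite against this schema (untested, and
the view name ord_result_pct_alt is made up): distribute the join over
the two arms of the UNION ALL, so the planner gets two ordinary joins to
estimate instead of a join against an Append node:

create view ord_result_pct_alt as  -- hypothetical name
 SELECT s.dataset_id, s.nb_property_type_id, s.institut,
        s.amount / t.amount * 100::numeric AS pct
   FROM ord_property_type_sums s
   JOIN ord_institutes_sum t
     ON t.dataset_id = s.dataset_id
    AND t.nb_property_type_id = s.nb_property_type_id
UNION ALL
 SELECT c.dataset_id, c.nb_property_type_id, c.institut,
        c.amount / t.amount * 100::numeric AS pct
   FROM ord_entrydata_current c
   JOIN ord_institutes_sum t
     ON t.dataset_id = c.dataset_id
    AND t.nb_property_type_id = c.nb_property_type_id;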

If all else fails, you could set enable_nestloop=false, but that is not a
good idea in the long run, I'd guess -- it's much better to make sure the
planner has good estimates and let it make the correct decisions from there.
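
If it is the estimates you want to improve, raising the per-column
statistics targets and re-analyzing is the usual first step (the targets
below are illustrative; note that if the misestimate comes from the join
structure itself rather than from column statistics, this will not help):

alter table ord_entrydata_current alter column dataset_id set statistics 100;
alter table ord_entrydata_current alter column nb_property_type_id set statistics 100;
alter table ord_entrydata_current alter column institut set statistics 100;
analyze ord_entrydata_current;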

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] multi-layered view join performance oddities

2005-10-30 Thread Tom Lane
Svenne Krap <[EMAIL PROTECTED]> writes:
 create view ord_institutes_sum as
  SELECT ord_property_type_all.dataset_id, 
 ord_property_type_all.nb_property_type_id, 0 AS institut, 
 sum(ord_property_type_all.amount) AS amount
FROM ord_property_type_all
  GROUP BY ord_property_type_all.dataset_id, 
 ord_property_type_all.nb_property_type_id;

 create view ord_result_pct as
  SELECT t1.dataset_id, t1.nb_property_type_id, t1.institut, t1.amount / 
 t2.amount * 100::numeric AS pct
FROM ord_property_type_all t1, ord_institutes_sum t2
   WHERE t1.dataset_id = t2.dataset_id AND t1.nb_property_type_id = 
 t2.nb_property_type_id;

This is really pretty horrid code: you're requesting double evaluation
of the ord_property_type_all view, and then joining the two calculations
to each other.  No, the planner will not detect how silly this is :-(,
nor will it realize that there's guaranteed to be a match for every row
--- I believe the latter is the reason for the serious misestimation
that Steinar noted.  The misestimation doesn't hurt particularly when
evaluating ord_result_pct by itself, because there are no higher-level
decisions to make ... but it hurts a lot when you join ord_result_pct to
some other stuff.

It seems like there must be a way to get the percentage amounts with
only one evaluation of ord_property_type_all, but I'm not seeing it
right offhand.

regards, tom lane



Re: [PERFORM] multi-layered view join performance oddities

2005-10-30 Thread Svenne Krap

Hi.

Your suggestion of disabling the nested loop really worked well:

rkr=# set enable_nestloop=false;
SET
rkr=# explain analyze select * from ord_result_pct_pretty ;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=230.06..337.49 rows=1 width=174) (actual time=21.893..42.356 rows=2250 loops=1)
   Hash Cond: (("outer".dataset_id = "inner".dataset_id) AND ("outer".nb_property_type_id = "inner".nb_property_type_id))
   ->  Hash Join  (cost=56.94..164.10 rows=26 width=93) (actual time=5.073..17.906 rows=2532 loops=1)
         Hash Cond: ("outer".dataset_id = "inner".id)
         ->  Hash Join  (cost=55.54..161.63 rows=161 width=57) (actual time=4.996..14.775 rows=2532 loops=1)
               Hash Cond: ("outer".institut = "inner".id)
               ->  Append  (cost=54.38..121.72 rows=2476 width=44) (actual time=4.964..11.827 rows=2532 loops=1)
                     ->  HashAggregate  (cost=54.38..57.20 rows=226 width=16) (actual time=4.964..5.174 rows=282 loops=1)
                           ->  Seq Scan on ord_entrydata_current  (cost=0.00..37.50 rows=2250 width=16) (actual time=0.002..1.305 rows=2250 loops=1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=0.00..60.00 rows=2250 width=20) (actual time=0.009..4.948 rows=2250 loops=1)
                           ->  Seq Scan on ord_entrydata_current  (cost=0.00..37.50 rows=2250 width=20) (actual time=0.003..2.098 rows=2250 loops=1)
               ->  Hash  (cost=1.13..1.13 rows=13 width=17) (actual time=0.022..0.022 rows=13 loops=1)
                     ->  Seq Scan on groups g  (cost=0.00..1.13 rows=13 width=17) (actual time=0.003..0.013 rows=13 loops=1)
         ->  Hash  (cost=1.32..1.32 rows=32 width=36) (actual time=0.070..0.070 rows=32 loops=1)
               ->  Seq Scan on ord_dataset od  (cost=0.00..1.32 rows=32 width=36) (actual time=0.009..0.043 rows=32 loops=1)
                     Filter: is_visible
   ->  Hash  (cost=173.07..173.07 rows=10 width=97) (actual time=15.472..15.472 rows=256 loops=1)
         ->  Hash Join  (cost=166.15..173.07 rows=10 width=97) (actual time=14.666..15.203 rows=256 loops=1)
               Hash Cond: ("outer".nb_property_type_id = "inner".id)
               ->  HashAggregate  (cost=165.05..168.15 rows=248 width=40) (actual time=14.619..14.849 rows=288 loops=1)
                     ->  Append  (cost=54.38..121.72 rows=2476 width=44) (actual time=5.012..11.130 rows=2532 loops=1)
                           ->  HashAggregate  (cost=54.38..57.20 rows=226 width=16) (actual time=5.011..5.222 rows=282 loops=1)
                                 ->  Seq Scan on ord_entrydata_current  (cost=0.00..37.50 rows=2250 width=16) (actual time=0.001..1.261 rows=2250 loops=1)
                           ->  Subquery Scan "*SELECT* 2"  (cost=0.00..60.00 rows=2250 width=20) (actual time=0.010..4.308 rows=2250 loops=1)
                                 ->  Seq Scan on ord_entrydata_current  (cost=0.00..37.50 rows=2250 width=20) (actual time=0.002..1.694 rows=2250 loops=1)
               ->  Hash  (cost=1.08..1.08 rows=8 width=57) (actual time=0.026..0.026 rows=8 loops=1)
                     ->  Seq Scan on nb_property_type npt  (cost=0.00..1.08 rows=8 width=57) (actual time=0.004..0.019 rows=8 loops=1)
 Total runtime: 43.297 ms
(28 rows)

Now the whole question becomes: how do I get the planner to make a
better estimate of the number of returned rows?

I am not sure I can follow your moving-the-UNION-ALL-further-out
advice, as I see no other place to put the union of the two datasets.

Maybe one of the core devs knows where to fiddle :)

Svenne

Re: [PERFORM] multi-layered view join performance oddities

2005-10-30 Thread Svenne Krap

Tom Lane wrote:

  This is really pretty horrid code: you're requesting double evaluation
  of the ord_property_type_all view, and then joining the two calculations
  to each other.  No, the planner will not detect how silly this is :-(,
  nor will it realize that there's guaranteed to be a match for every row
  --- I believe the latter is the reason for the serious misestimation
  that Steinar noted.  The misestimation doesn't hurt particularly when
  evaluating ord_result_pct by itself, because there are no higher-level
  decisions to make ... but it hurts a lot when you join ord_result_pct to
  some other stuff.

I don't really see how this query is horrid from a user perspective;
this is exactly the way the percentage has to be calculated from a
"philosophical" standpoint (performance considerations left aside).
This is very bad news for me, as most of the other (much larger)
queries have the same issue: the views are used multiple times to get
slightly different data, which then has to be joined (also more than
two times, as in this case).

I think it has to run multiple times because it returns two different
types of data.


  It seems like there must be a way to get the percentage amounts with
  only one evaluation of ord_property_type_all, but I'm not seeing it
  right offhand.

I will think about how to remove the second evaluation of the view in
question; if anyone knows how, a hint would be very appreciated :)
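
One formulation that would do it in a single pass is a window function,
but those are SQL:2003 and PostgreSQL does not have them as of 8.1, so
this is only a sketch of the idea, nothing runnable on this installation:

create or replace view ord_result_pct as
 SELECT dataset_id, nb_property_type_id, institut,
        amount / sum(amount) OVER (PARTITION BY dataset_id, nb_property_type_id)
               * 100::numeric AS pct
   FROM ord_property_type_all;
-- one scan of ord_property_type_all; the OVER clause computes the same
-- per-(dataset_id, nb_property_type_id) total that ord_institutes_sum supplies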

I could of course go the "materialized view" way, but would really
prefer not to.
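
For reference, that fallback would look something like a plain summary
table (the name is made up) refreshed by hand or by trigger, which
ord_result_pct would then join instead of ord_institutes_sum:

create table ord_institutes_sum_mat (   -- hypothetical name
  dataset_id          integer not null,
  nb_property_type_id integer not null,
  amount              numeric not null,
  primary key (dataset_id, nb_property_type_id)
);

-- refresh after data changes (or maintain it incrementally by trigger):
truncate ord_institutes_sum_mat;
insert into ord_institutes_sum_mat (dataset_id, nb_property_type_id, amount)
 select dataset_id, nb_property_type_id, sum(amount)
   from ord_property_type_all
  group by dataset_id, nb_property_type_id;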

Svenne