Re: [PERFORM] Join optimisation Quandry

2004-01-18 Thread Ceri Storey
On Fri, Jan 16, 2004 at 10:05:54PM -0800, Stephan Szabo wrote:
 Well the plan would seems reasonable to me if there really was only 1 row
 coming from the where conditions on p1.  As a first step, if you raise the
 statistics target (see ALTER TABLE) for prog_start and prog_stop and
 re-analyze the table, do you get a better estimate of the rows from that
 condition? 

Indeed, that helps a fair bit with the estimate.

 Secondly, what do you get if you enable_nestloop=off before
 explain analyzing the query?
See attachment.

Saying that, I've managed to halve the query time by lifting the join of
the title out of the RHS of the left outer join into the top-level of
the FROM clause; which was really the kind of advice I was after. If
this is the wrong list for that kind of thing, please say so.

Again, thanks.
-- 
Ceri Storey [EMAIL PROTECTED]
 Sort  (cost=2144.43..2148.83 rows=1762 width=348) (actual time=993.292..993.329 
rows=18 loops=1)
   Sort Key: public.programme.chan_id
   -  Hash Left Join  (cost=1261.53..2049.43 rows=1762 width=348) (actual 
time=991.208..992.905 rows=18 loops=1)
 Hash Cond: (outer.prog_next = inner.prog_id)
 -  Hash Join  (cost=250.81..596.33 rows=1762 width=271) (actual 
time=24.193..176.767 rows=18 loops=1)
   Hash Cond: (outer.chan_id = inner.chan_id)
   -  Merge Join  (cost=249.56..568.64 rows=1762 width=77) (actual 
time=23.710..171.313 rows=18 loops=1)
 Merge Cond: (outer.title_id = inner.title_id)
 -  Index Scan using $3 on title  (cost=0.00..266.13 rows=10683 
width=27) (actual time=0.047..81.038 rows=10650 loops=1)
 -  Sort  (cost=249.56..253.96 rows=1762 width=54) (actual 
time=23.254..23.298 rows=18 loops=1)
   Sort Key: public.programme.title_id
   -  Index Scan using prog_stop_idx on programme  
(cost=0.00..154.56 rows=1762 width=54) (actual time=0.105..23.107 rows=18 loops=1)
 Index Cond: (prog_stop  '2004-01-14 
19:52:46'::timestamp without time zone)
 Filter: (prog_start = '2004-01-14 
19:52:46'::timestamp without time zone)
   -  Hash  (cost=1.20..1.20 rows=20 width=198) (actual time=0.354..0.354 
rows=0 loops=1)
 -  Seq Scan on channel  (cost=0.00..1.20 rows=20 width=198) 
(actual time=0.062..0.252 rows=20 loops=1)
 -  Hash  (cost=887.51..887.51 rows=8085 width=77) (actual 
time=782.379..782.379 rows=0 loops=1)
   -  Hash Join  (cost=285.54..887.51 rows=8085 width=77) (actual 
time=212.806..703.331 rows=8085 loops=1)
 Hash Cond: (outer.title_id = inner.title_id)
 -  Seq Scan on programme  (cost=0.00..173.85 rows=8085 width=54) 
(actual time=0.023..45.351 rows=8085 loops=1)
 -  Hash  (cost=190.83..190.83 rows=10683 width=27) (actual 
time=212.350..212.350 rows=0 loops=1)
   -  Seq Scan on title  (cost=0.00..190.83 rows=10683 
width=27) (actual time=0.061..131.381 rows=10767 loops=1)
 Total runtime: 995.482 ms
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Join optimisation Quandry

2004-01-18 Thread Ceri Storey
On Fri, Jan 16, 2004 at 10:17:50AM -0800, Stephan Szabo wrote:
 As a starting point, we're likely to need the exact query, explain analyze
 output for the query and version information.

Okay, from top to bottom:

  SELECT p1.chan_name, p1.prog_start AS now_start, p1.prog_id, p1.title_text, 
p2.prog_start AS next_start, p2.prog_id, p2.title_text, 
p1.title_wanted, p2.title_wanted, p1.chan_id
  FROM (programme natural join channel NATURAL JOIN title) AS p1 
LEFT OUTER JOIN (programme NATURAL JOIN title) AS p2 
ON p1.prog_next = p2.prog_id 
  WHERE p1.prog_start = timestamp 'now' AND p1.prog_stop  timestamp 'now'
  ORDER BY p1.chan_id ASC

QUERY PLAN

 Sort  (cost=983.38..983.38 rows=1 width=85) (actual time=10988.525..10988.557 rows=17 
loops=1)
   Sort Key: public.programme.chan_id
   -  Nested Loop Left Join  (cost=289.86..983.37 rows=1 width=85) (actual 
time=631.918..10988.127 rows=17 loops=1)
 Join Filter: (outer.prog_next = inner.prog_id)
 -  Nested Loop  (cost=4.33..9.12 rows=1 width=55) (actual time=4.111..7.960 
rows=17 loops=1)
   -  Hash Join  (cost=4.33..5.64 rows=1 width=37) (actual 
time=4.017..5.182 rows=17 loops=1)
 Hash Cond: (outer.chan_id = inner.chan_id)
 -  Seq Scan on channel  (cost=0.00..1.20 rows=20 width=17) 
(actual time=0.017..0.403 rows=20 loops=1)
 -  Hash  (cost=4.32..4.32 rows=1 width=24) (actual 
time=3.910..3.910 rows=0 loops=1)
   -  Index Scan using prog_stop_idx on programme  
(cost=0.00..4.32 rows=1 width=24) (actual time=0.140..3.809 rows=17 loops=1)
 Index Cond: (prog_stop  '2004-01-17 
01:01:51.786145'::timestamp without time zone)
 Filter: (prog_start = '2004-01-17 
01:01:51.786145'::timestamp without time zone)
   -  Index Scan using $3 on title  (cost=0.00..3.47 rows=1 width=26) 
(actual time=0.078..0.114 rows=1 loops=17)
 Index Cond: (outer.title_id = title.title_id)
 -  Hash Join  (cost=285.54..892.91 rows=6507 width=34) (actual 
time=191.612..586.407 rows=7145 loops=17)
   Hash Cond: (outer.title_id = inner.title_id)
   -  Seq Scan on programme  (cost=0.00..121.07 rows=6507 width=16) 
(actual time=0.036..42.337 rows=7145 loops=17)
   -  Hash  (cost=190.83..190.83 rows=10683 width=26) (actual 
time=190.795..190.795 rows=0 loops=17)
 -  Seq Scan on title  (cost=0.00..190.83 rows=10683 width=26) 
(actual time=0.143..113.223 rows=10715 loops=17)
 Total runtime: 10989.661 ms

And both client and server are:
postgres (PostgreSQL) 7.4.1

Thanks for looking into it.
-- 
Ceri Storey [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Join optimisation Quandry

2004-01-18 Thread Tom Lane
Ceri Storey [EMAIL PROTECTED] writes:
 Although, as I've just found, another bottleneck is the title table.
 PostgreSQL seems to inst on doing a Seq Scan on the entire table. 

-  Seq Scan on tid  (cost=0.00..20.00 rows=1000 width=8) (actual 
 time=0.028..10.457 rows=17 loops=1)

It doesn't look like you've ever vacuumed or analyzed tid --- those
are the default cost and rows estimates.  Although I'm unsure whether
the plan would change much if you had.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]