On 8 Oct 2004 at 16:04, Tom Lane wrote:

> "Gary Doades" <[EMAIL PROTECTED]> writes:
> > If I remove the redundant clauses, the planner now estimates 1000 rows 
> > returned from the table, not unreasonable since it has no statistics. But 
> > *why* in that case, with *more* estimated rows does it choose to 
> > materialize that table (26 rows) 573416 times!!!
> It isn't.  It's materializing that once and scanning it 573416 times,
> once for each row in the outer relation.  And this is not a bad plan
> given the estimates.  If it had stuck to what you call the good plan,
> and there *had* been 1000 rows in the temp table, that plan would have
> run 1000 times longer than it did.
> As a general rule, if your complaint is that you get a bad plan for an
> unanalyzed table, the response is going to be "so analyze the table".

The problem in this case is that if I *do* analyse the table I *always* get the bad plan. 
Bad in this case meaning the query takes a lot longer. I'm still not sure why it can't 
choose the better plan by just reading the 26 rows once and index scanning the 
booking_plan table 26 times (as in the "good" plan).

OK, with a 1000 row estimate I can see that index scanning into the booking_plan 
table 1000 times would take some time, but even if the planner estimates 5 rows it 
produces the same slow query.

If I analyze the table it then knows there are 26 rows and therefore always chooses 
the slow plan.

This is why I am not analyzing this table: to fool the planner into thinking there is 
one row and produce a much faster access plan. Not ideal, I know.
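For the record, a less fragile workaround than leaving the table unanalyzed might be to keep the statistics accurate and steer the planner for just this query. This is only a sketch, assuming the indexes used by the "good" plan exist; note that enable_seqscan affects every scan in the query, not just booking_plan, so it needs testing:

```sql
-- Sketch of a per-transaction workaround (table names from the query below).
BEGIN;
ANALYZE t_search_reqt_dates;        -- keep the 26-row estimate accurate
SET LOCAL enable_seqscan = off;     -- discourage the seq scan; reverts at COMMIT
-- ... run the SELECT here ...
COMMIT;
```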

Just using one redundant clause I now get:

select distinct b.staff_id
from staff_booking b, booking_plan bp, t_search_reqt_dates rd
where b.booking_id = bp.booking_id
and rd.datetime_from <= bp.datetime_to and rd.datetime_to >= bp.datetime_from
AND bp.booking_date between rd.reqt_date-1 and rd.reqt_date+1
and rd.search_id = 13

Unique  (cost=50885.97..50921.37 rows=462 width=4) (actual time=35231.000..35241.000 rows=110 loops=1)
  ->  Sort  (cost=50885.97..50903.67 rows=35397 width=4) (actual time=35231.000..35241.000 rows=2173 loops=1)
        Sort Key: b.staff_id
        ->  Hash Join  (cost=44951.32..50351.07 rows=35397 width=4) (actual time=34530.000..35231.000 rows=2173 loops=1)
              Hash Cond: ("outer".booking_id = "inner".booking_id)
              ->  Seq Scan on staff_booking b  (cost=0.00..4233.39 rows=197877 (actual time=0.000..351.000 rows=197877 loops=1)
              ->  Hash  (cost=44933.62..44933.62 rows=35397 width=4) (actual time=34530.000..34530.000 rows=0 loops=1)
                    ->  Nested Loop  (cost=15.50..44933.62 rows=35397 width=4) (actual time=8342.000..34520.000 rows=2173 loops=1)
                          Join Filter: (("inner".datetime_from <= "outer".datetime_to) AND ("inner".datetime_to >= "outer".datetime_from) AND ("outer".booking_date >= ("inner".reqt_date - 1)) AND ("outer".booking_date <= ("inner".reqt_date + 1)))
                          ->  Seq Scan on booking_plan bp  (cost=0.00..7646.08 width=24) (actual time=0.000..1053.000 rows=573416 loops=1)
                          ->  Materialize  (cost=15.50..15.53 rows=5 width=20) (actual time=0.001..0.019 rows=26 loops=573416)
                                ->  Seq Scan on t_search_reqt_dates rd  (cost=0.00..15.50 rows=5 width=20) (actual time=0.000..0.000 rows=26 loops=1)
                                      Filter: (search_id = 13)
Total runtime: 35241.000 ms
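The actual figures in the plan do quantify the pain: the 26 materialized rows are rescanned once per booking_plan row, so the join filter runs roughly 573416 * 26 times. A quick sanity check on those numbers (plain arithmetic, no tables needed; the ~0.019 ms per rescan is taken from the Materialize node's actual time):

```sql
SELECT 573416 * 26    AS join_filter_evaluations,  -- ~14.9 million
       573416 * 0.019 AS rescan_time_ms;           -- ~10.9 s of the 35 s total
```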

If this is the only answer for now, then fair enough, I will just have to do more 
workarounds like this.

