Re: [PERFORM] severe performance issue with planner

2004-03-28 Thread Tom Lane
Eric Brown [EMAIL PROTECTED] writes:
 Here's the query as I have changed it now:

Now that you've switched to JOIN syntax, you can cut the planning time
to nil by setting join_collapse_limit to 1.  See
http://www.postgresql.org/docs/7.4/static/explicit-joins.html

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] severe performance issue with planner (fwd)

2004-03-17 Thread Tom Lane
Kris Jurka [EMAIL PROTECTED] writes:
 On Thu, 11 Mar 2004, Tom Lane wrote:
 Eric Brown [EMAIL PROTECTED] writes:
 [ planning a 9-table query takes too long ]
 
 See http://www.postgresql.org/docs/7.4/static/explicit-joins.html
 for some useful tips.

 Is this the best answer we've got?  For me with an empty table this query 
 takes 4 seconds to plan, is that the expected planning time?  I know I've 
 got nine table queries that don't take that long.

The problem with this example is that it's a nine-way self-join.
Ordinarily the planner can eliminate many possible join paths at low
levels, because they are more expensive than other available options.
But in this situation all the available options have *exactly the same
cost estimate* because they are all founded on exactly the same statistics.
The planner fails to prune any of them and ends up making a random
choice after examining way too many alternatives.

Maybe we should think about instituting a hard upper limit on the number
of alternatives considered.  But I'm not sure what the consequences of
that would be.  In the meantime, the answer for the OP is to arbitrarily
limit the number of join orders considered, as described in the
above-mentioned web page.  With the given query constraints there's
really only one join order worth thinking about ...

 Setting geqo_threshold less than 9, it takes 1 second to plan.  Does this 
 indicate that geqo_threshold is set too high, or is it a tradeoff between 
 planning time and plan quality?

Selecting the GEQO planner doesn't really matter here, because it has
no better clue about how to choose among a lot of alternatives with
identical cost estimates.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] severe performance issue with planner (fwd)

2004-03-16 Thread Kris Jurka

I sent this message to the list and although it shows up in the archives,
I did not receive a copy of it through the list, so I'm resending as I
suspect others did not see it either.

-- Forwarded message --
Date: Sat, 13 Mar 2004 22:48:01 -0500 (EST)
From: Kris Jurka [EMAIL PROTECTED]
To: Tom Lane [EMAIL PROTECTED]
Cc: Eric Brown [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: [PERFORM] severe performance issue with planner 

On Thu, 11 Mar 2004, Tom Lane wrote:

 Eric Brown [EMAIL PROTECTED] writes:
  [ planning a 9-table query takes too long ]
 
 See http://www.postgresql.org/docs/7.4/static/explicit-joins.html
 for some useful tips.
 

Is this the best answer we've got?  For me with an empty table this query 
takes 4 seconds to plan, is that the expected planning time?  I know I've 
got nine table queries that don't take that long.

Setting geqo_threshold less than 9, it takes 1 second to plan.  Does this 
indicate that geqo_threshold is set too high, or is it a tradeoff between 
planning time and plan quality?  If the planning time is so high because 
the are a large number of possible join orders, should geqo_threhold be 
based on the number of possible plans somehow instead of the number of 
tables involved?

Kris Jurka


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] severe performance issue with planner

2004-03-12 Thread Greg Stark

The other posts about using explicit joins and using stored procedures are
both good points. But I have a few other comments to make:

Eric Brown [EMAIL PROTECTED] writes:

 WHERE
 w0.wid  0 AND
 w0.pinyin = 'zheng4' AND
 w0.def_exists = 't' AND
 w0.sequence = 0 AND
 w1.wid = w0.wid AND
 w1.pinyin LIKE 'fu_' AND
 w1.variant = w0.variant AND
 w1.sequence = (w0.sequence + 1) AND

I'm not sure it'll help the planner, but w0.sequence+1 is always just going to
be 1, and so on with the others. I think the planner might be able to figure
that out but the plan doesn't seem to show it doing so. I'm not sure it would
help the plan though.

Similarly you have w1.wid=w0.wid and w2.wid=w1.wid and w3.wid=w2.wid etc. And
also with the variant column. You might be able to get this planned better
by writing it as a join from w0 to all the others rather than a chain of
w0-w1-w2-... Again I'm not sure; you would have to experiment.


But I wonder if there isn't a way to do this in a single pass using an
aggregate. I'm not sure I understand the schema exactly, but perhaps something
like this? 

select w8.wid,
   w8.variant,
   w8.num_variants,
   sum_text(w8.unicode) as unicodes,
   sum_text(w8.pinyin) as pinyins
  from (
select wid,variant,
  from words 
 where (sequence = 0 and pinyin = 'zheng4')
OR (sequence = 1 and pinyin like 'ji_')
OR (sequence = 2 and pinyin like 'guan_')
OR (sequence = 3 and pinyin like 'kai_')
OR (sequence = 4 and pinyin like 'fang_')
OR (sequence = 5 and pinyin like 'xi_')
OR (sequence = 6 and pinyin like 'tong_')
OR (sequence = 7 and pinyin like 'fu_')
group by wid,variant
having count(*) = 8
   ) as w
  join words as w8 using (wid,variant)

This might be helped by having an index on sequence,pinyin but it might not
even need it.


-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]