Re: [PERFORM] Correcting Hash Join Estimates

2005-04-03 Thread Mark Lubratt
On Apr 4, 2005, at 12:54 AM, Tom Lane wrote:
[EMAIL PROTECTED] writes:
I'm trying to optimize a query and the EXPLAIN ANALYZE (see link 
below)
shows that some hash join row estimates are wrong by a factor of 2-3,
and upwards of 7-8.
I doubt that improving those estimates would lead to markedly better
results.  You need to think about improving the view design instead.
What context is this view used in --- do you just do "select * from
view_get_all_user_award2", or are there conditions added to it, or
perhaps it gets joined with other things?
Yes.  I forgot to show how the query is executed...
select * from view_get_all_user_award2 where person_id = 1;

  Do you really need the
DISTINCT constraint?
Yes.
Do you really need the ORDER BY?
The customer wants an initial ordering in the displayed data.
Can you
simplify the WHERE clause at all?
I originally had a bunch of LEFT JOINs.  After reading Tow's "SQL 
Tuning", I was hoping to steer the planner into a more "optimal" plan 
by using a large where clause instead and doing the joins there (I 
think they're called implicit joins).  I was able to shave a couple of 
hundred milliseconds off the execution time by doing this.

Half a second sounds pretty decent to me for a ten-way join with a 
WHERE
clause as unstructured as that.  If you really need it to execute in 
way
less time, you're probably going to have to rethink your data
representation to make the query simpler.

Unfortunately, I'm not sure I can restructure the data.  I did consider 
materialized views.  However, they couldn't be lazy and that seemed 
like a lot of extra work for the backend for very little improvement.

If this sounds like decent performance to you...  I guess I can just 
tell the complainers that it's as good as it's going to get (barring a 
major hardware upgrade...).

Thanks!
Mark
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [PERFORM] Correcting Hash Join Estimates

2005-04-03 Thread Tom Lane
[EMAIL PROTECTED] writes:
> I'm trying to optimize a query and the EXPLAIN ANALYZE (see link below)
> shows that some hash join row estimates are wrong by a factor of 2-3,
> and upwards of 7-8.

I doubt that improving those estimates would lead to markedly better
results.  You need to think about improving the view design instead.
What context is this view used in --- do you just do "select * from
view_get_all_user_award2", or are there conditions added to it, or
perhaps it gets joined with other things?  Do you really need the
DISTINCT constraint?  Do you really need the ORDER BY?  Can you
simplify the WHERE clause at all?

Half a second sounds pretty decent to me for a ten-way join with a WHERE
clause as unstructured as that.  If you really need it to execute in way
less time, you're probably going to have to rethink your data
representation to make the query simpler.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Correcting Hash Join Estimates

2005-04-03 Thread mark . lubratt
Hello!

I posted a similar question to this one about a month ago; but, for some
reason, it never seemed to be broadcast eventhough it ended up in the
archives.  So, since I'm still struggling with this, I thought I'd
repost...

I'm trying to optimize a query and the EXPLAIN ANALYZE (see link below)
shows that some hash join row estimates are wrong by a factor of 2-3,
and upwards of 7-8. There is a corresponding mis-estimation of the
amount of time taken for these steps. The database is vacuum analyzed
nightly by a cron job.  How would I go about tightening up these
errors?  I suspect that I need to SET STATISTIC on some columns, but
how can I tell which columns?

Any help would be appreciated.

WinXP (dual Xeon 1.2GB RAM) PgSQL 8.0.1
Explain Analyze:  
View Definition: 

The largest table contains about 10,000 rows.  All tables have indexes
on their foreign keys.

Thanks!
Mark



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match