On Thu, 2004-11-18 at 02:08, David Brown wrote:
> Statistics were run on each table before query execution. The random page
> cost was lowered to 2, but as you can see, the estimated costs are wild
> As a comparison, MS SQL Server took less than 15 seconds, or 4 times faster.
> MySQL (InnoDB) took 2 seconds, which is 30 times faster.
> The query looks straightforward to me (it might be clearer with a subselect),
> so what on earth is wrong?
The query is, as you say, straightforward.
You are clearly working with a query that is on the very edge of the
decision between using an index or not.
The main issue is that PostgreSQL's default histogram statistics setting
is lower than other RDBMS. This means that it is less able to
discriminate between cases such as yours that are close to the edge.
This is a trade-off between run-time of the ANALYZE command and the
benefit it produces. As Joshua suggests, increasing the statistics
target for this table will likely allow the optimizer to correctly
determine the selectivity of the index and take the right path.
If this is a general RDBMS comparison, you may wish to extend the
system's default_statistics_target = 80 or at least > 10.
To improve this query, you may wish to extend the table's statistics
ALTER TABLE "drinv"
ALTER COLUMN OrderDate SET STATISTICS 100;
which should allow the planner to more accurately estimate statistics
and thereby select an index, if appropriate.
The doco has recently been changed with regard to effective_cache_size;
you don't mention what beta release level you're using. That is the only
planner parameter that takes cache size into account, so any other
changes would certainly have zero effect on this *plan* though might
still benefit execution time.
Please post EXPLAIN ANALYZE output for any further questions.
Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match