[ http://issues.apache.org/jira/browse/DERBY-1905?page=all ]
A B updated DERBY-1905:
-----------------------
Attachment: d1905.sql
Attaching file referenced in the description for this issue.
> Optimizer cost estimates for subqueries are way (way) too high.
> ---------------------------------------------------------------
>
> Key: DERBY-1905
> URL: http://issues.apache.org/jira/browse/DERBY-1905
> Project: Derby
> Issue Type: Bug
> Components: Performance, SQL
> Affects Versions: 10.2.1.5, 10.3.0.0, 10.1.3.2
> Reporter: A B
> Attachments: d1905.sql
>
>
> If I run the attached repro (pulled from DERBY-1866) with
> derby.optimizer.noTimeout=true (meaning the optimizer should take all the
> time it needs to try out all possible plans within the restrictions of the
> optimizer overrides), the estimated cost shown in derby.log (if I log the
> query plan) is over 600k and the estimated row count is over 520k.
> If, as the code in OptimizerImpl seems to expect, the unit for a cost
> estimate is milliseconds, then the optimize here is guessing that the query
> will take over 10 minutes to execute and will return a half-million rows.
> But in truth the *combined* time for compilation AND execution is just a
> couple of seconds, and only 15 rows are returned.
> That suggests to me a rather serious problem with the optimizer cost
> estimates for subqueries.
> Among other things this can have a major impact on the optimizer's timeout
> mechanism for very deeply-nested queries. The optimizer will continue to try
> out different combinations of indexes/joinOrders/joinStrategies until it
> either exhausts them all or until the number of milliseconds spent optimizing
> is greater than the "best cost" estimate so far. In the case of the repro for
> this issue, the optimizer quickly exhausts all of the options and so it
> finishes in a fair amount of time.
> But in larger queries where there are far more combinations to try (see esp.
> queries attached to DERBY-1205, DERBY-1777), these severly inflated cost
> estimates get very large very quickly (sometimes even reaching infinity--see
> DERBY-1259, DERBY-1260) and so the optimizer just keeps on optimizing and
> never times out. The result is that for queries like those in DERBY-1777,
> the optimizer can spend literally hours optimizing a query which then
> executes in a matter of seconds.
> I'm still investigating this, but preliminary examination suggests that part
> of the problem is in some way related to the treatment of "outer costs" by
> the optimizer--and in particular, it looks like the optimizer is perhaps too
> aggressive in multiplying cost estimates by row counts pulled from "outer
> costs". That's just my first guess at the problem, though; there could of
> course be far more to it than that...
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira