[ http://issues.apache.org/jira/browse/DERBY-1007?page=all ]
A B updated DERBY-1007:
-----------------------
Other Info: [Patch available]
> Optimizer can return incorrect "best cost" estimates with nested subqueries,
> which leads to generation of sub-optimal plans.
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-1007
> URL: http://issues.apache.org/jira/browse/DERBY-1007
> Project: Derby
> Type: Bug
> Components: Performance
> Versions: 10.2.0.0
> Reporter: A B
> Assignee: A B
> Priority: Minor
> Attachments: d1007_v1.patch, d1007_v1.stat
>
> When optimizing a query that has nested subqueries in it, it's possible that
> the optimizer for the subqueries will return cost estimates that are lower
> than what they were actually calculated to be. The result is that the outer
> query can pick an access plan that is sub-optimal.
> Filing this jira issue based on the thread "[OPTIMIZER] OptimizerImpl "best
> plans" for subqueries?" from derby-dev. Description that follows is pasted
> from that email:
> http://article.gmane.org/gmane.comp.apache.db.derby.devel/14836
> Following example of what I saw when tracing through the code demonstrates
> the problem.
> select x1.j, x2.b from
> (select distinct i,j from t1) x1,
> (select distinct a,b from t3) x2
> where x1.i = x2.a;
> During optimization of this query we will create three instancesof
> OptimizerImpl:
> OI_0: For "select x1.j, x2.b from x1, x2 where x1.i = x2.a"
> OI_1: For "select distinct i,j from t1"
> OI_2: For "select distinct a,b from t3"
> Query ran against a clean codeline when T1 had 1 row and T3 had 50,000.
> -- Top-level call is made to the optimize() method of the
> outermost SelectNode, which creates OI_0.
> -- OI_0: picks join order {X1, X2} and calls X1.optimizeIt()
> -- X1: *creates* OI_1 and makes calls to optimize it.
> -- OI_1: picks join order {T1} and calls T1.optimizeIt()
> -- T1: returns a cost of 20.
> -- OI_1: saves 20 as new best cost and tells T1 to save it.
> -- X1: calls OI_1.getOptimizedCost(), which returns 20. X1
> then returns 20 to OI_0.
> -- OI_0: calls X2.optimizeIt()
> -- X2: *creates* OI_2 and makes calls to optimize it.
> -- OI_2: picks join order {T3} and calls T3.optimizeIt()
> -- T3: returns a cost of 64700.
> -- OI_2: saves 64700 as new best cost and tells T3 to save it.
> -- X2: calls OI_2.getOptimizedCost(), which returns 64700. X2
> then returns 64700 to OI_0.
> -- OI_0: saves 20 + 64700 = 64720 as new best cost and tells
> X1 to save 20 and X2 to save 64700.
> -- OI_0: picks join order {X2, X1} and calls X2.optimizeIt()
> -- X2: *fetches* OI_2 and makes calls to optimize it.
> -- OI_2: picks join order {T3} and calls T3.optimizeIt()
> -- T3: returns a cost of 10783.
> -- OI_2: saves 10783 as new best cost and tells T3 to save it.
> -- X2: calls OI_2.getOptimizedCost(), which returns 10783. X2
> then returns 10783 to OI_0.
> -- OI_0: calls X1.optimizeIt()
> -- X1: *fetches* OI_1 and makes calls to optimize it.
> -- OI_1: picks join order {T1} and calls T1.optimizeIt()
> -- T1: returns a cost of *1 MILLION!*.
> -- OI_1: rejects new cost (1 mil > 20) and does nothing.
> -- X1: calls OI_1.getOptimizedCost(), which returns *20*. X1
> then returns 20 to OI_0...this seems WRONG!
> -- OI_0: saves 10783 + 20 = 10803 as new best cost and tells
> X2 to save 10783 and X1 to save 20.
> So in the end, the outer-most OptimizerImpl chooses join order {X2, X1}
> because it thought the cost of this join order was only 10783, which is
> better than 64720. However, the _actual_ cost of the join order was really
> estimated at 1 million--so the outer OptimizerImpl chose (and will generate)
> a plan that, according to the estimates, was (hugely) sub-optimal.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira