[ http://issues.apache.org/jira/browse/DERBY-2130?page=comments#action_12455141 ]

A B commented on DERBY-2130:
----------------------------

Everything written in Bryan's preceding comment sounds correct to me, so I won't dwell. As for the specific questions:

> I've seen almost no discussion of the first point (that the cheapest actual plan
> should have the cheapest estimated cost); does this mean that we are pretty
> confident about this aspect of cost estimation at this point? That is, the cost
> estimating may be off, but it seems to be off for all queries equally?

This has been my general assumption about the code, yes--at least, after DERBY-1007 was resolved. It seems to me that at some point in the last year I found a scenario where the optimizer's "best" cost estimate did not (appear to) correspond to the best query plan, but I don't remember the details and it may have ended up being correct after all.

In any event, in all of the discussion that I've had/written, my general assumption has been that "Yes", the *relative* accuracy of the cost estimates is correct--i.e. that better plans have lower cost estimates. Note, though, that this is just an assumption of mine which I have not bothered trying to debunk; if you find info to the contrary, please say so!

> DERBY-1907 is certainly relevant here; are there other issues logged like this?

None come to mind, no. But I admit that's an answer based strictly on memory; I didn't actually do any searching...

> I guess I'm wondering (out loud) whether it is worth investigating a simple tuning
> of the cost estimation algorithm. If the optimizer was *much* faster at generating
> and estimating possible plans, wouldn't that be a big benefit?

Yes, definitely!

> Also, how confident are we that permutation jumping (as described in
> http://wiki.apache.org/db-derby/JoinOrderPermutations) is working properly?

I had to laugh out loud when I read this question. It sounds to me like the kind of question someone asks when they've found a somewhat serious bug but don't want to rock the boat ;)

So far as I know, the jumping code is working properly.
But if you told me there was a problem with the code, I think I'd assume you were right. Is that the case?

> Optimizer performance slowdown from 10.1 to 10.2
> ------------------------------------------------
>
>                 Key: DERBY-2130
>                 URL: http://issues.apache.org/jira/browse/DERBY-2130
>             Project: Derby
>          Issue Type: Bug
>          Components: Performance, SQL
>    Affects Versions: 10.2.1.6, 10.3.0.0, 10.1.3.1
>            Reporter: Bryan Pendleton
>         Attachments: repro.sql
>
>
> Attached is 'repro.sql', an IJ script which demonstrates what I
> believe to be a serious performance issue in the Optimizer.
> I have run this script in a number of configurations:
>  - 10.1.2.1: the script runs successfully. The 'prepare' statement
>    takes about 90 seconds, on a fairly powerful Windows machine
>  - 10.1.3.1: the script produces a NPE. I believe this is DERBY-1777
>  - 10.2.1.8/trunk: the script runs successfully. The 'prepare' statement
>    often takes about 220 seconds, on the same Windows machine
>    Intermittently, on 10.2 and on the trunk, the prepare statement takes
>    15+ minutes. I cannot reliably reproduce this; I run the same script
>    several times in a row and I cannot predict whether it will take 220
>    seconds or whether it will take 15+ minutes.
> I am quite motivated to work on this problem, as this is blocking me from
> using Derby for a project that I'm quite keen on, but I need some
> suggestions and ideas about how to attack it. From my perspective
> there are 3 primary topics:
> 1) Why did optimizer performance for this query degrade so significantly
> from 10.1.2.1 to 10.2? The optimizer seems to be at least 2.5 times slower,
> for this particular query at least, in 10.2. Sometimes it is 10x slower.
> 2) What is the source of the non-determinism? Why does the optimizer
> often take 4 minutes to optimize this query on the trunk, but sometimes
> take 15+ minutes? I don't believe that I'm changing anything from
> run to run.
> 3) Can we improve the optimizer performance even beyond what it was
> for 10.1.2? I realize that this is an ugly query, but I was hoping to
> see an optimization time of 5-10 seconds, not 90 seconds (and certainly
> not 220 seconds).
> I have attempted to start answering some of these questions, with
> limited success. Here is some of what I think I've discovered so far:
>  - the optimizer changes in 10.2 seem to have given the optimizer many
>    more choices of possible query plans to consider. I think this means
>    that, if the optimizer does not time out, it will spend substantially
>    more time optimizing because there are more choices to evaluate. Does
>    this by itself mean that the optimizer will take 2.5 times longer in
>    10.2 than it did in 10.1?
>  - something about this query seems to make the costing mechanism go
>    haywire, and produce extreme costs. While stepping through the
>    optimization of this query in the debugger I have seen it compute
>    costs like 1e63 and 1e200. This might be very closely related to
>    DERBY-1905, although I don't think I'm doing any subqueries here.
>    But maybe I'm misunderstanding the term "subquery" in DERBY-1905.
>    At any rate, due to the enormous estimated costs, timeout does not
>    occur.
>  - the WHERE clause in this query is converted during compilation to
>    an equivalent IN clause, I believe, which then causes me to run into
>    a number of the problems described in DERBY-47 and DERBY-713.
>    Specifically, rather than constructing a plan which involves 4
>    index probes for the 4 WHERE clause values, the optimizer decides
>    that an index scan must be performed and that it will have to process
>    the entire index (because the query uses parameter markers, not
>    literal values). So perhaps solving DERBY-47 would help me.
>  - the optimizer in fact comes up with a "decent" query plan quite quickly.
>    I have experimented with placing a hard limit into the optimizer
>    timeout code, so that I can force optimization to stop after an
>    arbitrary fixed period of time. Then I have been able to set that
>    value to as low as 1 second, and the optimizer has produced plans
>    that then execute in a few milliseconds. Of course, I have only tried
>    this with a trivial amount of data in my database, so it's possible
>    that the plan produced by the optimizer after just a second of
>    optimizing is in fact poor, and I'm just not noticing it because my
>    data sizes are so small.
> At this point, what would be really helpful to me would be some suggestions
> about some general approaches or techniques to try to start breaking down
> and analyzing this problem.
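
For illustration only, here is a minimal Java sketch of the two timeout behaviours mentioned in the quoted description: a cost-based timeout that never fires when the estimated cost is something like 1e63, and a hard limit that forces optimization to stop after a fixed period. It assumes the timeout rule is "stop once the time spent optimizing exceeds the best plan's estimated cost", which is what the description implies but does not spell out. The class and method names (OptimizerTimeoutSketch, costBasedTimeout, timeoutWithHardLimit) are hypothetical and are not Derby's actual code.

```java
// Hypothetical sketch; these names are NOT Derby's actual classes or methods.
final class OptimizerTimeoutSketch {

    // Assumed cost-based rule: stop once the time already spent optimizing
    // exceeds the estimated cost of the best plan found so far.
    static boolean costBasedTimeout(long elapsedMillis, double bestEstimatedCost) {
        return elapsedMillis > bestEstimatedCost;
    }

    // Same rule plus a hard cap, similar in spirit to the experiment in the
    // description: stop after hardLimitMillis no matter what the estimate says.
    static boolean timeoutWithHardLimit(long elapsedMillis,
                                        double bestEstimatedCost,
                                        long hardLimitMillis) {
        return elapsedMillis >= hardLimitMillis
                || costBasedTimeout(elapsedMillis, bestEstimatedCost);
    }

    public static void main(String[] args) {
        long fifteenMinutes = 15L * 60 * 1000;
        // With an estimate of 1e63 the cost-based rule does not fire,
        // even after fifteen minutes of optimizing.
        System.out.println(costBasedTimeout(fifteenMinutes, 1e63));
        // A one-second hard limit stops the search regardless of the estimate.
        System.out.println(timeoutWithHardLimit(fifteenMinutes, 1e63, 1000L));
    }
}
```

Under this assumption, any sane estimate (say, a few hundred) lets the cost-based rule end the search quickly, while an extreme estimate effectively disables it. That would be consistent with the observation that timeout does not occur for this query unless a hard limit is added.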
