Kathey Marsden wrote:
> The test sets derby.optimizer.noTimeout so it should come up with the lower cost plan each time.

This is probably silly, but have you confirmed that the property is in fact being set correctly? If you print out the value of the "noTimeout" field from the OptimizerImpl constructor, does it in fact show "true"?
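For what it's worth, here's a standalone sketch of the kind of sanity check I mean. This is not Derby code (Derby reads its properties through its own property machinery, and the class/method names below are invented for illustration); it only confirms that the system property is visible to the JVM at all:

```java
public class CheckNoTimeout {
    // Returns true only if derby.optimizer.noTimeout is set to "true"
    // (case-insensitive) as a JVM system property.
    static boolean isNoTimeoutSet() {
        return Boolean.parseBoolean(
            System.getProperty("derby.optimizer.noTimeout", "false"));
    }

    public static void main(String[] args) {
        System.out.println("derby.optimizer.noTimeout = " + isNoTimeoutSet());
    }
}
```

If this prints "false" when you expected "true", the property may be getting set in derby.properties (or via the test harness) rather than on the command line, in which case a println inside the OptimizerImpl constructor itself is the more reliable check.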

> My question is where in the code should I look for possible
> problems with plan selection.

Unfortunately there's no single place where these kinds of problems occur. The underlying cause can be in any number of places and may not be easily discernible...

> I can't reproduce in the debugger so will need to just use println's to diagnose the problem.

If you're in a situation where you have to use printlns, you may want to start by printing out the contents of the bestJoinOrder array in OptimizerImpl in two places: 1) at the end of rememberBestCost(), and 2) at the end of modifyAccessPaths(). The former shows you every point at which the OptimizerImpl thinks it has found a new "best join order" so far; the latter shows you the OptimizerImpl's _final_ choice of "best join order". If you print out the cost of the join orders (bestCost) as well, and then compare the results from a "fail" run and a "pass" run, that might give you an indication of where to start. But as I said, there's no silver bullet for tracking that kind of thing down...
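As a rough illustration of the trace output I have in mind, here's a standalone helper (the class and method names are invented; in the real code you'd just drop a println into rememberBestCost() and modifyAccessPaths(), where bestJoinOrder is an int array of Optimizable numbers):

```java
import java.util.Arrays;

public class JoinOrderTrace {
    // Format a join order and its estimated cost the way a temporary
    // println in the optimizer might, tagged with where it was printed.
    static String trace(String where, int[] joinOrder, double cost) {
        return where + ": bestJoinOrder=" + Arrays.toString(joinOrder)
            + " bestCost=" + cost;
    }

    public static void main(String[] args) {
        // e.g. a three-table query where the optimizer settled on the
        // order "third table, first table, second table"
        System.out.println(trace("rememberBestCost", new int[] {2, 0, 1}, 41.5));
    }
}
```

Seeing a different final join order (or a wildly different bestCost) between the "pass" and "fail" runs tells you the divergence happened during optimization, not during execution.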

From a very quick glance at the two plans I noticed two things: 1) the cost estimates for the "fail" plan are pretty outrageous--perhaps DERBY-1905 at play here?--which means the values themselves are not really useful for comparison; and 2) the "pass" plan uses a Hash Join while the "fail" plan uses a Nested Loop join. With respect to #2, the optimizer will deliberately skip a hash join if it does not think the JVM has enough memory to handle all of the rows. So is it possible that in the runs where the query fails the JVM has less available memory?

On a related note, I think the comments before the query in question say:

        // In this query the optimizer will consider pushing, but
        // will find that it's cheaper to do a hash join and thus
        // will _not_ push. So we see hash join with table scan on T3.

The "fail" query plan shows that the predicate _is_ actually being pushed to T1 and T3 (it ends up as a qualifier on T1 and as an index key for T3). Given the ridiculously high cost estimates of doing so, it seems like the optimizer would only have chosen to push the predicates as a last resort--i.e. if the hash join was not possible due to memory constraints. At least that's what I hope based on the plans you provided. It's also possible that for whatever reason the optimizer's estimates for the "fail" run are somehow skewed so that the nested loop join is actually viewed as "cheaper". If that happens then you're probably looking at DERBY-1905...

To see if memory is an issue for the hash join, maybe search for "memoryUsageOK" in OptimizerImpl and print out the results of those calls...?
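The shape of that decision is roughly the following (this is a standalone sketch, not the actual OptimizerImpl code; the limit and the size estimate here are invented purely for illustration): the optimizer estimates how much memory the hash table would need and refuses the hash join if that exceeds the per-table budget.

```java
public class HashJoinMemoryCheck {
    // Invented sketch of a memoryUsageOK-style check: the hash join is
    // allowed only if the estimated build-side size fits in the budget.
    static boolean memoryUsageOK(double estimatedRowCount,
                                 int estimatedBytesPerRow,
                                 long maxMemoryPerTable) {
        return estimatedRowCount * estimatedBytesPerRow <= maxMemoryPerTable;
    }

    public static void main(String[] args) {
        long limit = 1024L * 1024L; // pretend the per-table budget is 1MB
        System.out.println("small table OK? "
            + memoryUsageOK(1000, 100, limit));
        System.out.println("huge table OK?  "
            + memoryUsageOK(1e9, 100, limit));
    }
}
```

So if the "fail" runs happen with a smaller heap, or with inflated row-count estimates (DERBY-1905 again), the same query could flip from hash join to nested loop join.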

Army
