[ http://issues.apache.org/jira/browse/DERBY-1365?page=all ]
     
A B closed DERBY-1365:
----------------------


> Address potential problems with optimizer logic in some rarely-exercised code.
> ------------------------------------------------------------------------------
>
>          Key: DERBY-1365
>          URL: http://issues.apache.org/jira/browse/DERBY-1365
>      Project: Derby
>         Type: Bug
>   Components: Performance
>     Versions: 10.1.2.4, 10.2.0.0, 10.1.3.0
>     Reporter: A B
>     Assignee: A B
>     Priority: Minor
>      Fix For: 10.2.0.0, 10.1.3.0, 10.1.2.5
>  Attachments: d1365_v1.patch, d1365_v1.stat
>
> While looking at the optimization code in Derby for some other work I'm 
> doing, I've noticed a couple of small pieces of code that could potentially 
> lead to failures for some queries.  Thus far I have been unable to come up 
> with any examples that demonstrate these problems against the current 
> codeline (hence the term "potential" problems) because the relevant lines 
> of code are hard to exercise--they require large, complex databases and/or 
> tricky timing scenarios that I have not been able to reproduce in the 
> context of the test harness.  In fact, the code coverage results show that 
> neither piece of code is exercised by the current derbyall suite--which I 
> believe says more about how hard the code in question is to reach than 
> about incomplete or lacking tests.
> All of that said, analysis of the relevant code does indeed seem to indicate 
> that some (minor) changes are in order.  In particular, consider the 
> following two potential problems.
> 1) Potential logic error when determining the best join order for a subquery 
> that has more than one FROM table.
> This issue is very timing-sensitive: it will only occur if there is an 
> outer query with a subquery and the optimization phase of the subquery 
> "times out" in the middle of costing a join order.  The relevant point in 
> the code is in OptimizerImpl.java, around line 420:
>     if (permuteState != JUMPING)
>     {
>         // By setting firstLookOrder to our target join order
>         // and then setting our permuteState to JUMPING, we'll
>         // jump to the target join order and get the cost.  That
>         // cost will then be saved as bestCost, allowing us to
>         // proceed with normal timeout logic.
>         for (int i = 0; i < numOptimizables; i++)
>             firstLookOrder[i] = bestJoinOrder[i];
>         permuteState = JUMPING;
>         // If we were in the middle of a join order when this
>         // happened, then reset the join order before jumping.
>         if (joinPosition > 0)
>             rewindJoinOrder();
>     }
> The problem occurs at the last part of this "if" block, with the call to 
> rewind the join order.  The rewindJoinOrder() method will "pull" each 
> optimizable that has an assigned position in the current join order and, for 
> each one, decrement joinPosition--until all optimizables have been pulled and 
> joinPosition has a value of "0".  So far so good.
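> To make that concrete, here is a rough sketch of the rewind loop just 
> described (a paraphrase for illustration, not the actual Derby source; the 
> pullOptPredicates() call and the proposedJoinOrder array are assumptions 
> about the surrounding code):
>     // Sketch: pull every optimizable that holds a position in the
>     // current join order, decrementing joinPosition each time,
>     // until joinPosition reaches 0.
>     private void rewindJoinOrder() throws StandardException
>     {
>         for ( ; ; joinPosition--)
>         {
>             Optimizable pullMe =
>                 optimizableList.getOptimizable(
>                     proposedJoinOrder[joinPosition]);
>             pullMe.pullOptPredicates(predicateList);
>             proposedJoinOrder[joinPosition] = -1;
>             if (joinPosition == 0)
>                 break;
>         }
>     }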
> The trick is that, unlike the other calls to rewindJoinOrder() in 
> OptimizerImpl, this particular call occurs *before* the joinPosition 
> variable is incremented.  (joinPosition is incremented every time a call 
> to find the "next permutation" is made, until all optimizables--i.e. FROM 
> tables--have been assigned a position in the join order.)  What this means 
> is that, with the code as it currently stands, the call to 
> rewindJoinOrder() will put joinPosition at 0, but shortly thereafter 
> joinPosition will be incremented to 1.  The subsequent search for a "next 
> permutation" will then try to place an optimizable at position 1 in the 
> join order--but since there is no optimizable at position 0 (we 
> inadvertently skipped over it because of the increment after rewinding), 
> the logic for finding/setting the next optimizable in the join order 
> breaks down.  So I think this needs to be addressed--and it should be as 
> simple as setting joinPosition to -1 after the aforementioned call to 
> rewindJoinOrder(), as sketched below.  This -1 value is what joinPosition 
> is set to before each round of optimization, so there is precedent and, I 
> believe, it is the right thing to do.  Once that's done, all of the 
> existing logic will work as normal.
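> With that change, the "if" block quoted above would look roughly like this 
> (a sketch of the proposed fix, not a committed patch):
>     if (permuteState != JUMPING)
>     {
>         // By setting firstLookOrder to our target join order
>         // and then setting our permuteState to JUMPING, we'll
>         // jump to the target join order and get the cost.
>         for (int i = 0; i < numOptimizables; i++)
>             firstLookOrder[i] = bestJoinOrder[i];
>         permuteState = JUMPING;
>         // If we were in the middle of a join order when this
>         // happened, then reset the join order before jumping.
>         if (joinPosition > 0)
>         {
>             rewindJoinOrder();
>             // rewindJoinOrder() leaves joinPosition at 0, but the
>             // "next permutation" logic increments joinPosition
>             // *before* placing an optimizable.  Reset it to -1 (the
>             // value used before each round of optimization) so that
>             // the increment fills position 0 first.
>             joinPosition = -1;
>         }
>     }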
> 2) Potential NullPointerException if optimizer comes up with unreasonably 
> high cost estimates (esp. Double.POSITIVE_INFINITY) and then, at execution 
> time, Derby tries to do a hash join based on such an estimate with a 
> ResultSet that has no rows.
> In this case, the code in question is in the constructor logic of 
> BackingStoreHashtable.java.  When the backing hash table receives an 
> unreasonably high cost estimate (a "red flag" estimate, as noted in the 
> comments) and the row_source field of the object is null or has no rows 
> (which indicates an empty result set), the hash_table field can end up 
> being null when we reach the end of the constructor.  But if we create a 
> BackingStoreHashtable with a null hash_table, then any calls to 
> BackingStoreHashtable methods that expect a valid (non-null) hash_table 
> could fail with an NPE.  Thus there should be some additional logic to 
> ensure that, upon completion of the constructor, hash_table has been 
> initialized to an appropriate non-null value--namely, an empty hash table 
> if there are no rows to process (see the sketch below).
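> A minimal sketch of that guard at the end of the BackingStoreHashtable 
> constructor (assuming hash_table is a java.util.Hashtable; the initial 
> capacity to use is a detail left out here):
>     // If row_source was null or produced no rows, hash_table may
>     // never have been created above.  Ensure callers always see a
>     // valid (possibly empty) hash table rather than null.
>     if (hash_table == null)
>         hash_table = new Hashtable();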

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
