Thomas Nielsen wrote:

As part of my work on DERBY-2998, the implementation of the ROW_NUMBER() window function, I'm having trouble with the optimizer and its (lack of) pushing predicates down the query tree.

[ snip ]

To add to the info I see the same kind of queryplan in the similar query where the aliased expression is exchanged with 'a*b':

SELECT * FROM (SELECT a*b AS R, T.* FROM T) AS TR WHERE R < 3;

So I'm starting to wonder if this is a general Derby problem, where restrictions on aliased expression columns are not properly optimized?

I think this is the key, yes. When I ran the above query and did some tracing, Derby correctly "categorized" the predicate as "pushable" (see Predicate.categorize(), and especially the "categorize()" method of ColumnReference) and then tried to push the predicate down into the inner SELECT query. In ProjectRestrictNode.pushExpressions() we see:


 /* If this is a PRN above a SelectNode, probably due to a
  * view or derived table which couldn't be flattened, then see
  * if we can push any of the predicates which just got pushed
  * down to our level into the SelectNode.
  */
  if (pushPList != null && (childResult instanceof SelectNode))
  {
    pushPList.pushExpressionsIntoSelect(
      (SelectNode) childResult, false);
  }

which brings us to PredicateList.pushExpressionIntoSelect(...), where we find the following:

  ColumnReference ref = (ColumnReference)e.nextElement();
  if (!ref.pointsToColumnReference())
  {
    state = false;
    break;
  }

That is to say, if the ColumnReference's source does not in turn point to another column reference, then "state" will become false and the predicate will *not* be pushed. For the example of "a*b as R", the column reference "R" points to a BinaryArithmeticOperatorNode (for the expression "a*b"), which is not a ColumnReference. Thus the predicate is *not* pushed down into the SELECT.

I'm guessing the same thing happens for your ROW_NUMBER() query. Since ROW_NUMBER() is not a ColumnReference, the predicate is not pushed down into the SELECT, which means it will be enforced by the ProjectRestrictNode which sits above the SelectNode. Hence the following at the top of your query plan:

******* Project-Restrict ResultSet (4):
Number of opens = 1
Rows seen = 5
Rows filtered = 3
*restriction = true*  <== Predicate applied at the level of the PRN

So I think that explains the behavior you're seeing. As to whether or not this "incorrect", I'm not sure. Seems more likely that this is a case where support for arbitrary expressions under the ColumnReference was perhaps too complex (ex. subqueries, method calls, etc.) and so no support for that was added.

You could perhaps change "ref.pointsToColumnReference()" in the above code to account for OLAP-related functions like ROW_NUMBER(), but I do not know off-hand if that would/could lead to problems down the road. Might be worth a shot, though, to see what happens...

Does that help?

Army

Reply via email to