[
https://issues.apache.org/jira/browse/DERBY-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802256#action_12802256
]
Dag H. Wanvik commented on DERBY-4397:
--------------------------------------
I have done some investigation regarding my question above "Is the
flattening a prerequisite for the optimizer to handle such cases?".
If anyone is familiar with how the optimizer works, I would appreciate
any comments, since this is pretty much unknown code to me.
It seems that information about ordering of result sets from
subqueries is lost/not used by the outer query if the subquery can not
be flattened. The following example query also gets an unnecessary
sort:
(A) select * from (select distinct i, (select count(*) from sys.systables) as j
from t) t order by i
Here the nested "select count(*)" in the select list stops flattening
from happening, cf. SelectNode#flattenableInFromSubquery ca line 1379.
So the tree structure is somewhat similar to what we had in the query
(B) select * from (select i from t offset 0 rows) t order by i
when optimization starts, so I thought it would be useful to look at
its optimization (query A):
The from subquery is optimized separately as far as I can tell (I will
ignore the SELECT subquery, I only introduced it to prevent
flattening).
The DISTINCT is optimized away since there is an index on i (in
optimization of the subquery). The ORDER BY in the outer query leads
to a sort, though. From looking at the code this is to be expected, I
think:
The check done in SelectNode calls orderByList.getSortNeeded() which
will return true unless orderByList.sortNotNeeded has been
called. This call is only made by the optimizer if it has found a
SORT_AVOIDANCE_PLAN. The criterion for finding a sort avoidance plan
hinges on the call to requiredRowOrdering.sortRequired
(i.e. OrderByList.sortRequired) in
OptimizerImpl.getNextDecoratedPermutation ca line 1799. If
sortRequired does not return RequiredRowOrdering.NOTHING_REQUIRED, a
sort will be necessary. That check fails to detect that nothing is
required since the referenced columns table number is not in the table
map provided. I think this is since the column stems from the
subquery.
Note that if the an "order by i" is added to the inner query, this
will be handled correctly and optimized away, but the outer ORDER BY
still leads so sorting.
In sum, it seems to me that the information about any row ordering in
the unflattened subquery is not transmitted up to the optimization of
the outer query. If this conclusion is correct, we would need a new
mechanism to convey the missing information to the upper optimizer in
the presence of unflattened "from subqueries".
> Allow ORDER BY in subqueries
> ----------------------------
>
> Key: DERBY-4397
> URL: https://issues.apache.org/jira/browse/DERBY-4397
> Project: Derby
> Issue Type: Improvement
> Components: SQL
> Reporter: Dag H. Wanvik
> Assignee: Dag H. Wanvik
> Attachments: derby-4397-1.diff, derby-4397-1.stat, derby-4397-2.diff,
> derby-4397-2.stat, derby-4397-all-subqueries.diff,
> derby-4397-all-subqueries.stat, derby-4397-insert-from-exists.diff,
> derby-4397-insert-from-exists.stat, derby-4397-sortavoidance-a.diff,
> derby-4397-sortavoidance-a.stat, orderBySpec.html, orderBySpec.html,
> orderBySpec.html, orderBySpec.html, orderBySpec.html
>
>
> SQL 2008 allows ORDER BY to be specified in subqueries. In conjunction with
> OFFSET/FETCH and/or ROW_NUMBER
> meaningful subqueries with row ordering may be formulated. Cf. MySQL's LIMIT
> may be used in subqueries as well.
> Note that OFFSET/FETCH is currently not allowed in subqueries, either.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.