Rick Hillegas <[EMAIL PROTECTED]> writes: > Hi Knut, > > Thanks for the clarification. Sounds like we agree that the bug is in > the query rewriting rather than in the pushdown to the Store. Methinks > the DISTINCT should not be pushed into the subselect if the inner and > outer SELECT lists aren't identical.
Yes, I suppose the optimizer could check that the ResultColumnList of the two queries are identical. Then we would have the following criteria for pushing the DISTINCT: 1) The query has a single table/subquery in the FROM clause. 2) The query and the subquery have identical column lists. 3) No predicates in the top-level query (this could possibly be relaxed). 4) The duplicate elimination can't be merged with an ORDERED BY. > Here's another possible scenario where you might push a DISTINCT from > an outer into an inner query. It all depends on when the optimizer > decides to eliminate the DISTINCT altogether: > > SELECT DISTINCT maxChildAge, employeeID FROM > ( > SELECT max( childAge ) maxChildAge, employeeID > FROM children, employees > WHERE children.parentID=employeeID > GROUP BY employeeID > ) > > Here the optimizer might reasonably do something like this: > > 1) Push the DISTINCT into the subselect. > > 2) Then notice that the results of the original subselect are already > unique because of the GROUP BY > > 3) So eliminate the DISTINCT altogether. Or we could just fix the query rewriting to flatten the query properly... ;) SELECT max( childAge ) maxChildAge, employeeID FROM children, employees WHERE children.parentID=employeeID GROUP BY employeeID Thanks for your input, Rick! Your comments have been very helpful! I have just filed another bug in the derbylang test. That bug is really a consequence of this bug. See http://issues.apache.org/jira/browse/DERBY-519 for details. -- Knut Anders