Can someone please review this for me? thanks, Mamta
On 5/5/05, Mamta Satoor <[EMAIL PROTECTED]> wrote: > Hi, > > I have a patch for this optimizer bug. Basically, the issue turned out > to be the logic for DISTINCT elimination. During the optimization > phase, if a query has DISTINCT clause, then impl.sql.compile.FromList > class's returnsAtMostSingleRow() method gets called. This method > returns true if the method concludes that DISTINCT in the query is > redundant (based on a complex logic that decides that the query is > going to return distinct rows on its own without the DISTINCT clause. > The details of the current logic for DISTINCT elimination can be found > in the comments at the method level.) > > For the query in question in this bug, the method returned true for > DISTINCT elimination which is wrong. The explanation is as follows. > > First of all, I was able to simplify the query reported in the bug to > following query. > select distinct q1."NO1" from IDEPT q1, IDEPT q2 > where ( q2."DISCRIM_DEPT" = 'HardwareDept') > and ( q1."DISCRIM_DEPT" = 'SoftwareDept') and ( q1."NO1" <> ALL > (select q3."NO1" from IDEPT q3 where (q3."REPORTTO_NO" = q2."NO1"))) > > This query gets converted to following during optimization > select distinct q1."NO1" from IDEPT q1, IDEPT q2 > where ( q2."DISCRIM_DEPT" = 'HardwareDept') > and ( q1."DISCRIM_DEPT" = 'SoftwareDept') and not exists ( > (select q3."NO1" from IDEPT q3 where > ( q3."REPORTTO_NO" = q2."NO1" and q3."NO1" = q1."NO1") ) ) ; > > This optimized query has 4 predicates associated with it > q3.reportto_no = q2.no1 > q2.discrim_dept = 'HardwareDept' > q1.descrim_dept = 'SoftwareDept' > q1.no1 = q3.no1 > > Next, on this optimized query(since it has DISTINCT clause in it), the > returnsAtMostSingleRow() method gets called. The method incorrectly > returns true indicating that DISTINCT can be eliminated. The reason > for this is that method is looking at predicates that belong to the > inside query with the exists clause (which is on table IDEPT q3) to > determine DISTINCT elimination for the outer level. > > The fix is that the predicates from the exists query, (in this > particular case, q3."NO1" = q1."NO1" and q3.reportto_no = q2.no1) > should not be considered when deciding elimination of DISTINCT in the > outer query. That is what the attached patch does. > > Hope this helps understand the problem and the proposed fix for it. > The files impacted by the change are as follows > svn stat > M java\engine\org\apache\derby\impl\sql\compile\FromList.java > M > java\testing\org\apache\derbyTesting\functionTests\tests\lang\distinctElimination.sql > M > java\testing\org\apache\derbyTesting\functionTests\master\distinctElimination.out > > Please send in comments you may have. I have run the existing tests > and the patch didn't cause any failures. > > thanks, > Mamta > > >
