[jira] Closed: (DERBY-251) DISTINCT query is returning duplicate rows

Mamta A. Satoor (JIRA) Thu, 12 May 2005 06:10:12 -0700

     [ http://issues.apache.org/jira/browse/DERBY-251?page=all ]
     
Mamta A. Satoor closed DERBY-251:
---------------------------------


     Resolution: Fixed
    Fix Version: 10.1.0.0

The above query gets transformed into following at optimization time
select  distinct  q1."NO1" from IDEPT q1, IDEPT q2
where  ( q2."DISCRIM_DEPT" = 'HardwareDept')
and  ( q1."DISCRIM_DEPT" = 'SoftwareDept')  and  not exists (
(select  q3."NO1" from IDEPT q3 where
(  q3."REPORTTO_NO" =  q2."NO1"  and q3."NO1" = q1."NO1") ) )  ;

On this query, the distinct elimination logic is called. That logic incorrectly 
used the equality condition from the not exists clause to determine elimination 
of distinct on the outer query. I have changed the logic to ignore equality 
predicates associated with the exists clause while deciding to drop distinct in 
the outer query.

This change got committed as Revision 169735.

> DISTINCT query is returning duplicate rows
> ------------------------------------------
>
>          Key: DERBY-251
>          URL: http://issues.apache.org/jira/browse/DERBY-251
>      Project: Derby
>         Type: Bug
>   Components: SQL
>     Versions: 10.1.0.0
>     Reporter: Mamta A. Satoor
>     Assignee: Mamta A. Satoor
>      Fix For: 10.1.0.0

>
> Following query on a table with primary key returns duplicate rows
> select  distinct  q1."NO1" from IDEPT q1, IDEPT q2 
> where  ( q2."DISCRIM_DEPT" = 'HardwareDept') 
>  and  ( q1."DISCRIM_DEPT" = 'SoftwareDept')  and  ( q1."NO1" 
> <> ALL  ( select  q3."NO1" from IDEPT q3 where  (  q3."REPORTTO_NO" =  
> q2."NO1") ) )  ;
> The sql script to create the table and load data into it is as follows
> CREATE TABLE "APP"."IDEPT" ("DISCRIM_DEPT" VARCHAR(32), "NO1" INTEGER NOT 
> NULL, 
> "NAME" VARCHAR(50), "AUDITOR_NO" INTEGER, "REPORTTO_NO" INTEGER, 
> "HARDWAREASSET"
>  VARCHAR(15), "SOFTWAREASSET" VARCHAR(15));
> -- primary/unique
> ALTER TABLE "APP"."IDEPT" ADD CONSTRAINT "PK_IDEPT" PRIMARY KEY ("NO1");
> insert into idept values ('Dept', 1, 'Department1', null, null, null, null);
> insert into idept values ('HardwareDept', 2, 'Department2', 25, 1, 
> 'hardwareaset2', null);
> insert into idept values ('HardwareDept', 3, 'Department3', 25, 2, 
> 'hardwareaset3', null);
> insert into idept values ('SoftwareDept', 4, 'Department4', 25, 1, null, 
> 'softwareasset4');
> insert into idept values ('SoftwareDept', 5, 'Department5', 30, 4, null, 
> 'softwareasset5');
> The problem appears to be in 
> org.apache.derby.impl.sql.compile.FromList.returnsAtMostSingleRow() method. 
> This method along with other things tries to determine if the DISTINCT can be 
> dropped without causing duplicate rows. For the query in question, this 
> method decides that DISTINCT is not necessary which I think is incorrect.
> If the table above is created with no primary key, the DISTINCT query does 
> not return duplicate rows. That is because one of the several criterias for 
> dropping DISTINCT is that there should be a unique index on the columns in 
> the where clause.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Closed: (DERBY-251) DISTINCT query is returning duplicate rows

Reply via email to