[jira] Updated: (DERBY-3926) Incorrect ORDER BY caused by index

Mike Matrigali (JIRA) Thu, 23 Apr 2009 15:14:55 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mike Matrigali updated DERBY-3926:
----------------------------------


I took a close look at the original query plans for queries 1 through 3, and 
believe all of the original 
query plans with no sort at the top are valid plans.  So think the current fix 
may not be valid as it is causing sort where they do not need to be.  I am new 
to this, so if my logic is wrong please point it
out.

For query 1:
1)
ij> -- one row from joining table
get cursor c as
        'select * from TENKTUP1, TENKTUP2
         where TENKTUP1.unique2 = TENKTUP2.unique2
         and TENKTUP2.unique1 = 0
         order by TENKTUP1.unique1';

Because TENKTUP2.unique1 is a unique column and the query requires 
TENKTUP2.unique1 = 0,
only 1 or 0 rows qualifies from TENKTUP2.   Because TENKTUP2.unique2 and 
TENKTUP1.unique2 are 
also both unique then TENKTUP1.unique2 = TENKTUP2.unique2 means only 1 or 0 
rows qualify
qualify from TENKTUP1.  This means the query can only return 1 or 0 rows.  
Because this is the
case there is no need to sort a 1 or 0 row result set.

My question is, is it expected that the optimizer should recognize that a one 
row result set requires 
no ordering?  

It looks like queries 3 through 7 are all variations on one row result sets.

Query 2 is different:
get cursor c as
        'select * from TENKTUP1, TENKTUP2
         where TENKTUP1.unique1 = TENKTUP2.unique1
         order by TENKTUP1.unique1, TENKTUP2.unique1';

In this case the interesting thing is that TENKTUP1.unique1 must be the same as 
TENKTUP2.unique1.  So this means that any plan that is sorted on 
TENKTUP1.unique1 OR
TENKTUP2.unique1 fulfills the order by requirement to be sorted by 
TENKTUP1.unique1, TENKTUP2.unique1.

So in the original queryplan it turns out the outermost index scan is:
TENKTUP1 using index TK1UNIQUE1  which because of the predicate is enough to 
avoid sorting.

Again the question is whether the code is meant to catch this case or did it 
get lucky?

Originally I have to admit I didn't understand how any multiple table order by 
could be a sort 
avoidance plan since we have no indexes that cover multiple tables.  I believe 
the above show
some of the cases.

Another case I have seen is an order by of A.key, B.key being satisfied by join 
on A.key, B.key when it is known A.key is a single value.  Or satisfied also by 
join on B.key, A.key where it is also known that
A is a single value like A.key = ?. 





> Incorrect ORDER BY caused by index
> ----------------------------------
>
>                 Key: DERBY-3926
>                 URL: https://issues.apache.org/jira/browse/DERBY-3926
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.1.3.3, 10.2.3.0, 10.3.3.1, 10.4.2.0
>            Reporter: Tars Joris
>            Assignee: Mamta A. Satoor
>         Attachments: derby-reproduce.zip
>
>
> I think I found a bug in Derby that is triggered by an index on a large 
> column: VARCHAR(1024). I know it  is generally not a good idea to have an 
> index on such a large column.
> I have a table (table2) with a column "value", my query orders on this column 
> but the result is not sorted. It is sorted if I remove the index on that 
> column.
> The output of the attached script is as follows (results should be ordered on 
> the middle column):
> ID                  |VALUE        |VALUE
> ----------------------------------------------
> 2147483653          |000002       |21857
> 2147483654          |000003       |21857
> 4294967297          |000001       |21857
> While I would expect:
> ID                  |VALUE        |VALUE
> ----------------------------------------------
> 4294967297          |000001       |21857
> 2147483653          |000002       |21857
> 2147483654          |000003       |21857
> This is the definition:
> CREATE TABLE table1 (id BIGINT NOT NULL, PRIMARY KEY(id));
> CREATE INDEX key1 ON table1(id);
> CREATE TABLE table2 (id BIGINT NOT NULL, name VARCHAR(40) NOT NULL, value 
> VARCHAR(1024), PRIMARY KEY(id, name));
> CREATE UNIQUE INDEX key2 ON table2(id, name);
> CREATE INDEX key3 ON table2(value);
> This is the query:
> SELECT table1.id, m0.value, m1.value
> FROM table1, table2 m0, table2 m1
> WHERE table1.id=m0.id
> AND m0.name='PageSequenceId'
> AND table1.id=m1.id
> AND m1.name='PostComponentId'
> AND m1.value='21857'
> ORDER BY m0.value;
> The bug can be reproduced by just executing the attached script with the 
> ij-tool.
> Note that the result of the query becomes correct when enough data is 
> changed. This prevented me from creating a smaller example.
> See the attached file "derby-reproduce.zip" for sysinfo, derby.log and 
> script.sql.
> Michael Segel pointed out:
> "It looks like its hitting the index ordering on id,name from table 2 and is 
> ignoring the order by clause."

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (DERBY-3926) Incorrect ORDER BY caused by index

Reply via email to