[ 
https://issues.apache.org/jira/browse/DERBY-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Matrigali updated DERBY-3367:
----------------------------------


Are you looking for Derby to optimize time to first row returned?  Does your 
app really only want the first row? Doing this
kind of costing would be interesting but not sure how we would decide which is 
better.  Maybe the row_number project
will get into costing associated with only needing to produce a subset of the 
rows of the query.

Derby definitely currently tries to cost out getting all the rows, not just the 
first one.  For these kinds of reports it would 
be interesting to report total time of the 2 different plans while getting all 
the rows.  I would also not be surprised if
current costing should be updated to reflect current software costs of the 
plans, we have not updated the base costs
for many years now.

It is surprising that the one without the qualifier goes through the index.  I 
would have thought that would be the worst case
cost for the index probes.  I think Derby will only consider a full index scan 
for this query with != -1 added as a qualifier - though
logically it could be done as 2 index scans (ie. < -1 and then > -1).  Even so 
the cost of going through all the rows and the cost of
going through all the rows and applying != -1 to the index key seems like it 
should be about the same.  
I would have guessed that both might not use the index if
costing for the sort came out smaller than 

> Sort is not avoided even when the has an index on a the column being ordered, 
>  for a query with id != -1 predicate.
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3367
>                 URL: https://issues.apache.org/jira/browse/DERBY-3367
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.3.2.1
>            Reporter: Suresh Thalamati
>         Attachments: derby.log
>
>
> Sort is not avoided even when the has an index on a the column being ordered, 
> Repro:
> go.ddl:
> -------
> connect 'jdbc:derby:testdb;create=true';
> create table t1 (i int, j int, vc varchar(30));
> insert into t1 values (1, -1, 'minus one');
> insert into t1 values (2, 2, 'two'), (3, 3, 'trois'), (3, -3, 'minus three'), 
> (4, 4, 'four');
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> insert into t1 select * from t1 where j > 0;
> create index ix on t1 (j);
> disconnect all;
> exit;
> go.sql:
> -------
> connect 'jdbc:derby:testdb';
> get cursor c1 as 'select j, vc from t1 order by j asc';
> next c1;
> close c1;
> get cursor c1 as 'select j, vc from t1 where j != -1 order by j asc';
> next c1;
> close c1;
> --
> After running "go.sql", if you look at the derby.log file you'll see that the 
> query with no predicate does an index scan and only has to read 1 row from 
> disk 
> before the cursor is closed.  But the query _with_ a predicate does a table
> scan an has to read 3074 rows from disk, and sort them, just to return 
> the first one in the result set. 
> In the repro, it looks fast. But If the data is large, 
> which was the case in my  application.  
> The table was: 
> create table t2 (i int, j int, vc varchar(15000)); 
> and loaded with 13000 rows. It takes almost minute to get the first row ,
> for the query "select j, vc from t1 where j != -1 order by j asc'"

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to