Re: derby performance and 'order by'

Sunitha Kambhampati Fri, 16 Sep 2005 17:55:54 -0700

Scott Ogden wrote:

I have observed some interesting query performance behavior and am hoping someone here can explain.
In my scenario, it appears that an existing index is not being used for the ‘order by’ part of the operation and as a result the performance of certain queries is suffering.
Can someone explain if this is supposed to be what is happening and why? Please see below for the specific queries and their performance characteristics.
Here are the particulars:

---------------------------------

create table orders(
    order_id varchar(50) NOT NULL
        CONSTRAINT ORDERS_PK PRIMARY KEY,
    amount numeric(31,2),
    time date,
    inv_num varchar(50),
    line_num varchar(50),
    phone varchar(50),
    prod_num varchar(50));

--Load a large amount of data (720,000 records) into the ‘orders’ table
--Create an index on the time column as that will be used in the ‘where’ clause.
create index IX_ORDERS_TIME on orders(time);
--When I run a query against this table returning top 1,000 records, this query returns very quickly, consistently less than .010 seconds.
select * from orders
where time > '10/01/2002' and time < '11/30/2002'
order by time;
--Now run a similar query against the same table, returning the top 1,000 records.
--The difference is that the results are now sorted by the primary key (‘order_id’) rather than ‘time’.
--This query returns slowly, approximately 15 seconds. Why??

select * from orders
where time > '10/01/2002' and time < '11/30/2002'
order by order_id;
--Now run a third query against the same ‘orders’ table, removing the where clause
--This query returns quickly, around .010 seconds.

select * from orders
order by order_id;

---------------------------------------------

If you run with derby.language.logQueryPlan=true, the actual query plans used for the following queries will be written to derby.log. This will show which indexes were used by the optimizer. Also see http://db.apache.org/derby/docs/10.1/tuning/rtunproper43414.html .
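
As a rough sketch of another way to look at the plan from ij, Derby's runtime statistics procedures can be turned on for the current connection. The query is the second one from the post above; the exact layout of the statistics output varies by Derby version, so treat this as a starting point rather than a recipe.

```sql
-- Start collecting runtime statistics for this connection
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);

-- Run the statement to be examined (the slow query from the post)
SELECT * FROM orders
WHERE time > '10/01/2002' AND time < '11/30/2002'
ORDER BY order_id;

-- Retrieve the statistics for the statement just executed;
-- the text names the index or table scan used and any sort step
VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();

-- Stop collecting statistics
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(0);
```

The statistics describe only the most recently completed statement on the connection, so call SYSCS_GET_RUNTIMESTATISTICS right after the query of interest.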

A query with an 'order by' clause generally requires sorting. Usually, sorting requires an extra step to put the data into the right order. This extra step can be avoided for data that are already in the right order. For example, if a single-table query has an ORDER BY on a single column, and there is an index on that column, sorting can be avoided if Derby uses the index as the access path.

I think in the case of your first and third queries the optimizer will pick the available index (the index on time for the first, the primary key index on order_id for the third), thus probably avoiding the sort step.

Your second query involves more work than the first query, since it has a search condition on time and an order by on order_id. Thus if the optimizer picks the index on time, the matching rows come back in time order and a separate sort step on order_id is needed.
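
One way to check this, sketched under the assumption that you are on a Derby release with optimizer overrides (10.2 or later; they are not available in 10.1), is to force each access path in turn and compare the timings and plans. The --DERBY-PROPERTIES comment has to end its line, so the rest of the statement continues on the next line.

```sql
-- Force the index on time: rows are located through IX_ORDERS_TIME,
-- then have to be sorted by order_id
SELECT * FROM orders --DERBY-PROPERTIES index=IX_ORDERS_TIME
WHERE time > '10/01/2002' AND time < '11/30/2002'
ORDER BY order_id;

-- Force the primary key's backing index: rows are read in order_id order,
-- so the sort may be skipped, but every row is checked against the time condition
SELECT * FROM orders --DERBY-PROPERTIES constraint=ORDERS_PK
WHERE time > '10/01/2002' AND time < '11/30/2002'
ORDER BY order_id;
```

Which of the two is faster will depend on how many rows fall in the date range and how many rows you actually fetch, so this is meant for comparison only, not as a recommended fix.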

____________

Thanks,
Sunitha.
