Re: A few questions about index usage

Roy Lyseng Fri, 27 Jan 2006 11:57:05 -0800


Satheesh Bandaram wrote:


Harri Pesonen wrote:

Thanks for you comments and for creating a JIRA item. I think that the
first case is a separate thing, because the following is slow as well:

SELECT MIN(Id), MIN(Id) + 1 FROM Customer

This case could be optimized to do only one MIN-traversal. I think
that this is a more fundamental thing, perhaps requiring more thought,
and some kind of global optimizations.



This goes into another optimization that Derby could use... Common
expression evaluation, which would evaluate common parts of expressions
only once. Yet another enhancement request? :-)

For your original problem with min() and max() in one query, you may be
able to use the following query as a work-around.

Select min(myid), max(myid) from (select min(id) from Customer union all
select max(id) from Customer) myTab(myid);


...or as a cross product:

SELECT MIN(A.Id), MAX(B.Id) FROM Customer A, Customer B

This might be fairly efficient, given that each branch of the crossproduct only returns a single row. I guess this could be a possibleinternal translation as well.

Roy


Something like this could also be used for this new query... making
subquery evaluate the min() once and outer query could select both min()
and min()+1 output. Instead of Derby recognizing this common expression
evaluation, you are doing it by hand. Yes, I agree, Derby should be able
to do this... reasonably easily.

Satheesh

Sorry for the format of this message, I am using the digest.

--
Harri

Jeffrey Lichtman wrote:

1) "SELECT MIN(Id) FROM Customer" and "SELECT MAX(Id) FROM Customer"
are both fast, but "SELECT MIN(Id), MAX(Id) FROM Customer" is slow,
taking 5 seconds. Why?




Derby knows how to use an index to quickly find a minimum or a
maximum (by traversing down one side of the B-tree or the other). It
doesn't know how to do both in the same query, which would take two
traversals.



Is the work to fix this the same as making IN list use multiple probes,
and/or makeing OR lists do multiple probes?  There are existing JIRA
items for those, or is it different enough to have a separate JIRA?

2) "SELECT * FROM Customer ORDER BY Id" is fast but "SELECT * FROM
Customer ORDER BY Id DESC" is slow. Why? Can't it scroll the index
backwards?

The structure could support this, the work just has not been done.  With
row level locking, concurrent tree splitting, and current assumptions
throughout the access method that scans go top/down, left to right the
work to do a backward scan is harder than doing a forward scan.  Also
once the store portion is done, there would be necessary changes to:
optimizer, execution engine, and scan interface to allow the backward
scan.  This would be a hard first project.

The current workaround is that you can create both an ascending and
a descending index if you need ordering in both directions to go fast.

You should check out the access/btree white papers Dibyendo (sp?)
produced, on the web site.

I will go ahead and log an enhancement in JIRA (DERBY-884)

Re: A few questions about index usage

Reply via email to