Thanks, Mike. I have entered DERBY-890 to cover common sub expression optimization. This optimization was briefly discussed couple of times in DerbyDev earlier, I think.
Current implementation of min/max optimization works only if there is just one aggregate reference in the select and a suitable index on that column. (Like select min(i) from BIG_TABLE) It is not smart enough to handle even select min(i), min(i) from BIG_TABLE. It may be possible to enhance logic in GroupByNode to at least cover a few more cases, but not sure how common this kind of usage is. We might as well go for more generic common expression evaluation, which would benifit many more queries, including this one. Satheesh Mike Matrigali wrote: > I've gone ahead and filed DERBY-886 to describe what is needed in > the min/max case. Feel free to edit the issue if I have said anything > wrong, or unclear. I'll leave the common sub-expression stuff to > someone else. > > Satheesh Bandaram wrote: > >> >> Harri Pesonen wrote: >> >> >>> Thanks for you comments and for creating a JIRA item. I think that the >>> first case is a separate thing, because the following is slow as well: >>> >>> SELECT MIN(Id), MIN(Id) + 1 FROM Customer >>> >>> This case could be optimized to do only one MIN-traversal. I think >>> that this is a more fundamental thing, perhaps requiring more thought, >>> and some kind of global optimizations. >> >> >> >> This goes into another optimization that Derby could use... Common >> expression evaluation, which would evaluate common parts of expressions >> only once. Yet another enhancement request? :-) >> >> For your original problem with min() and max() in one query, you may be >> able to use the following query as a work-around. >> >> Select min(myid), max(myid) from (select min(id) from Customer union all >> select max(id) from Customer) myTab(myid); >> >> Something like this could also be used for this new query... making >> subquery evaluate the min() once and outer query could select both min() >> and min()+1 output. Instead of Derby recognizing this common expression >> evaluation, you are doing it by hand. Yes, I agree, Derby should be able >> to do this... reasonably easily. >> >> Satheesh >> >> >>> Sorry for the format of this message, I am using the digest. >>> >>> -- >>> Harri >>> >>> Jeffrey Lichtman wrote: >>> >>> >>>>> 1) "SELECT MIN(Id) FROM Customer" and "SELECT MAX(Id) FROM Customer" >>>>> are both fast, but "SELECT MIN(Id), MAX(Id) FROM Customer" is slow, >>>>> taking 5 seconds. Why? >>>> >>>> >>>> >>>> >>>> Derby knows how to use an index to quickly find a minimum or a >>>> maximum (by traversing down one side of the B-tree or the other). It >>>> doesn't know how to do both in the same query, which would take two >>>> traversals. >>> >>> >>> >>> Is the work to fix this the same as making IN list use multiple probes, >>> and/or makeing OR lists do multiple probes? There are existing JIRA >>> items for those, or is it different enough to have a separate JIRA? >>> >>> >>>>> 2) "SELECT * FROM Customer ORDER BY Id" is fast but "SELECT * FROM >>>>> Customer ORDER BY Id DESC" is slow. Why? Can't it scroll the index >>>>> backwards? >>>> >>>> >>> The structure could support this, the work just has not been done. >>> With >>> row level locking, concurrent tree splitting, and current assumptions >>> throughout the access method that scans go top/down, left to right the >>> work to do a backward scan is harder than doing a forward scan. Also >>> once the store portion is done, there would be necessary changes to: >>> optimizer, execution engine, and scan interface to allow the backward >>> scan. This would be a hard first project. >>> >>> The current workaround is that you can create both an ascending and >>> a descending index if you need ordering in both directions to go fast. >>> >>> You should check out the access/btree white papers Dibyendo (sp?) >>> produced, on the web site. >>> >>> I will go ahead and log an enhancement in JIRA (DERBY-884) >>> >>> >>> >> >> >> >> > > >
