One could argue that in the first case (2>1), the optimizer is free to choose whatever index it wants because the query can provably return no rows. However, in the second case (1=1), were the DateTimestamp condition must be evaluated, it would appear that the optimizer indeed selected the wrong index. However, I don't think you can point to 2ms vs. 6000ms as proof of a bad query plan. In the first case, the optimizer can prove (to itself) that the query can return no rows (because of the 2>1 condition along with all AND predicates) -- and therefore it probably doesn't do any index scan at all. Therefore, 2ms.
In the second case, because the 1=1 condition is satisfied, Derby must *actually* do the index scan as per plan. Therefore, 6000ms. Whether using the DESC index actually provides significant improvement over ASC remains to be seen, as the comparison currently isn't apples-to-apples. I would be interested in seeing the result of the DESC vs. the ASC index in the (1=1) scenario. You can override the optimizer with a hint, like so: SELECT * FROM table --DERBY-PROPERTIES index=TBLEVENTS_DATETIMESTAMP_DESC WHERE 1=1 AND 1=1 AND DateTimestamp >= ? AND DateTimestamp <= ? AND ORDER BY DateTimestamp DESC Note, the --DERBY-PROPERTIES must come at the end of a literal line (i.e. there must be a linefeed after that cause) otherwise everything after that will be taken as a comment. Let us know the result. Either way it may be considered a bug in the optimizer, but depending on the number of rows returned, may not make a significant difference in performance. -Brett On Wed, Sep 23, 2009 at 11:18 PM, Gavin Matthews <[email protected]> wrote: > > Hi, > > We have a WHERE clause: > > WHERE > 2>1 AND > 2>1 AND > DateTimestamp >= ? AND > DateTimestamp <= ? AND > ORDER BY DateTimestamp DESC > > which reasonably takes 2ms to execute on a table that has 357254 rows, and > does so by selecting the most appropriate index: > > Index Scan ResultSet for TBLEVENTS using index TBLEVENTS_DATETIMESTAMP_DESC > at serializable isolation level using share row locking chosen by the > optimizer > > However, when we change the WHERE clause to: > > WHERE > 1=1 AND > 1=1 AND > DateTimestamp >= ? AND > DateTimestamp <= ? AND > ORDER BY DateTimestamp DESC > > the execution time increases to 6000ms, and appears to do so because the > query planner hasn't selected the most appropriate index: > > Index Scan ResultSet for TBLEVENTS using index TBLEVENTS_DATETIMESTAMP_ASC > at serializable isolation level using share row locking chosen by the > optimizer > > The indexes are specified as follows: > > CREATE INDEX tblEventsDateTimestampASC ON tblEvents (DateTimestamp ASC); > CREATE INDEX tblEventsDateTimestampDESC ON tblEvents (DateTimestamp DESC); > > Do people agree that this appears to be a bug with the query planner, as > the > selection of the most appropriate index should not be influenced by static > conditions such as 1=1 or 2>1? > > Regards, > > Gavin > > -- > View this message in context: > http://www.nabble.com/Badly-performing-WHERE-caluse-tp25531166p25531166.html > Sent from the Apache Derby Users mailing list archive at Nabble.com. > >
