Educated guess, but I think an index scan will not be used because the predicate provides no "stop condition". See "What's Optimizable?" here:
http://db.apache.org/derby/docs/10.0/manuals/tuning/perf43.html *Possibly* can you force the use of an index by overriding the optimizer: http://db.apache.org/derby/docs/10.5/tuning/ctunoptimzoverride.html >From the first document: "Sometimes a table scan is the most efficient way to access data, even if a potentially useful index is available. For example, if the statement returns virtually all the data in the table, it is more efficient to go straight to the table instead of looking values up in an index, because then Derby is able to avoid the intermediate step of retrieving the rows from the index lookup values." The trouble is, because your query is open-ended (using > with no <), Derby doesn't know how many rows will be returned ... maybe it's the entire table ... in which case a table-scan is more efficient. You *might* also try using ROW_NUMBER() to limit the query and maybe trick the optimizer (search the FAQ for LIMIT), but somehow I don't think the optimizer is that smart. Any reason not to cache the result of the first query, and manually step through the results rather than constantly re-running a query to get the "next" row? -Brett On Tue, Aug 4, 2009 at 3:30 PM, Daniel Noll <[email protected]> wrote: > Hi all. > > I have a table and index like this: > > CREATE TABLE Terms ( > id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY NOT NULL, > field VARCHAR(255) NOT NULL, > text VARCHAR(255) NOT NULL, > docfreq INTEGER NOT NULL > ) > > CREATE INDEX TermsFieldAndText ON Terms (field, text) > > Here's some example data. Table is shown in sorted order as it would > be listed in the index above: > > | field | text | docfreq | > ---------------------------------------- > | body | hello | 10 | > | tag | home | 100 | > | tag | work | 80 | > > For a given input term (let's say tag:home) I want to find the _next_ > entry in the table (so it should return tag:work). My naive > implementation does this: > > SELECT field, text, docfreq > FROM Terms > WHERE (field = ? AND text > ?) OR field > ? > ORDER BY field ASC, text ASC > > This does the right thing as far as the output is concerned, but Derby > does a table scan. Looking at the query, my impression would be that > in the worse case scenario, Derby should do two index scans. Or, if > it's smart, one index scan. > > I'm aware that I might be able to work around this by having instead a > field with the concatenation of both fields with > CONCAT(field,'\0',text). But I wanted to check first whether there is > a way I can maintain my relatively understandable table structure > before resorting to hacks. Basically I just need the data in index > order, starting from a given point... so I figure there is a simple > solution. > > Daniel > > > -- > Daniel Noll Forensic and eDiscovery Software > Senior Developer The world's most advanced > Nuix email data analysis > http://nuix.com/ and eDiscovery software >
