On Wed, Aug 5, 2009 at 00:16, Bryan Pendleton<[email protected]> wrote: > > What happens if you do two separate queries: > > SELECT field, text, docfreq FROM Terms > WHERE field > ? > ORDER BY field ASC, text ASC > > SELECT field, text, docfreq FROM Terms > WHERE (field = ? AND text > ?) > ORDER BY field ASC, text ASC > > Sometimes the "OR" will really throw off the optimizer. > > Does each query, separately, use the index as you expect?
Indeed, both of these queries (even the one with only >) do an index scan instead of a table scan. I think you're right; the optimiser sees the OR and goes "oops, this condition is too hard", because it doesn't know that the two ranges are adjacent (which would be hard to detect, I'm guessing) and then proceeds to use the table scan instead. I am considering using two separate queries as a workaround. I can look in the same field and then if that returns nothing, look for the next field. As far as caching goes, yes... I want to do that, but I aimed for correctness first. I can't cache *all* entries because there might be an extremely large number, but I can at least cache some 100 or 1,000 and then avoid the next 99 queries. The generated column idea offered by Rick isn't bad either, but it will double storage requirements versus doing the concatenation manually, or versus this trick. Daniel -- Daniel Noll Forensic and eDiscovery Software Senior Developer The world's most advanced Nuix email data analysis http://nuix.com/ and eDiscovery software
