Thanks all for the interesting discussion, you've given me a lot to think about.
I've worked with SQLite for 10+ years, but I still consider myself something of newbie compared to the folks here, which is why I asked. One thing is clear -- our lack of routine calls to ANALYZE might be a huge reason I've seen sub-par query times in the past. I should really try to rectify that :) And as far as doing separate queries and combing later, or doing one giant WHERE clause, I think there is enough discussion here that I might try both approaches. It does get a little more complicated (I tried to keep things simple in my question) because the actual queries involve JOINS and as such figuring out how to put the query together gets a bit tricky. I'll let you know how it goes. _ Andy On Mon, Jul 13, 2015 at 2:26 AM, Hick Gunter <hick at scigames.at> wrote: > Consider creating a bitmapped index on the fields you expect to be queried > the most. Bitmaps may be stored and manipulated (combined by logical > operations) very efficiently as long as the field values conform to certain > criteria (usually "managable cardinality of distinct values"). > > Bitmap indices are stored per column as opposed to regular database data > which is stored per row. So a bitmap index on a field (a) with 4 discrete > values (say, NULL, 1, 7 and 31) would be stored as 4 bitmaps (corresponding > to the clauses "IS NULL", "= 1", "= 7" and "=31"). Or a bitmap index for > defined keywords in a text (t) is stored as 1 bitmap for each keyword > (corresponding to "LIKE '%key%'" for each value of key). > > The combination of clauses (e.g. "a NOT NULL AND t LIKE '%key%' ") is > blindingly fast, as the result can be determined for 32 (or even 64) rows > at a time by simple bit arithmetics. And often even faster for bitmaps that > compress sequences of "all 1" and "all 0" (because "all 1" AND A = A, "all > 0" AND A = "all 0", "all 0" OR B = B and "all 1" OR B = "all 1") > > Bitmapped indices are best suited for large collections of mostly static > data that are acquired once, rarely changed and queried a lot (there is a > "fastbit" package developed specifically for searching through high energy > physics experiment results). > > -----Urspr?ngliche Nachricht----- > Von: Scott Robison [mailto:scott at casaderobison.com] > Gesendet: Montag, 13. Juli 2015 07:53 > An: sqlite-users at mailinglists.sqlite.org > Betreff: Re: [sqlite] Suggestions for Fast Set Logic? > > On Sun, Jul 12, 2015 at 9:46 AM, James K. Lowden <jklowden at schemamania.org > > > wrote: > > > On Sat, 11 Jul 2015 19:02:59 -0600 > > Scott Robison <scott at casaderobison.com> wrote: > > > > > > I don't follow you. A complex query is an assemblage of clauses. > > > > Whether or not broken down "to individual sub queries", the search > > > > arguments are the same. They are few in kind, even though they may > > > > be large in number. The analysis for what indexes to create will > > > > be the same either way. In fact, Keith already proposed a > > > > plausible design. > > > > > > I'm just saying that knowing which index to create in advance is > > > impossible if the user can specify arbitrarily complex where clauses. > > > > In that case, I can make your job easier, starting today. You don't > > need to consider query complexity to design your indexes. :-) > > > > {snipped} > > You're making someone else's job easier (the OP), not mine. I'm certain I > could (if I cared to spend the time) find a case where the OP scenario > could be more efficiently performed with a set of individual queries that > are only filtered at the end. I don't know how large the data set would > need to be, or how many columns it would require, or how complex the where > clause would be. Just saying that it (almost certainly) exists. Also saying > that the OP scenario of individual queries isn't absolutely a *bad* way to > use SQLite, but there are many ways to skin this particular cat. > > Note that *I* am not trying to provide this type of functionality. I'm > perfectly content, when using SQLite, to craft specific queries to get the > job done that can rely on one (or at most a few) specific CREATE INDEX > statements. Often the size of data I'm dealing with is small enough that > the difference between full scan and indexed access is irrelevant. > > In any case, certainly the OP has been given plenty to think about and can > probably come up with an efficient way of using SQLite for this particular > use case. Perhaps some of the other suggestions have put the problem in a > different light, in which case monolithic queries can be constructed that > are sufficiently performant. > > -- > Scott Robison > _______________________________________________ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ___________________________________________ > Gunter Hick > Software Engineer > Scientific Games International GmbH > FN 157284 a, HG Wien > Klitschgasse 2-4, A-1130 Vienna, Austria > Tel: +43 1 80100 0 > E-Mail: hick at scigames.at > > This communication (including any attachments) is intended for the use of > the intended recipient(s) only and may contain information that is > confidential, privileged or legally protected. Any unauthorized use or > dissemination of this communication is strictly prohibited. If you have > received this communication in error, please immediately notify the sender > by return e-mail message and delete all copies of the original > communication. Thank you for your cooperation. > > > _______________________________________________ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >