Consider creating a bitmapped index on the fields you expect to be queried the most. Bitmaps may be stored and manipulated (combined by logical operations) very efficiently as long as the field values conform to certain criteria (usually "managable cardinality of distinct values").
Bitmap indices are stored per column as opposed to regular database data which is stored per row. So a bitmap index on a field (a) with 4 discrete values (say, NULL, 1, 7 and 31) would be stored as 4 bitmaps (corresponding to the clauses "IS NULL", "= 1", "= 7" and "=31"). Or a bitmap index for defined keywords in a text (t) is stored as 1 bitmap for each keyword (corresponding to "LIKE '%key%'" for each value of key). The combination of clauses (e.g. "a NOT NULL AND t LIKE '%key%' ") is blindingly fast, as the result can be determined for 32 (or even 64) rows at a time by simple bit arithmetics. And often even faster for bitmaps that compress sequences of "all 1" and "all 0" (because "all 1" AND A = A, "all 0" AND A = "all 0", "all 0" OR B = B and "all 1" OR B = "all 1") Bitmapped indices are best suited for large collections of mostly static data that are acquired once, rarely changed and queried a lot (there is a "fastbit" package developed specifically for searching through high energy physics experiment results). -----Urspr?ngliche Nachricht----- Von: Scott Robison [mailto:scott at casaderobison.com] Gesendet: Montag, 13. Juli 2015 07:53 An: sqlite-users at mailinglists.sqlite.org Betreff: Re: [sqlite] Suggestions for Fast Set Logic? On Sun, Jul 12, 2015 at 9:46 AM, James K. Lowden <jklowden at schemamania.org> wrote: > On Sat, 11 Jul 2015 19:02:59 -0600 > Scott Robison <scott at casaderobison.com> wrote: > > > > I don't follow you. A complex query is an assemblage of clauses. > > > Whether or not broken down "to individual sub queries", the search > > > arguments are the same. They are few in kind, even though they may > > > be large in number. The analysis for what indexes to create will > > > be the same either way. In fact, Keith already proposed a > > > plausible design. > > > > I'm just saying that knowing which index to create in advance is > > impossible if the user can specify arbitrarily complex where clauses. > > In that case, I can make your job easier, starting today. You don't > need to consider query complexity to design your indexes. :-) > {snipped} You're making someone else's job easier (the OP), not mine. I'm certain I could (if I cared to spend the time) find a case where the OP scenario could be more efficiently performed with a set of individual queries that are only filtered at the end. I don't know how large the data set would need to be, or how many columns it would require, or how complex the where clause would be. Just saying that it (almost certainly) exists. Also saying that the OP scenario of individual queries isn't absolutely a *bad* way to use SQLite, but there are many ways to skin this particular cat. Note that *I* am not trying to provide this type of functionality. I'm perfectly content, when using SQLite, to craft specific queries to get the job done that can rely on one (or at most a few) specific CREATE INDEX statements. Often the size of data I'm dealing with is small enough that the difference between full scan and indexed access is irrelevant. In any case, certainly the OP has been given plenty to think about and can probably come up with an efficient way of using SQLite for this particular use case. Perhaps some of the other suggestions have put the problem in a different light, in which case monolithic queries can be constructed that are sufficiently performant. -- Scott Robison _______________________________________________ sqlite-users mailing list sqlite-users at mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___________________________________________ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: hick at scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.