Consider creating a bitmapped index on the fields you expect to be queried the 
most. Bitmaps may be stored and manipulated (combined by logical operations) 
very efficiently as long as the field values conform to certain criteria 
(usually "managable cardinality of distinct values").

Bitmap indices are stored per column as opposed to regular database data which 
is stored per row. So a bitmap index on a field (a) with 4 discrete values 
(say, NULL, 1, 7 and 31) would be stored as 4 bitmaps (corresponding to the 
clauses "IS NULL", "= 1", "= 7" and "=31"). Or a bitmap index for defined 
keywords in a text (t) is stored as 1 bitmap for each keyword (corresponding to 
"LIKE '%key%'" for each value of key).

The combination of clauses (e.g. "a NOT NULL AND t LIKE '%key%' ") is 
blindingly fast, as the result can be determined for 32 (or even 64) rows at a 
time by simple bit arithmetics. And often even faster for bitmaps that compress 
sequences of "all 1" and "all 0" (because "all 1" AND A = A, "all 0" AND A = 
"all 0",  "all 0" OR B = B and "all 1" OR B = "all 1")

Bitmapped indices are best suited for large collections of mostly static data 
that are acquired once, rarely changed and queried a lot (there is a "fastbit" 
package developed specifically for searching through high energy physics 
experiment results).

-----Urspr?ngliche Nachricht-----
Von: Scott Robison [mailto:scott at casaderobison.com]
Gesendet: Montag, 13. Juli 2015 07:53
An: sqlite-users at mailinglists.sqlite.org
Betreff: Re: [sqlite] Suggestions for Fast Set Logic?

On Sun, Jul 12, 2015 at 9:46 AM, James K. Lowden <jklowden at schemamania.org>
wrote:

> On Sat, 11 Jul 2015 19:02:59 -0600
> Scott Robison <scott at casaderobison.com> wrote:
>
> > > I don't follow you.  A complex query is an assemblage of clauses.
> > > Whether or not broken down "to individual sub queries", the search
> > > arguments are the same.  They are few in kind, even though they may
> > > be large in number.  The analysis for what indexes to create will
> > > be the same either way.  In fact, Keith already proposed a
> > > plausible design.
> >
> > I'm just saying that knowing which index to create in advance is
> > impossible if the user can specify arbitrarily complex where clauses.
>
> In that case, I can make your job easier, starting today.  You don't
> need to consider query complexity to design your indexes.  :-)
>

{snipped}

You're making someone else's job easier (the OP), not mine. I'm certain I
could (if I cared to spend the time) find a case where the OP scenario
could be more efficiently performed with a set of individual queries that
are only filtered at the end. I don't know how large the data set would
need to be, or how many columns it would require, or how complex the where
clause would be. Just saying that it (almost certainly) exists. Also saying
that the OP scenario of individual queries isn't absolutely a *bad* way to
use SQLite, but there are many ways to skin this particular cat.

Note that *I* am not trying to provide this type of functionality. I'm
perfectly content, when using SQLite, to craft specific queries to get the
job done that can rely on one (or at most a few) specific CREATE INDEX
statements. Often the size of data I'm dealing with is small enough that
the difference between full scan and indexed access is irrelevant.

In any case, certainly the OP has been given plenty to think about and can
probably come up with an efficient way of using SQLite for this particular
use case. Perhaps some of the other suggestions have put the problem in a
different light, in which case monolithic queries can be constructed that
are sufficiently performant.

--
Scott Robison
_______________________________________________
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


Reply via email to