[sqlite] Suggestions for Fast Set Logic?

Andy Rahn Mon, 20 Jul 2015 17:30:30 -0500

Thanks all for the interesting discussion, you've given me a lot to think
about.


I've worked with SQLite for 10+ years, but I still consider myself
something of newbie compared to the folks here, which is why I asked.  One
thing is clear -- our lack of routine calls to ANALYZE might be a huge
reason I've seen sub-par query times in the past.  I should really try to
rectify that :)

And as far as doing separate queries and combing later, or doing one giant
WHERE clause, I think there is enough discussion here that I might try both
approaches.  It does get a little more complicated (I tried to keep things
simple in my question) because the actual queries involve JOINS and as such
figuring out how to put the query together gets a bit tricky.  I'll let you
know how it goes.


 _ Andy




On Mon, Jul 13, 2015 at 2:26 AM, Hick Gunter <hick at scigames.at> wrote:

> Consider creating a bitmapped index on the fields you expect to be queried
> the most. Bitmaps may be stored and manipulated (combined by logical
> operations) very efficiently as long as the field values conform to certain
> criteria (usually "managable cardinality of distinct values").
>
> Bitmap indices are stored per column as opposed to regular database data
> which is stored per row. So a bitmap index on a field (a) with 4 discrete
> values (say, NULL, 1, 7 and 31) would be stored as 4 bitmaps (corresponding
> to the clauses "IS NULL", "= 1", "= 7" and "=31"). Or a bitmap index for
> defined keywords in a text (t) is stored as 1 bitmap for each keyword
> (corresponding to "LIKE '%key%'" for each value of key).
>
> The combination of clauses (e.g. "a NOT NULL AND t LIKE '%key%' ") is
> blindingly fast, as the result can be determined for 32 (or even 64) rows
> at a time by simple bit arithmetics. And often even faster for bitmaps that
> compress sequences of "all 1" and "all 0" (because "all 1" AND A = A, "all
> 0" AND A = "all 0",  "all 0" OR B = B and "all 1" OR B = "all 1")
>
> Bitmapped indices are best suited for large collections of mostly static
> data that are acquired once, rarely changed and queried a lot (there is a
> "fastbit" package developed specifically for searching through high energy
> physics experiment results).
>
> -----Urspr?ngliche Nachricht-----
> Von: Scott Robison [mailto:scott at casaderobison.com]
> Gesendet: Montag, 13. Juli 2015 07:53
> An: sqlite-users at mailinglists.sqlite.org
> Betreff: Re: [sqlite] Suggestions for Fast Set Logic?
>
> On Sun, Jul 12, 2015 at 9:46 AM, James K. Lowden <jklowden at schemamania.org
> >
> wrote:
>
> > On Sat, 11 Jul 2015 19:02:59 -0600
> > Scott Robison <scott at casaderobison.com> wrote:
> >
> > > > I don't follow you.  A complex query is an assemblage of clauses.
> > > > Whether or not broken down "to individual sub queries", the search
> > > > arguments are the same.  They are few in kind, even though they may
> > > > be large in number.  The analysis for what indexes to create will
> > > > be the same either way.  In fact, Keith already proposed a
> > > > plausible design.
> > >
> > > I'm just saying that knowing which index to create in advance is
> > > impossible if the user can specify arbitrarily complex where clauses.
> >
> > In that case, I can make your job easier, starting today.  You don't
> > need to consider query complexity to design your indexes.  :-)
> >
>
> {snipped}
>
> You're making someone else's job easier (the OP), not mine. I'm certain I
> could (if I cared to spend the time) find a case where the OP scenario
> could be more efficiently performed with a set of individual queries that
> are only filtered at the end. I don't know how large the data set would
> need to be, or how many columns it would require, or how complex the where
> clause would be. Just saying that it (almost certainly) exists. Also saying
> that the OP scenario of individual queries isn't absolutely a *bad* way to
> use SQLite, but there are many ways to skin this particular cat.
>
> Note that *I* am not trying to provide this type of functionality. I'm
> perfectly content, when using SQLite, to craft specific queries to get the
> job done that can rely on one (or at most a few) specific CREATE INDEX
> statements. Often the size of data I'm dealing with is small enough that
> the difference between full scan and indexed access is irrelevant.
>
> In any case, certainly the OP has been given plenty to think about and can
> probably come up with an efficient way of using SQLite for this particular
> use case. Perhaps some of the other suggestions have put the problem in a
> different light, in which case monolithic queries can be constructed that
> are sufficiently performant.
>
> --
> Scott Robison
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___________________________________________
>  Gunter Hick
> Software Engineer
> Scientific Games International GmbH
> FN 157284 a, HG Wien
> Klitschgasse 2-4, A-1130 Vienna, Austria
> Tel: +43 1 80100 0
> E-Mail: hick at scigames.at
>
> This communication (including any attachments) is intended for the use of
> the intended recipient(s) only and may contain information that is
> confidential, privileged or legally protected. Any unauthorized use or
> dissemination of this communication is strictly prohibited. If you have
> received this communication in error, please immediately notify the sender
> by return e-mail message and delete all copies of the original
> communication. Thank you for your cooperation.
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>

[sqlite] Suggestions for Fast Set Logic?

Reply via email to