Re: [sqlite] [EXTERNAL] Common subexpression optimization of deterministic functions

Dominique Devienne Thu, 14 Sep 2017 09:21:07 -0700

On Thu, Sep 14, 2017 at 5:38 PM, Warren Young <[email protected]> wrote:


> On Sep 14, 2017, at 8:49 AM, Dominique Devienne <[email protected]>
> wrote:
> >
> > On Thu, Sep 14, 2017 at 4:13 PM, Richard Hipp <[email protected]> wrote:
> >
> >> the amount of extra time spent inside of sqlite3_prepare() in order to
> >> deal with them is not worth the effort.
> >
> > But why not let the client code decide though? And have it off by
> default as now?
>
> Opportunity costs.  Time spent writing, testing, maintaining, and
> documenting this feature is time *not*

spent on features that will either make HWACI’s clients happy or expand
> that set of clients.
>
> If you take money out of it, it’s still time *not* spent writing code that
> makes drh and his collaborators happy.


I've never said otherwise. But my point is that it's not the rational Dr
Hipp used to not supporting this optimization.


>
> > but basing the decision to completely eschew this optimisation on the
> basis that a
> > non-exhaustive sampling of SQL statements shows it's rarely needed seems
> a bit
> > "weird" for lack of a better term on my part.
>
> Are you seriously saying that drh cannot possibly make this decision,
> lacking sufficient information?
>

Where did you read that? DRH is Mr SQLite, of course he has all discretion
to take it wherever he wants.
But Richard cannot possibly see all queries thrown at SQLite, that's just
not possible.


> Obviously he doesn’t see 100% of queries made through SQLite, but he
> probably sees more of them than any other person on the planet.  No person
> is in a better position to make a decision like this.
>

Again, I never said otherwise.


> But hey, if you feel you have a better feel for the truth of the matter,
> working code persuades best.


Even if I had it, DRH wouldn't want it. SQLite is not open to outside
contributions,
and for good reasons Richard re-iterated recently on this list.


>
> > Also, when you write that the query time gains are often offset by
> > additional prepare time
> > overhead (and memory as well, as you mentioned) kinda assumes a bad
> > practice of
> > not using (and caching) prepared statements, i.e. the prepare time
> overhead
> > (1x) could
> > well be worth it for an often-used (prepared/cached) query (Nx, with N
> > large).
>
> Up-thread, drh suggested that in the vast majority of cases he’s examined,
> each prepared statement is used once, then thrown away.

I don’t doubt that he’s seen a representative sample of the way
> applications are commonly written.
>

That's just shoddy DB programming IMHO. Basing any decision on the argument
that *prepared* statements
are used only once is just plain wrong. Or at the very least catering to
the lowest denominator of SQLite users.
By that token, taking this argument to its extreme, we wouldn't need
prepared statements at all...


>
> Doubtless some of those prepared statements could be cached longer, but
> that misses the point: those applications will not benefit from this
> optimization unless their SQLite driver code is rewritten.
>
> Isn’t the whole point of this argument to get more speed for free, without
> rewriting our applications?
>

SQLite driver code? Then you're thinking "scripting code". I don't need any
driver rewritten to use prepared statement, thank you very much.


> This implicit analogy with CSE optimizations in statically-compiled
> programming languages is inapt for that very reason: highly-optimizing
> compilers go to an awful lot of effort to ensure that you don’t have to
> write your code in the most optimal fashion to get the benefit of its
> optimizations.  If you make SQLite’s query parser equivalently smart, you
> may well eat up all the savings in the optimizer.
>
> All the examples I’ve seen attempting to support the value of this feature
> are simple enough that even a naive text compression algorithm could find
> the similarities and “hoist” the copies so the value is computed only
> once.  That means the *human* can also see the CSE and hoist it manually.
>
> The real trick SQLite would need to pull off to make this valuable is
> where mathematical analysis is required to recognize common
> subexpressions.  That’s where your favorite statically-compiled programming
> language gets most of its power, not in handling the trivial cases the
> human sees at first glance.  And that’s why your compiler takes so long to
> run even on a multigigahertz multicore box.
>
> How many milliseconds do the bulk of your SQLite statements run in?  Not
> the really heavy ones, I mean the 2-3 sigma span surrounding the mean.
> Let’s say it’s 1-10 ms.  Let’s also say we can only afford 1% overhead
> before the extra optimizer time spent on fast queries overwhelms the time
> saved on the heavy queries.  That means you’ve only got a few microseconds
> to do all this optimization, else you’ll blow the savings running the
> optimizer.
>
> I’d guess my C++ compiler on -O3 takes something like 100x as long to
> process the same number of bytes of input text as SQLite.  It doesn’t help
> to make SQLite queries prepare() 100x slower on the off chance that it can
> cut the execution time of a few particularly heavy queries in half.
>

First, you're taking that 100x out of your hat. Second, C++ compilers
process very large TUs once you fact in includes, so you may very well (and
probably are) quite wrong on this.

I have the utmost respect for Dr Hipp, and I'm a big fan of SQLite, but
that doesn't mean I wouldn't ask what is in my opinion reasonable remarks.
--DD
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] [EXTERNAL] Common subexpression optimization of deterministic functions

Reply via email to