On 2017/09/06 8:26 PM, Nico Williams wrote:
On Wed, Sep 06, 2017 at 08:05:57PM +0200, R Smith wrote:
-- Another interesting thing to note: The INTERSECT test produces ORDERED
-- output, which suggests that an ORDER-BY addition to the query would
-- favour the INTERSECT method.
Nothing about INTERSECT requires it to produce ordered output.
No, and it was not suggested, it was just noted that it does, suggesting
that it could be the more performant choice when adding an ORDER BY
clause, which turned out to not only be true in terms of being the
better choice, but also that it itself sped up by simply adding the
ORDER BY clause as was demonstrated in Test 6.
Nothing about the JOIN case makes it not possible to produce ordered
output by accident.
Yet it doesn't seem to by accident, which would suggest that an ORDER BY
clause when added to the JOIN statements would incur an additional time
penalty for having to actually order the results - Yet, as again
demonstrated in Test 6, the ORDER BY actually sped up the JOIN query too
(perhaps via forcing the Index earlier or used in a different way) -
which was most interesting, and, as you noted, there is nothing about
the JOIN that precludes it from having ordered output, so this
optimization might be worthwhile.
You'll want to re-measure with an ORDER BY added.
I did. It was done in Test 6. It showed significantly interesting
results. Was my explanation lacking in clarity or did it fall down the
TLDR; rabbit hole? :)
In any case, this is quite interesting. Many uses of JOIN are not
merely to filter results, but to construct joined result rows -- such
uses of JOIN cannot be optimized by using INTERSECT. But for
filter-uses of JOIN... this might be a useful optimization for the
engine to learn.
I agree, and not only the INTERSECT optimization but the tests suggest
adding a silent ORDER BY would also be an optimization, though not sure
if the effort-to-pleasure ratio is low enough yet. Perhaps if re-doing
the tests with tables using several more non-Integer columns to see if
the optimization could be generalized across all kinds of data in some
way. I might pursue this later when I have some time.
sqlite-users mailing list