Re: [sqlite] Optimization opportunity
Am Fri, 14 Apr 2017 15:14:12 -0400 schrieb Richard Hipp: > But I've spent Good Friday working around it. A thousand thanks! :-) > Please try using the tip of the left-join-view branch > (https://www.sqlite.org/src/timeline?c=left-join-view) and let me know > if that version works better for you. After some additional testing, > this optimization will likely be merge to trunk and appear in the next > release. Your beta-testing is important - Thanks. Sadly my C skills are just good enough to compile the amalgamation. But as soon as the beta amalgamation will be available, I'll test it intensely, promised. Thanks again & happy Easter! :-) Wolfgang ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Optimization opportunity
On 4/14/17, Wolfgang Enzingerwrote: > > Thank you Richard. I have to admit that it took me quite a while and also > reading the comment for check-in [1838a59c] several times to really > understand your explanation. Duh, that's tricky indeed! > But I've spent Good Friday working around it. Please try using the tip of the left-join-view branch (https://www.sqlite.org/src/timeline?c=left-join-view) and let me know if that version works better for you. After some additional testing, this optimization will likely be merge to trunk and appear in the next release. Your beta-testing is important - Thanks. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Optimization opportunity
Am Fri, 14 Apr 2017 10:59:25 -0400 schrieb Richard Hipp: > Performing this rewrite of a view into a simple LEFT JOIN is trickier > than it seems at first glance. The rewrite works for the example you > provide. But subtle changes to the view can make the rewrite invalid. > For example: > > CREATE VIEW z AS SELECT >fk, > coalesce(flags&1,0) AS odd, -- Added coalesce() > (flags&2)>>1 AS even, > (flags&4)>>2 AS prime > FROM y; > > The addition of the coalesce() function on one of the result columns > of the view means that a transformation such as you suggest will give > a different (and incorrect) answer. This is just one of many examples > of the subtle pitfalls involved in trying to convert a LEFT JOIN into > a form that can make better use of indexes. Thank you Richard. I have to admit that it took me quite a while and also reading the comment for check-in [1838a59c] several times to really understand your explanation. Duh, that's tricky indeed! Wolfgang ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Optimization opportunity
On 4/14/17, Wolfgang Enzingerwrote: > > CREATE VIEW z AS SELECT > fk, > (flags&1) AS odd, > (flags&2)>>1 AS even, > (flags&4)>>2 AS prime > FROM y; > > Now using the VIEW z in a JOIN results in a full table scan on TABLE y > despite a WHERE clause and an appropriate INDEX: > > EXPLAIN QUERY PLAN SELECT x.pk,z.odd,z.even,z.prime > FROM x LEFT JOIN z ON x.pk=z.fk WHERE x.pk=2; > > Bypassing the VIEW however uses INDEX yy: > > EXPLAIN QUERY PLAN > SELECT x.pk,(y.flags&1) AS odd,(y.flags&2)>>1 AS even,(y.flags&4)>>2 AS > prime FROM x LEFT JOIN y ON x.pk=y.fk WHERE x.pk=2; > Performing this rewrite of a view into a simple LEFT JOIN is trickier than it seems at first glance. The rewrite works for the example you provide. But subtle changes to the view can make the rewrite invalid. For example: CREATE VIEW z AS SELECT fk, coalesce(flags&1,0) AS odd, -- Added coalesce() (flags&2)>>1 AS even, (flags&4)>>2 AS prime FROM y; The addition of the coalesce() function on one of the result columns of the view means that a transformation such as you suggest will give a different (and incorrect) answer. This is just one of many examples of the subtle pitfalls involved in trying to convert a LEFT JOIN into a form that can make better use of indexes. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Optimization opportunity
Hello ! Maybe this problem would be the reason of getting bad query plans when joining views too. Cheers ! On 14/04/17 08:03, Wolfgang Enzinger wrote: Hello, given the following: CREATE TABLE x( pk INTEGER PRIMARY KEY, description TEXT ); CREATE TABLE y( fk INTEGER REFERENCES x(pk), flags INTEGER ); CREATE INDEX yy ON y(fk); CREATE VIEW z AS SELECT fk, (flags&1) AS odd, (flags&2)>>1 AS even, (flags&4)>>2 AS prime FROM y; INSERT INTO x(pk,description) VALUES (1,'one'),(2,'two'),(3,'three'),(4,'four'); INSERT INTO y(fk,flags) VALUES (1,1|0|0),(2,0|2|4),(3,1|0|4),(4,0|2|0); Now using the VIEW z in a JOIN results in a full table scan on TABLE y despite a WHERE clause and an appropriate INDEX: EXPLAIN QUERY PLAN SELECT x.pk,z.odd,z.even,z.prime FROM x LEFT JOIN z ON x.pk=z.fk WHERE x.pk=2; 1|0|0|SCAN TABLE y 0|0|0|SEARCH TABLE x USING INTEGER PRIMARY KEY (rowid=?) 0|1|1|SCAN SUBQUERY 1 Bypassing the VIEW however uses INDEX yy: EXPLAIN QUERY PLAN SELECT x.pk,(y.flags&1) AS odd,(y.flags&2)>>1 AS even,(y.flags&4)>>2 AS prime FROM x LEFT JOIN y ON x.pk=y.fk WHERE x.pk=2; 0|0|0|SEARCH TABLE x USING INTEGER PRIMARY KEY (rowid=?) 0|1|1|SEARCH TABLE y USING INDEX yy (fk=?) Unless I'm missing something, I think there is a potential optimization opportunity. Identical results with SQLite versions 3.13, 3.17 and 3.18. Cheers, Wolfgang ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Optimization opportunity
Hello, given the following: CREATE TABLE x( pk INTEGER PRIMARY KEY, description TEXT ); CREATE TABLE y( fk INTEGER REFERENCES x(pk), flags INTEGER ); CREATE INDEX yy ON y(fk); CREATE VIEW z AS SELECT fk, (flags&1) AS odd, (flags&2)>>1 AS even, (flags&4)>>2 AS prime FROM y; INSERT INTO x(pk,description) VALUES (1,'one'),(2,'two'),(3,'three'),(4,'four'); INSERT INTO y(fk,flags) VALUES (1,1|0|0),(2,0|2|4),(3,1|0|4),(4,0|2|0); Now using the VIEW z in a JOIN results in a full table scan on TABLE y despite a WHERE clause and an appropriate INDEX: EXPLAIN QUERY PLAN SELECT x.pk,z.odd,z.even,z.prime FROM x LEFT JOIN z ON x.pk=z.fk WHERE x.pk=2; 1|0|0|SCAN TABLE y 0|0|0|SEARCH TABLE x USING INTEGER PRIMARY KEY (rowid=?) 0|1|1|SCAN SUBQUERY 1 Bypassing the VIEW however uses INDEX yy: EXPLAIN QUERY PLAN SELECT x.pk,(y.flags&1) AS odd,(y.flags&2)>>1 AS even,(y.flags&4)>>2 AS prime FROM x LEFT JOIN y ON x.pk=y.fk WHERE x.pk=2; 0|0|0|SEARCH TABLE x USING INTEGER PRIMARY KEY (rowid=?) 0|1|1|SEARCH TABLE y USING INDEX yy (fk=?) Unless I'm missing something, I think there is a potential optimization opportunity. Identical results with SQLite versions 3.13, 3.17 and 3.18. Cheers, Wolfgang ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Optimization Opportunity?
Op 13 mrt 2015, om 00:03 heeft Wolfgang Enzinger het volgende geschreven: > Am Sun, 8 Mar 2015 14:06:51 +0100 schrieb E.Pasma: > >> Actually query one appears slightly faster, >> Searching the PK index is faster as that is always a COVERING index. > > I was under the impression that the opposite is true, but I wasn't > sure > about that. > >> From the secunsary indexes only a part oh the key is used. >> Note there is not much use on adding PK as second column in the >> additional indexes. It is there anyway a a pointer to the row. > > You're right, that index doesn't make much sense; in my real > application it > looks different, what I was showing here was just an example (one > that was > not very well thought of, obviously). > >> I agree that it is strange that the execution plan for the two >> queries >> is different, After EXISTS the optimizer might ignore the expression >> in the select part of the sub-query. And Query one looks better as it >> soes not mention any column names. Personally I'd write SELECT NULL >> instead of SELECT *. > > I prefer "SELECT 1 ...", like in Gunter's post. But that's a matter of > taste, of course. > > Well, my actual point was that the query planner seems to > unnecessarily > visit the table row in order to read a column value that will be > discarded > lateron anyway, and that this could probably be optimized out > automatically. But my point is obsolete of course when the way it is > right > now is the faster one. Then again, it's not quite clear why this very > strategy is *not* chosen when "SELECT 1 ..." or similar is being > used. Not > a big deal indeed, just curious. > >> If speed matters instead of EXIST you can use IN and a list sub- >> query. >> This is superfast now: >> >> SELECT a1 FROM a WHERE a1 in (SELECT b.a1 FROM b INNER JOIN c >> USING(b1) WHERE c.c1=222); >> >> 0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?) >> 0|0|0|EXECUTE LIST SUBQUERY 1 >> 1|0|1|SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) >> 1|1|0|SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) > > I avoided IN for a long time, but that must originate from the time > when I > mostly used Jet (Access) file databases ... with SQLite, it's really > fast > indeed. > > Wolfgang Excuses Wolfgang, my impression that the PK is automatically covering was wrong. I like to make an other correction: >> After EXISTS the optimizer might ignore the expression >> in the select part of the sub-query. This is not true if the expression is an aggrehate function. It remains true that there seems to be a optimization opportunity for query 2.
[sqlite] Optimization Opportunity?
Am Sun, 8 Mar 2015 14:06:51 +0100 schrieb E.Pasma: > Actually query one appears slightly faster, > Searching the PK index is faster as that is always a COVERING index. I was under the impression that the opposite is true, but I wasn't sure about that. > From the secunsary indexes only a part oh the key is used. > Note there is not much use on adding PK as second column in the > additional indexes. It is there anyway a a pointer to the row. You're right, that index doesn't make much sense; in my real application it looks different, what I was showing here was just an example (one that was not very well thought of, obviously). > I agree that it is strange that the execution plan for the two queries > is different, After EXISTS the optimizer might ignore the expression > in the select part of the sub-query. And Query one looks better as it > soes not mention any column names. Personally I'd write SELECT NULL > instead of SELECT *. I prefer "SELECT 1 ...", like in Gunter's post. But that's a matter of taste, of course. Well, my actual point was that the query planner seems to unnecessarily visit the table row in order to read a column value that will be discarded lateron anyway, and that this could probably be optimized out automatically. But my point is obsolete of course when the way it is right now is the faster one. Then again, it's not quite clear why this very strategy is *not* chosen when "SELECT 1 ..." or similar is being used. Not a big deal indeed, just curious. > If speed matters instead of EXIST you can use IN and a list sub-query. > This is superfast now: > > SELECT a1 FROM a WHERE a1 in (SELECT b.a1 FROM b INNER JOIN c > USING(b1) WHERE c.c1=222); > > 0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?) > 0|0|0|EXECUTE LIST SUBQUERY 1 > 1|0|1|SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) > 1|1|0|SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) I avoided IN for a long time, but that must originate from the time when I mostly used Jet (Access) file databases ... with SQLite, it's really fast indeed. Wolfgang
[sqlite] Optimization Opportunity?
>> Actually query one appears slightly faster, >> Searching the PK index is faster as that is always a COVERING index. > >I was under the impression that the opposite is true, but I wasn't sure >about that. The primary key is only a covering index if you are only accessing fields comprising the primary key, or if the primary key is the rowid (in which case the primary key is the rowid of the entry in the table -- in which case a scan of the primary key index is the same as a scan of the table) or if the table was declared as without_rowid. >> From the secunsary indexes only a part oh the key is used. >> Note there is not much use on adding PK as second column in the >> additional indexes. It is there anyway a a pointer to the row. >You're right, that index doesn't make much sense; in my real application >it looks different, what I was showing here was just an example (one that >was not very well thought of, obviously). No. The rowid is contained in the index anyway and that is the primary key if and only if the table is declared such that the rowid is the primary key. Alternate keys (such as declare by any other method) are *NOT* included in any other index. If your primary key is declared on columns, then the rowid is what is stored in the index in addition to the named columns, in order to locate the table row. Unless of course your table is declared without rowid in which case the primary key is the declared key, not the rowid, and the declared primary key is stored in every index to allow you to located the corresponding table row. --- Theory is when you know everything but nothing works. Practice is when everything works but no one knows why. Sometimes theory and practice are combined: nothing works and no one knows why.
[sqlite] Optimization Opportunity?
I personally would use "... EXISTS ( SELECT 1 ...", which requires no extra columns to be acessed at all. -Urspr?ngliche Nachricht- Von: Wolfgang Enzinger [mailto:sqlite at enzinger.net] Gesendet: Samstag, 07. M?rz 2015 19:25 An: sqlite-users at mailinglists.sqlite.org Betreff: [sqlite] Optimization Opportunity? Hi dev team, not sure if this is actually a useful hint, but ... CREATE TABLE a(a1 INTEGER PRIMARY KEY); INSERT INTO a VALUES (1),(2),(3); CREATE TABLE b(a1 INTEGER REFERENCES a(a1),b1 INTEGER PRIMARY KEY); INSERT INTO b VALUES (1,11),(2,22),(3,33); CREATE UNIQUE INDEX b_ui ON b(a1,b1); CREATE TABLE c(b1 INTEGER REFERENCES b(b1),c1 INTEGER PRIMARY KEY,c2 TEXT); INSERT INTO c VALUES (11,111,'a'),(22,222,'b'),(33,333,'c'); CREATE UNIQUE INDEX c_ui ON c(b1,c1); ANALYZE; Query 1: EXPLAIN QUERY PLAN SELECT a1 FROM a WHERE EXISTS(SELECT * FROM b INNER JOIN c USING(b1) WHERE b.a1=a.a1 AND c.c1=222); selectidorder fromdetail 0 0 0 SCAN TABLE a 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 1 0 1 SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) 1 1 0 SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) Query 2: EXPLAIN QUERY PLAN SELECT a1 FROM a WHERE EXISTS(SELECT c1 FROM b INNER JOIN c USING(b1) WHERE b.a1=a.a1 AND c.c1=222); selectidorder fromdetail 0 0 0 SCAN TABLE a 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 1 0 0 SEARCH TABLE b USING COVERING INDEX b_ui (a1=?) 1 1 1 SEARCH TABLE c USING COVERING INDEX c_ui (b1=?) Note that the only difference between the two is "SELECT *" vs. "SELECT c1" within the EXISTS-block. The result is the same in both cases, however the second query uses COVERING INDEXes which should be more efficient (as far as I know). HTH; and sorry for the noise if not. Wolfgang ___ sqlite-users mailing list sqlite-users at mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: hick at scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.
[sqlite] Optimization Opportunity?
Op 7 mrt 2015, om 19:24 heeft Wolfgang Enzinger het volgende geschreven: > Hi dev team, > > not sure if this is actually a useful hint, but ... > > CREATE TABLE a(a1 INTEGER PRIMARY KEY); > INSERT INTO a VALUES (1),(2),(3); > CREATE TABLE b(a1 INTEGER REFERENCES a(a1),b1 INTEGER PRIMARY KEY); > INSERT INTO b VALUES (1,11),(2,22),(3,33); > CREATE UNIQUE INDEX b_ui ON b(a1,b1); > CREATE TABLE c(b1 INTEGER REFERENCES b(b1),c1 INTEGER PRIMARY KEY,c2 > TEXT); > INSERT INTO c VALUES (11,111,'a'),(22,222,'b'),(33,333,'c'); > CREATE UNIQUE INDEX c_ui ON c(b1,c1); > ANALYZE; > > Query 1: > > EXPLAIN QUERY PLAN > SELECT a1 FROM a WHERE EXISTS(SELECT * FROM b INNER JOIN c USING(b1) > WHERE > b.a1=a.a1 AND c.c1=222); > > selectid order fromdetail > 0 0 0 SCAN TABLE a > 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 > 1 0 1 SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) > 1 1 0 SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) > > Query 2: > > EXPLAIN QUERY PLAN > SELECT a1 FROM a WHERE EXISTS(SELECT c1 FROM b INNER JOIN c > USING(b1) WHERE > b.a1=a.a1 AND c.c1=222); > > selectid order fromdetail > 0 0 0 SCAN TABLE a > 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 > 1 0 0 SEARCH TABLE b USING COVERING INDEX b_ui (a1=?) > 1 1 1 SEARCH TABLE c USING COVERING INDEX c_ui (b1=?) > > Note that the only difference between the two is "SELECT *" vs. > "SELECT c1" > within the EXISTS-block. The result is the same in both cases, > however the > second query uses COVERING INDEXes which should be more efficient > (as far > as I know). > > HTH; and sorry for the noise if not. > > Wolfgang Hello, as yiou gave a very clear example of the case, I dare to reply. Actually query one appears slightly faster, Searching the PK index is faster as that is always a COVERING index. From the secunsary indexes only a part oh the key is used. Note there is not much use on adding PK as second column in the additional indexes. It is there anyway a a pointer to the row. I agree that it is strange that the execution plan for the two queries is different, After EXISTS the optimizer might ignore the expression in the select part of the sub-query. And Query one looks better as it soes not mention any column names. Personally I'd write SELECT NULL instead of SELECT *. If speed matters instead of EXIST you can use IN and a list sub-query. This is superfast now: SELECT a1 FROM a WHERE a1 in (SELECT b.a1 FROM b INNER JOIN c USING(b1) WHERE c.c1=222); 0|0|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?) 0|0|0|EXECUTE LIST SUBQUERY 1 1|0|1|SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) 1|1|0|SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) Edzard Pasma
[sqlite] Optimization Opportunity?
Hi dev team, not sure if this is actually a useful hint, but ... CREATE TABLE a(a1 INTEGER PRIMARY KEY); INSERT INTO a VALUES (1),(2),(3); CREATE TABLE b(a1 INTEGER REFERENCES a(a1),b1 INTEGER PRIMARY KEY); INSERT INTO b VALUES (1,11),(2,22),(3,33); CREATE UNIQUE INDEX b_ui ON b(a1,b1); CREATE TABLE c(b1 INTEGER REFERENCES b(b1),c1 INTEGER PRIMARY KEY,c2 TEXT); INSERT INTO c VALUES (11,111,'a'),(22,222,'b'),(33,333,'c'); CREATE UNIQUE INDEX c_ui ON c(b1,c1); ANALYZE; Query 1: EXPLAIN QUERY PLAN SELECT a1 FROM a WHERE EXISTS(SELECT * FROM b INNER JOIN c USING(b1) WHERE b.a1=a.a1 AND c.c1=222); selectidorder fromdetail 0 0 0 SCAN TABLE a 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 1 0 1 SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?) 1 1 0 SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?) Query 2: EXPLAIN QUERY PLAN SELECT a1 FROM a WHERE EXISTS(SELECT c1 FROM b INNER JOIN c USING(b1) WHERE b.a1=a.a1 AND c.c1=222); selectidorder fromdetail 0 0 0 SCAN TABLE a 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 1 0 0 SEARCH TABLE b USING COVERING INDEX b_ui (a1=?) 1 1 1 SEARCH TABLE c USING COVERING INDEX c_ui (b1=?) Note that the only difference between the two is "SELECT *" vs. "SELECT c1" within the EXISTS-block. The result is the same in both cases, however the second query uses COVERING INDEXes which should be more efficient (as far as I know). HTH; and sorry for the noise if not. Wolfgang