Re: [sqlite] Query planner regression with FTS4: slower path is chosen
On 2/9/62 16:57, Paul wrote: I has been a while without response, so I just bumping this message. 19 July 2019, 14:21:27, by "Paul" : I have a test case when the regression can be observed in queries that use JOINs with FTS4 tables, somewhere in between 3.22.0 and 3.23.0. For some reason the planner decides to search non-FTS table first then scan the whole FTS table. Version 3.22.0 is the last unaffected, while issue is still present in HEAD. Probably it has something to do with a fact that, according to EXPLAIN, new version of planner ignores LEFT join and considers it just a JOIN. Suspect that that is the change. The LEFT JOIN is equivalent to a regular join in this case due to the "bar = 1" term in the WHERE clause. Running ANALYZE after the index is created in the example script causes SQLite to pick a better plan. Or, changing the LEFT JOIN to CROSS JOIN works to force SQLite to pick the plan you want. FTS5 does a little better with the query, but only because it runs faster - it still picks the slow plan. There might be room to improve this in FTS5, but probably not for FTS3/4, which are now focused very much on backwards compatibility. Dan. At least it feels that way, anyway. Test case: CREATE VIRTUAL TABLE search USING FTS4(text); WITH RECURSIVE cnt(x) AS ( SELECT 1 UNION ALL SELECT x+1 FROM cnt LIMIT 2000 ) INSERT INTO search(docid, text) SELECT x, 'test' || x FROM cnt; CREATE TABLE foo(s_docid integer primary key, bar integer); WITH RECURSIVE cnt(x) AS ( SELECT 1 UNION ALL SELECT x+1 FROM cnt LIMIT 2000 ) INSERT INTO foo(s_docid, bar) SELECT x, 1 FROM cnt; .timer on -- Fast SELECT COUNT() FROM search LEFT JOIN foo ON s_docid = docid WHERE bar = 1 AND search MATCH 'test*'; -- Fast SELECT COUNT() FROM foo WHERE bar = 1 AND s_docid IN ( SELECT docid FROM search WHERE search MATCH 'test*' ); -- Create index, as some real-life queries use searches by `bar` CREATE INDEX foo_bar_idx ON foo (bar); -- Slow SELECT COUNT() FROM search LEFT JOIN foo ON s_docid = docid WHERE bar = 1 AND search MATCH 'test*'; -- As fast as before (current workaround) SELECT COUNT() FROM foo WHERE bar = 1 AND s_docid IN ( SELECT docid FROM search WHERE search MATCH 'test*' ); ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Query planner regression with FTS4: slower path is chosen
I has been a while without response, so I just bumping this message. 19 July 2019, 14:21:27, by "Paul" : > I have a test case when the regression can be observed in queries that > use JOINs with FTS4 tables, somewhere in between 3.22.0 and 3.23.0. > For some reason the planner decides to search non-FTS table first then > scan the whole FTS table. Version 3.22.0 is the last unaffected, while > issue is still present in HEAD. > > Probably it has something to do with a fact that, according to EXPLAIN, > new version of planner ignores LEFT join and considers it just a JOIN. > At least it feels that way, anyway. > > Test case: > > > CREATE VIRTUAL TABLE search USING FTS4(text); > > WITH RECURSIVE > cnt(x) AS ( > SELECT 1 > UNION ALL > SELECT x+1 FROM cnt > LIMIT 2000 > ) > INSERT INTO search(docid, text) SELECT x, 'test' || x FROM cnt; > > CREATE TABLE foo(s_docid integer primary key, bar integer); > > WITH RECURSIVE > cnt(x) AS ( > SELECT 1 > UNION ALL > SELECT x+1 FROM cnt > LIMIT 2000 > ) > INSERT INTO foo(s_docid, bar) SELECT x, 1 FROM cnt; > > .timer on > > -- Fast > SELECT COUNT() FROM search LEFT JOIN foo > ON s_docid = docid > WHERE bar = 1 AND search MATCH 'test*'; > > -- Fast > SELECT COUNT() FROM foo > WHERE bar = 1 > AND s_docid IN ( >SELECT docid FROM search WHERE search MATCH 'test*' > ); > > -- Create index, as some real-life queries use searches by `bar` > CREATE INDEX foo_bar_idx ON foo (bar); > > -- Slow > SELECT COUNT() FROM search LEFT JOIN foo > ON s_docid = docid > WHERE bar = 1 AND search MATCH 'test*'; > > -- As fast as before (current workaround) > SELECT COUNT() FROM foo > WHERE bar = 1 > AND s_docid IN ( >SELECT docid FROM search WHERE search MATCH 'test*' > ); > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Query planner regression with FTS4: slower path is chosen
I have a test case when the regression can be observed in queries that use JOINs with FTS4 tables, somewhere in between 3.22.0 and 3.23.0. For some reason the planner decides to search non-FTS table first then scan the whole FTS table. Version 3.22.0 is the last unaffected, while issue is still present in HEAD. Probably it has something to do with a fact that, according to EXPLAIN, new version of planner ignores LEFT join and considers it just a JOIN. At least it feels that way, anyway. Test case: CREATE VIRTUAL TABLE search USING FTS4(text); WITH RECURSIVE cnt(x) AS ( SELECT 1 UNION ALL SELECT x+1 FROM cnt LIMIT 2000 ) INSERT INTO search(docid, text) SELECT x, 'test' || x FROM cnt; CREATE TABLE foo(s_docid integer primary key, bar integer); WITH RECURSIVE cnt(x) AS ( SELECT 1 UNION ALL SELECT x+1 FROM cnt LIMIT 2000 ) INSERT INTO foo(s_docid, bar) SELECT x, 1 FROM cnt; .timer on -- Fast SELECT COUNT() FROM search LEFT JOIN foo ON s_docid = docid WHERE bar = 1 AND search MATCH 'test*'; -- Fast SELECT COUNT() FROM foo WHERE bar = 1 AND s_docid IN ( SELECT docid FROM search WHERE search MATCH 'test*' ); -- Create index, as some real-life queries use searches by `bar` CREATE INDEX foo_bar_idx ON foo (bar); -- Slow SELECT COUNT() FROM search LEFT JOIN foo ON s_docid = docid WHERE bar = 1 AND search MATCH 'test*'; -- As fast as before (current workaround) SELECT COUNT() FROM foo WHERE bar = 1 AND s_docid IN ( SELECT docid FROM search WHERE search MATCH 'test*' ); ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users