Re: [sqlite] Ordering of fields in a join
Hi - > Adding the warning to the explain plan output should work well. It'd be a good place for it, I'd agree - it'd've saved me a trawl through the documentation and posting to this list! :) Thanks, Hamish ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
Adding the warning to the explain plan output should work well. ... And yet the coding mistake in the SQL query was very subtle. It makes me wonder if we shouldn't somehow come up with a "warning" mechanism in SQLite to give developers a heads-up on error-prone constructs, such as using == between two columns with different default collating sequences... Pavel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On Thu, Oct 11, 2012 at 10:20 AM, Pavel Ivanovwrote: > On Thu, Oct 11, 2012 at 7:09 AM, Hamish Symington > wrote: > >> Note that I was mistaken earlier when I said that "X=Y" and "Y=X" should > >> result in the same answer. If X and Y have different default collating > >> sequences, then X=Y does not mean the same thing as Y=X and so the > answers > >> can be different. > > Right, I understand. Perhaps a brief note on the docs for the query > optimizer to remind people of this might be worth it - it seems slightly > counterintuitive, from a 'normal' user's point of view. > > Usage of appropriate collating function is not an optimizer feature, > it's property of SQLite data types. And it's well documented here > http://www.sqlite.org/datatype3.html (see section 6.1). Whether > optimizer chooses to use index or not depending on collating function > is also well documented. > > And yet the coding mistake in the SQL query was very subtle. It makes me wonder if we shouldn't somehow come up with a "warning" mechanism in SQLite to give developers a heads-up on error-prone constructs, such as using == between two columns with different default collating sequences... > Pavel > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On Thu, Oct 11, 2012 at 7:09 AM, Hamish Symingtonwrote: >> Note that I was mistaken earlier when I said that "X=Y" and "Y=X" should >> result in the same answer. If X and Y have different default collating >> sequences, then X=Y does not mean the same thing as Y=X and so the answers >> can be different. > Right, I understand. Perhaps a brief note on the docs for the query optimizer > to remind people of this might be worth it - it seems slightly > counterintuitive, from a 'normal' user's point of view. Usage of appropriate collating function is not an optimizer feature, it's property of SQLite data types. And it's well documented here http://www.sqlite.org/datatype3.html (see section 6.1). Whether optimizer chooses to use index or not depending on collating function is also well documented. Pavel ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
Hello, > In your schema, SaleItem_SaleUUID collates using NOCASE and Picture_UUID > collates using BINARY. I think you mean SaleItem_PictureUUID, not SaleItem_SaleUUID, but yes. > Note that I was mistaken earlier when I said that "X=Y" and "Y=X" should > result in the same answer. If X and Y have different default collating > sequences, then X=Y does not mean the same thing as Y=X and so the answers > can be different. Right, I understand. Perhaps a brief note on the docs for the query optimizer to remind people of this might be worth it - it seems slightly counterintuitive, from a 'normal' user's point of view. Many thanks for the help - much appreciated. Best wishes, Hamish ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On Wed, Oct 10, 2012 at 10:53 AM, Hamish Symington < ham...@lightbluesoftware.com> wrote: > > If I perform the query > > SELECT SaleItem.*, Picture_FileName FROM SaleItem LEFT JOIN Picture ON > SaleItem_PictureUUID=Picture_UUID WHERE SaleItem_SaleUUID = 'DAB8FE97- > D308-4809-B496-E55142DC05B5' > > the query is slow (c. 11 sec) > > but if I perform the exact same query but with the order of the terms in > the ON part switched > > SELECT SaleItem.*, Picture_FileName FROM SaleItem LEFT JOIN Picture ON > Picture_UUID=SaleItem_PictureUUID WHERE SaleItem_SaleUUID = 'DAB8FE97- > D308-4809-B496-E55142DC05B5' > > then the query is near instant. > When you are comparing two columns that have different collating sequences, the collating sequence of the left operand is used for the comparison. See http://www.sqlite.org/datatype3.html#collation for additional information. In your schema, SaleItem_SaleUUID collates using NOCASE and Picture_UUID collates using BINARY. Hence, the second query above (the fast query) uses BINARY for comparing the two values in the ON clause and is thus able to use the BINARY index that exists on Picture_UUID: CREATE INDEX idx_Picture_UUID ON Picture(Picture_UUID); The first query above (the slow query) compares the two values using NOCASE, and is thus unable to use the idx_Picture_UUID index. If you change your index to be: CREATE INDEX idx_Picture_UUID ON Picture(Picture_UUID COLLATE nocase); Then the first query will become fast and the second one will be slow. You can create two separate indices, one with COLLATE BINARY and the other with COLLATE NOCASE, and then both queries will be fast. Note that I was mistaken earlier when I said that "X=Y" and "Y=X" should result in the same answer. If X and Y have different default collating sequences, then X=Y does not mean the same thing as Y=X and so the answers can be different. You can add a COLLATE clause after a comparison operator to explicitly state the collating sequence to be used. For example, "(X=Y collate binary)" instead of just "X=Y". -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On Thu, Oct 11, 2012 at 3:13 AM, Hamish Symington < ham...@lightbluesoftware.com> wrote: > > > If you run ANALYZE on your database, SQLite will have more information > with > > which to estimate the run-time of each plan, and is more likely to choose > > the faster on. At the very least, it much less likely for the estimated > > run-times to tie. Even better is if you compile with SQLITE_ENABLE_STAT3 > > and then run ANALYZE. > ANALYZE run; still the same result. > Please send me your complete schema and the content of the sqlite_stat1 and sqlite_stat3 tables and I will have a look. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
Hello, > Perhaps try the forthcoming 3.7.15 preview: > http://www.sqlite.org/sqlite3-20121009.zip > See "Please test the latest SQLite enhancements" for details: > http://comments.gmane.org/gmane.comp.db.sqlite.general/77259 Still present in that, as well. I should note (from further investigation) that this only occurs when COLLATE NOCASE is set on the fields I am joining on. My original CREATE TABLE statements should have read: CREATE TABLE SaleItem (SaleItem_ID integer NOT NULL PRIMARY KEY, SaleItem_UUID varchar NOT NULL DEFAULT '' COLLATE NOCASE, SaleItem_SaleUUID varchar NOT NULL DEFAULT '' COLLATE NOCASE, SaleItem_PictureUUID varchar NOT NULL DEFAULT '' COLLATE NOCASE); and CREATE TABLE Picture (Picture_ID integer NOT NULL PRIMARY KEY, Picture_UUID varchar NOT NULL DEFAULT '' COLLATE NOCASE, Picture_FileName varchar NOT NULL DEFAULT '' COLLATE NOCASE); I would speculate that the presence of a COLLATE is making it (incorrectly) believe that = is somehow no longer a symmetric relation (i.e. a = b doesn’t imply b = a). Best wishes, Hamish ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
Hi - >> I was under the impression that the order of fields in the ON clause >> doesn't matter. Is this impression incorrect? > It doesn't matter for the answer. You get the same result either way, > right? Just one way is faster than the other. True, I do get the same result. > If interchanging the order of terms in an ON clause makes a performance > difference, that means that there are (at least) two possible ways of > evaluating the query. SQLite is looking at both possible plans and > estimating the run-time for each, and it is coming up with the same > estimate in both cases - a tie. The ambiguity is resolved in favor of the > plan it looked at first. And the order in which the plans are examined is > probably determined (to some extent) by the order of the terms in the ON > clause. I'd be surprised if the engine was evaluating the same time for each, given that the slower one is a good few hundred times slower - unless there's something about evaluation which I've fundamentally misunderstood. > If you run ANALYZE on your database, SQLite will have more information with > which to estimate the run-time of each plan, and is more likely to choose > the faster on. At the very least, it much less likely for the estimated > run-times to tie. Even better is if you compile with SQLITE_ENABLE_STAT3 > and then run ANALYZE. ANALYZE run; still the same result. Thanks, Hamish > > > -- > D. Richard Hipp > d...@sqlite.org > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Hamish Symington ham...@lightbluesoftware.com Twitter: @lighterblue Winner of the 2009 FileMaker CubeAwards 'Best Private Sector Solution' award Light Blue Software is a company registered in England and Wales, number 6671025. Our registered office is 101 Teversham Drift, Cambridge CB1 3LL. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On Oct 10, 2012, at 7:30 PM, Hamish Symingtonwrote: >>> I was under the impression that the order of fields in the ON clause >>> doesn't matter. Is this impression incorrect? >> >> You should be correct, and the latest version of SQLite has some changes in >> which look relevant to this issue. Can you check which version of SQLite >> you're using and possibly download the latest version and see if this fixes >> your issue ? > > It doesn't - still present in 3.7.14.1. Perhaps try the forthcoming 3.7.15 preview: http://www.sqlite.org/sqlite3-20121009.zip See "Please test the latest SQLite enhancements" for details: http://comments.gmane.org/gmane.comp.db.sqlite.general/77259 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On Wed, Oct 10, 2012 at 10:53 AM, Hamish Symington < ham...@lightbluesoftware.com> wrote: > > I was under the impression that the order of fields in the ON clause > doesn't matter. Is this impression incorrect? > It doesn't matter for the answer. You get the same result either way, right? Just one way is faster than the other. If interchanging the order of terms in an ON clause makes a performance difference, that means that there are (at least) two possible ways of evaluating the query. SQLite is looking at both possible plans and estimating the run-time for each, and it is coming up with the same estimate in both cases - a tie. The ambiguity is resolved in favor of the plan it looked at first. And the order in which the plans are examined is probably determined (to some extent) by the order of the terms in the ON clause. If you run ANALYZE on your database, SQLite will have more information with which to estimate the run-time of each plan, and is more likely to choose the faster on. At the very least, it much less likely for the estimated run-times to tie. Even better is if you compile with SQLITE_ENABLE_STAT3 and then run ANALYZE. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
>> I was under the impression that the order of fields in the ON clause doesn't >> matter. Is this impression incorrect? > > You should be correct, and the latest version of SQLite has some changes in > which look relevant to this issue. Can you check which version of SQLite > you're using and possibly download the latest version and see if this fixes > your issue ? It doesn't - still present in 3.7.14.1. Thanks, Hamish ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Ordering of fields in a join
On 10 Oct 2012, at 3:53pm, Hamish Symingtonwrote: > I was under the impression that the order of fields in the ON clause doesn't > matter. Is this impression incorrect? You should be correct, and the latest version of SQLite has some changes in which look relevant to this issue. Can you check which version of SQLite you're using and possibly download the latest version and see if this fixes your issue ? Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Ordering of fields in a join
Hello, I've made a new database with the latest version of SQLite, and added two tables, created thus: CREATE TABLE SaleItem (SaleItem_ID integer NOT NULL PRIMARY KEY, SaleItem_UUID varchar NOT NULL DEFAULT '', SaleItem_SaleUUID varchar NOT NULL DEFAULT '', SaleItem_PictureUUID varchar NOT NULL DEFAULT '' COLLATE NOCASE); and CREATE TABLE Picture (Picture_ID integer NOT NULL PRIMARY KEY, Picture_UUID varchar NOT NULL DEFAULT '', Picture_FileName varchar NOT NULL DEFAULT '' COLLATE NOCASE); I also have three indexes - one each on SaleItem_UUID, Picture_UUID and SaleItem_PictureUUID. In SaleItem, I have 8500 rows. In Picture, I have about 1 million. For the sake of ease, let's say that every four SaleItem rows have the same SaleItem_SaleUUID, and that each SaleItem has a different, random Picture_UUID in its SaleItem_PictureUUID field. If I perform the query SELECT SaleItem.*, Picture_FileName FROM SaleItem LEFT JOIN Picture ON SaleItem_PictureUUID=Picture_UUID WHERE SaleItem_SaleUUID = 'DAB8FE97- D308-4809-B496-E55142DC05B5' the query is slow (c. 11 sec) but if I perform the exact same query but with the order of the terms in the ON part switched SELECT SaleItem.*, Picture_FileName FROM SaleItem LEFT JOIN Picture ON Picture_UUID=SaleItem_PictureUUID WHERE SaleItem_SaleUUID = 'DAB8FE97- D308-4809-B496-E55142DC05B5' then the query is near instant. Doing an EXPLAIN QUERY PLAN gives me 0 0 0 SEARCH TABLE SaleItem USING INDEX idx_SaleItem_PictureUUID (SaleItem_SaleUUID=?) (~10 rows) 0 1 1 SEARCH TABLE Picture USING AUTOMATIC COVERING INDEX (Picture_UUID=?) (~10 rows) for the slow one and 0 0 0 SEARCH TABLE SaleItem USING INDEX idx_SaleItem_PictureUUID (SaleItem_SaleUUID=?) (~10 rows) 0 1 1 SEARCH TABLE Picture USING INDEX idx_Picture_UUID (Picture_UUID=?) (~10 rows) for the fast one. I was under the impression that the order of fields in the ON clause doesn't matter. Is this impression incorrect? Thanks, Hamish ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users