Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
Ah.. okay - you did not mention tables t1 and t2 were in different database files. That is likely why this case was never tested. Explicit cross joins are the way to go - you know the data better than the database. Read about SQLite's CROSS JOIN logic here: http://www.sqlite.org/cvstrac/wiki?p=QueryPlans "Note that you must use the keyword CROSS in order to disable the table reordering optimization. INNER JOIN, NATURAL JOIN, JOIN, and other similar combinations work just like a comma join in that the optimizer is free to reorder tables as it sees fit. (Table reordering is also disabled on an outer join, but that is because outer joins are not associative or commutative. Reordering tables in outer joins changes the result.) The use of the CROSS keyword to disable the table reordering by the optimizer is a planned feature for SQLite 3.2.6." --- Steve Green <[EMAIL PROTECTED]> wrote: > Interesting... I googled cross join and got the following definition: > > "A cross join (or Cartesian Product join) will return a result table where > each row from the first table is combined with each row from the second > table. The number of rows in the result table is the product of the number > of rows in each table. If the tables involved are large, this join can > take a very long time." > > So this isn't SQLite's definition of a cross join? > > > The schema is follows: > > table1.schema > > ATTACH DATABASE './table1.db' AS t1 ; > > CREATE TABLE t1.table1 > ( > i_id INT4, > b_id INT4, > d_id INT4, > c_id INT2, > data_in REAL, > data_out REAL > ); > CREATE INDEX t1.ix_table1_b_id ON table1( b_id ); > > DETACH DATABASE t1 ; > > table2.schema > > ATTACH DATABASE './table2.db' AS t2 ; > > CREATE TABLE t2.table2 > ( > d_id INT4 PRIMARY KEY, > r_id INT2, > m_id INT2, > i TEXT, > ct TEXT, > cc TEXT, > type TEXT, > notes TEXT > ); > > DETACH DATABASE t2 ; > > The actual query is > > selectt1.b_id, t1.c_id, t2.r_id, t2.m_id, >sum( t1.data_in ) as data_in, >sum( t1.data_out ) as data_out > from t1 > join t2 > ont2.d_id = t1.d_id and t1.b_id >= 100 and t1.b_id < 200 > group by t1.b_id, t1.c_id, t2.m_id, t2.r_id > > Is this enough information? > > Steve > > Joe Wilson wrote: > > > CROSS JOIN is an SQLite-specific thing to disable > > table join optimization. > > > > Please post the schema and indexes of yours tables > > so that this bug may be corrected when CROSS JOIN is not used. > > > > --- Steve Green <[EMAIL PROTECTED]> wrote: > > > > > >>Interestingly, using > >> > >>from t1 > >>cross join t2 > >> > >>fixes the problem; using this, causes indices from both tables to be used... > >> > >>addr opcode p1 p2 p3 > >> -- -- -- > >>- > >> > >>37OpenRead0 4 > >>38SetNumColumns 0 7 > >>39Integer 2 0 > >>40OpenRead3 6 keyinfo(1,BINARY) > >>41Integer 0 0 > >>42OpenRead1 2 > >>43SetNumColumns 1 3 > >>44Integer 0 0 > >>45OpenRead4 3 keyinfo(1,BINARY) > >> > >> > >>However, using > >> > >>from t2 > >>cross join t1 > >> > >>does *not* cause both indices to be used. > >> > >>As well, using > >> > >>from t1, t2 > >> > >>which is suppossed to be the same as > >> > >>from t1 > >>cross join t2 > >> > >>does *not* cause both indices to be used. > >> > >>This is all rather puzzling; I had also thought that using "join" was > >>equivalent > >>to "cross join", but this could be ignorance on my behalf... > >> > >>Steve > >> > >>Joe Wilson wrote: > >> > >> > >>>Hard to say what's the problem if you don't post the > >>>schema of the tables and the indexes, and provide some > >>>sample data. > >>> > >>>Perhaps there is a bug in the join optimizer. > >>>Try using an explicit CROSS JOIN. > >>> > >>>select t1.a, t1.b, t2.c, t2.d > >>>from t1 > >>>cross join t2 > >>>where t1.x = t2.x and t1.a >= 100 and t1.a < 200 > >>>group by t1.a, t1.b, t2.c, t2.d > >>> > >>> > >>>--- Steve Green <[EMAIL PROTECTED]> wrote: > >>> > >>> > >>> > The performance is exactly the same after running analyze on both tables. > > Steve > > Joe Wilson wrote: > > > > >Run an ANALYZE statement on your database and your > >queries will be fast once again. > > > >Perhaps SQLite should fall back to the old non-optimized > >join behavior when the sqlite_stat1 table is missing. > >This would cover 99% of the pre-SQLite3.2.3 legacy databases > >out there where the queries have already been hand-tuned. > > > >--- Steve Green <[EMAIL PROTECTED]> wrote: > > >
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
Ah.. okay - you did not mention tables t1 and t2 were in different database files. That is likely why this case was never tested. Explicit cross joins are the way to go - you know the data better than the database. Read about SQLite's CROSS JOIN logic here: http://www.sqlite.org/cvstrac/wiki?p=QueryPlans "Note that you must use the keyword CROSS in order to disable the table reordering optimization. INNER JOIN, NATURAL JOIN, JOIN, and other similar combinations work just like a comma join in that the optimizer is free to reorder tables as it sees fit. (Table reordering is also disabled on an outer join, but that is because outer joins are not associative or commutative. Reordering tables in outer joins changes the result.) The use of the CROSS keyword to disable the table reordering by the optimizer is a planned feature for SQLite 3.2.6." --- Steve Green <[EMAIL PROTECTED]> wrote: > Interesting... I googled cross join and got the following definition: > > "A cross join (or Cartesian Product join) will return a result table where > each row from the first table is combined with each row from the second > table. The number of rows in the result table is the product of the number > of rows in each table. If the tables involved are large, this join can > take a very long time." > > So this isn't SQLite's definition of a cross join? > > > The schema is follows: > > table1.schema > > ATTACH DATABASE './table1.db' AS t1 ; > > CREATE TABLE t1.table1 > ( > i_id INT4, > b_id INT4, > d_id INT4, > c_id INT2, > data_in REAL, > data_out REAL > ); > CREATE INDEX t1.ix_table1_b_id ON table1( b_id ); > > DETACH DATABASE t1 ; > > table2.schema > > ATTACH DATABASE './table2.db' AS t2 ; > > CREATE TABLE t2.table2 > ( > d_id INT4 PRIMARY KEY, > r_id INT2, > m_id INT2, > i TEXT, > ct TEXT, > cc TEXT, > type TEXT, > notes TEXT > ); > > DETACH DATABASE t2 ; > > The actual query is > > selectt1.b_id, t1.c_id, t2.r_id, t2.m_id, >sum( t1.data_in ) as data_in, >sum( t1.data_out ) as data_out > from t1 > join t2 > ont2.d_id = t1.d_id and t1.b_id >= 100 and t1.b_id < 200 > group by t1.b_id, t1.c_id, t2.m_id, t2.r_id > > Is this enough information? > > Steve > > Joe Wilson wrote: > > > CROSS JOIN is an SQLite-specific thing to disable > > table join optimization. > > > > Please post the schema and indexes of yours tables > > so that this bug may be corrected when CROSS JOIN is not used. > > > > --- Steve Green <[EMAIL PROTECTED]> wrote: > > > > > >>Interestingly, using > >> > >>from t1 > >>cross join t2 > >> > >>fixes the problem; using this, causes indices from both tables to be used... > >> > >>addr opcode p1 p2 p3 > >> -- -- -- > >>- > >> > >>37OpenRead0 4 > >>38SetNumColumns 0 7 > >>39Integer 2 0 > >>40OpenRead3 6 keyinfo(1,BINARY) > >>41Integer 0 0 > >>42OpenRead1 2 > >>43SetNumColumns 1 3 > >>44Integer 0 0 > >>45OpenRead4 3 keyinfo(1,BINARY) > >> > >> > >>However, using > >> > >>from t2 > >>cross join t1 > >> > >>does *not* cause both indices to be used. > >> > >>As well, using > >> > >>from t1, t2 > >> > >>which is suppossed to be the same as > >> > >>from t1 > >>cross join t2 > >> > >>does *not* cause both indices to be used. > >> > >>This is all rather puzzling; I had also thought that using "join" was > >>equivalent > >>to "cross join", but this could be ignorance on my behalf... > >> > >>Steve > >> > >>Joe Wilson wrote: > >> > >> > >>>Hard to say what's the problem if you don't post the > >>>schema of the tables and the indexes, and provide some > >>>sample data. > >>> > >>>Perhaps there is a bug in the join optimizer. > >>>Try using an explicit CROSS JOIN. > >>> > >>>select t1.a, t1.b, t2.c, t2.d > >>>from t1 > >>>cross join t2 > >>>where t1.x = t2.x and t1.a >= 100 and t1.a < 200 > >>>group by t1.a, t1.b, t2.c, t2.d > >>> > >>> > >>>--- Steve Green <[EMAIL PROTECTED]> wrote: > >>> > >>> > >>> > The performance is exactly the same after running analyze on both tables. > > Steve > > Joe Wilson wrote: > > > > >Run an ANALYZE statement on your database and your > >queries will be fast once again. > > > >Perhaps SQLite should fall back to the old non-optimized > >join behavior when the sqlite_stat1 table is missing. > >This would cover 99% of the pre-SQLite3.2.3 legacy databases > >out there where the queries have already been hand-tuned. > > > >--- Steve Green <[EMAIL PROTECTED]> wrote: > > >
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
Interesting... I googled cross join and got the following definition: "A cross join (or Cartesian Product join) will return a result table where each row from the first table is combined with each row from the second table. The number of rows in the result table is the product of the number of rows in each table. If the tables involved are large, this join can take a very long time." So this isn't SQLite's definition of a cross join? The schema is follows: table1.schema ATTACH DATABASE './table1.db' AS t1 ; CREATE TABLE t1.table1 ( i_id INT4, b_id INT4, d_id INT4, c_id INT2, data_in REAL, data_out REAL ); CREATE INDEX t1.ix_table1_b_id ON table1( b_id ); DETACH DATABASE t1 ; table2.schema ATTACH DATABASE './table2.db' AS t2 ; CREATE TABLE t2.table2 ( d_id INT4 PRIMARY KEY, r_id INT2, m_id INT2, i TEXT, ct TEXT, cc TEXT, type TEXT, notes TEXT ); DETACH DATABASE t2 ; The actual query is selectt1.b_id, t1.c_id, t2.r_id, t2.m_id, sum( t1.data_in ) as data_in, sum( t1.data_out ) as data_out from t1 join t2 ont2.d_id = t1.d_id and t1.b_id >= 100 and t1.b_id < 200 group by t1.b_id, t1.c_id, t2.m_id, t2.r_id Is this enough information? Steve Joe Wilson wrote: CROSS JOIN is an SQLite-specific thing to disable table join optimization. Please post the schema and indexes of yours tables so that this bug may be corrected when CROSS JOIN is not used. --- Steve Green <[EMAIL PROTECTED]> wrote: Interestingly, using from t1 cross join t2 fixes the problem; using this, causes indices from both tables to be used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead0 4 38SetNumColumns 0 7 39Integer 2 0 40OpenRead3 6 keyinfo(1,BINARY) 41Integer 0 0 42OpenRead1 2 43SetNumColumns 1 3 44Integer 0 0 45OpenRead4 3 keyinfo(1,BINARY) However, using from t2 cross join t1 does *not* cause both indices to be used. As well, using from t1, t2 which is suppossed to be the same as from t1 cross join t2 does *not* cause both indices to be used. This is all rather puzzling; I had also thought that using "join" was equivalent to "cross join", but this could be ignorance on my behalf... Steve Joe Wilson wrote: Hard to say what's the problem if you don't post the schema of the tables and the indexes, and provide some sample data. Perhaps there is a bug in the join optimizer. Try using an explicit CROSS JOIN. select t1.a, t1.b, t2.c, t2.d from t1 cross join t2 where t1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d --- Steve Green <[EMAIL PROTECTED]> wrote: The performance is exactly the same after running analyze on both tables. Steve Joe Wilson wrote: Run an ANALYZE statement on your database and your queries will be fast once again. Perhaps SQLite should fall back to the old non-optimized join behavior when the sqlite_stat1 table is missing. This would cover 99% of the pre-SQLite3.2.3 legacy databases out there where the queries have already been hand-tuned. --- Steve Green <[EMAIL PROTECTED]> wrote: Hi, I currently have a v3.2.0 database that contains two tables that I regularly query with a join, e.g., selectt1.a, t1.b, t2.c, t2.d from t1 join t2 ont1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d Table t1 has an index on a Table t2 has an index on x I've created a new v3.3.4 database with the same schema and data. Table t1 has about 150 million rows and t2 has about 3000 rows. When I run the query using v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes about 20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, and v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 performed the same as v3.3.4 (so, it appears that the change was made going into v3.3.x). When I ran an explain on the query in both databases, the main difference I noticed was in the opening of the database tables; in v3.2.0, it appears that the indices of both tables are being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead1 4 38SetNumColumns 1 7 39Integer 0 0 40OpenRead3 6 keyinfo(1,BINARY) 41Integer 2 0 42OpenRead0 2 43SetNumColumns 0 8 44Integer 2 0 45OpenRead4 3
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
CROSS JOIN is an SQLite-specific thing to disable table join optimization. Please post the schema and indexes of yours tables so that this bug may be corrected when CROSS JOIN is not used. --- Steve Green <[EMAIL PROTECTED]> wrote: > Interestingly, using > > from t1 > cross join t2 > > fixes the problem; using this, causes indices from both tables to be used... > > addr opcode p1 p2 p3 > -- -- -- > - > > 37OpenRead0 4 > 38SetNumColumns 0 7 > 39Integer 2 0 > 40OpenRead3 6 keyinfo(1,BINARY) > 41Integer 0 0 > 42OpenRead1 2 > 43SetNumColumns 1 3 > 44Integer 0 0 > 45OpenRead4 3 keyinfo(1,BINARY) > > > However, using > > from t2 > cross join t1 > > does *not* cause both indices to be used. > > As well, using > > from t1, t2 > > which is suppossed to be the same as > > from t1 > cross join t2 > > does *not* cause both indices to be used. > > This is all rather puzzling; I had also thought that using "join" was > equivalent > to "cross join", but this could be ignorance on my behalf... > > Steve > > Joe Wilson wrote: > > > Hard to say what's the problem if you don't post the > > schema of the tables and the indexes, and provide some > > sample data. > > > > Perhaps there is a bug in the join optimizer. > > Try using an explicit CROSS JOIN. > > > > select t1.a, t1.b, t2.c, t2.d > > from t1 > > cross join t2 > > where t1.x = t2.x and t1.a >= 100 and t1.a < 200 > > group by t1.a, t1.b, t2.c, t2.d > > > > > > --- Steve Green <[EMAIL PROTECTED]> wrote: > > > > > >>The performance is exactly the same after running analyze on both tables. > >> > >>Steve > >> > >>Joe Wilson wrote: > >> > >> > >>>Run an ANALYZE statement on your database and your > >>>queries will be fast once again. > >>> > >>>Perhaps SQLite should fall back to the old non-optimized > >>>join behavior when the sqlite_stat1 table is missing. > >>>This would cover 99% of the pre-SQLite3.2.3 legacy databases > >>>out there where the queries have already been hand-tuned. > >>> > >>>--- Steve Green <[EMAIL PROTECTED]> wrote: > >>> > >>> > >>> > Hi, > > I currently have a v3.2.0 database that contains two tables that I > regularly > query with a join, e.g., > > selectt1.a, t1.b, t2.c, t2.d > >>> > from t1 > >>> > join t2 > ont1.x = t2.x and t1.a >= 100 and t1.a < 200 > group by t1.a, t1.b, t2.c, t2.d > > Table t1 has an index on a > Table t2 has an index on x > > I've created a new v3.3.4 database with the same schema and data. Table > t1 has > about 150 million rows and t2 has about 3000 rows. When I run the query > using > v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes > about > 20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, > and > v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 > performed > the same as v3.3.4 (so, it appears that the change was made going into > v3.3.x). > > When I ran an explain on the query in both databases, the main difference > I > noticed was in the opening of the database tables; in v3.2.0, it appears > that > the indices of both tables are being used... > > addr opcode p1 p2 p3 > -- -- -- > - > > 37OpenRead1 4 > 38SetNumColumns 1 7 > 39Integer 0 0 > 40OpenRead3 6 keyinfo(1,BINARY) > 41Integer 2 0 > 42OpenRead0 2 > 43SetNumColumns 0 8 > 44Integer 2 0 > 45OpenRead4 3 keyinfo(1,BINARY) > 46Integer 3795633 0 > > > In v3.3.4, it appears that the index of only one of the tables is being > used... > > addr opcode p1 p2 p3 > -- -- -- > - > > 37OpenRead0 2 > 38SetNumColumns 0 3 > 39Integer 0 0 > 40OpenRead1 4 > 41SetNumColumns 1 7 > 42Integer 0 0 > 43OpenRead3 6 keyinfo(1,BINARY) > 44Rewind 0 78 > 45Integer 3795633 0 > > > It's quite possible that I'm misinterpreting the explain
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
Interestingly, using from t1 cross join t2 fixes the problem; using this, causes indices from both tables to be used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead0 4 38SetNumColumns 0 7 39Integer 2 0 40OpenRead3 6 keyinfo(1,BINARY) 41Integer 0 0 42OpenRead1 2 43SetNumColumns 1 3 44Integer 0 0 45OpenRead4 3 keyinfo(1,BINARY) However, using from t2 cross join t1 does *not* cause both indices to be used. As well, using from t1, t2 which is suppossed to be the same as from t1 cross join t2 does *not* cause both indices to be used. This is all rather puzzling; I had also thought that using "join" was equivalent to "cross join", but this could be ignorance on my behalf... Steve Joe Wilson wrote: Hard to say what's the problem if you don't post the schema of the tables and the indexes, and provide some sample data. Perhaps there is a bug in the join optimizer. Try using an explicit CROSS JOIN. select t1.a, t1.b, t2.c, t2.d from t1 cross join t2 where t1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d --- Steve Green <[EMAIL PROTECTED]> wrote: The performance is exactly the same after running analyze on both tables. Steve Joe Wilson wrote: Run an ANALYZE statement on your database and your queries will be fast once again. Perhaps SQLite should fall back to the old non-optimized join behavior when the sqlite_stat1 table is missing. This would cover 99% of the pre-SQLite3.2.3 legacy databases out there where the queries have already been hand-tuned. --- Steve Green <[EMAIL PROTECTED]> wrote: Hi, I currently have a v3.2.0 database that contains two tables that I regularly query with a join, e.g., selectt1.a, t1.b, t2.c, t2.d from t1 join t2 ont1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d Table t1 has an index on a Table t2 has an index on x I've created a new v3.3.4 database with the same schema and data. Table t1 has about 150 million rows and t2 has about 3000 rows. When I run the query using v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes about 20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, and v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 performed the same as v3.3.4 (so, it appears that the change was made going into v3.3.x). When I ran an explain on the query in both databases, the main difference I noticed was in the opening of the database tables; in v3.2.0, it appears that the indices of both tables are being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead1 4 38SetNumColumns 1 7 39Integer 0 0 40OpenRead3 6 keyinfo(1,BINARY) 41Integer 2 0 42OpenRead0 2 43SetNumColumns 0 8 44Integer 2 0 45OpenRead4 3 keyinfo(1,BINARY) 46Integer 3795633 0 In v3.3.4, it appears that the index of only one of the tables is being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead0 2 38SetNumColumns 0 3 39Integer 0 0 40OpenRead1 4 41SetNumColumns 1 7 42Integer 0 0 43OpenRead3 6 keyinfo(1,BINARY) 44Rewind 0 78 45Integer 3795633 0 It's quite possible that I'm misinterpreting the explain data as I'm fairly new with it... However, it seems to me that this might explain the huge difference in performance. Has anyone else seen this type of performance difference between v3.2.x and v3.3.x? Any ideas on how we can improve the performance using v3.3.x (we'd really like to use the most current version of the database, but the performance difference would be a killer)? Thanks, Steve __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com -- Steve Green SAVVIS Transforming Information Technology SM This message contains information which may be confidential and/or privileged. Unless you are the intended recipient (or authorized to receive for the intended recipient), you may not read, use, copy or disclose to anyone the message or any information contained in the message. If you have
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
maybe "t2 cross join t1" makes more sense given the sizes of the tables. --- Joe Wilson <[EMAIL PROTECTED]> wrote: > Hard to say what's the problem if you don't post the > schema of the tables and the indexes, and provide some > sample data. > > Perhaps there is a bug in the join optimizer. > Try using an explicit CROSS JOIN. > > select t1.a, t1.b, t2.c, t2.d > from t1 > cross join t2 > where t1.x = t2.x and t1.a >= 100 and t1.a < 200 > group by t1.a, t1.b, t2.c, t2.d > > > --- Steve Green <[EMAIL PROTECTED]> wrote: > > > The performance is exactly the same after running analyze on both tables. > > > > Steve > > > > Joe Wilson wrote: > > > > > Run an ANALYZE statement on your database and your > > > queries will be fast once again. > > > > > > Perhaps SQLite should fall back to the old non-optimized > > > join behavior when the sqlite_stat1 table is missing. > > > This would cover 99% of the pre-SQLite3.2.3 legacy databases > > > out there where the queries have already been hand-tuned. > > > > > > --- Steve Green <[EMAIL PROTECTED]> wrote: > > > > > > > > >>Hi, > > >> > > >>I currently have a v3.2.0 database that contains two tables that I > > >>regularly > > >>query with a join, e.g., > > >> > > >>selectt1.a, t1.b, t2.c, t2.d > > >>from t1 > > >>join t2 > > >>ont1.x = t2.x and t1.a >= 100 and t1.a < 200 > > >>group by t1.a, t1.b, t2.c, t2.d > > >> > > >>Table t1 has an index on a > > >>Table t2 has an index on x > > >> > > >>I've created a new v3.3.4 database with the same schema and data. Table > > >>t1 has > > >>about 150 million rows and t2 has about 3000 rows. When I run the query > > >>using > > >>v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes > > >>about > > >>20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, > > >>and > > >>v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 > > >>performed > > >>the same as v3.3.4 (so, it appears that the change was made going into > > >>v3.3.x). > > >> > > >>When I ran an explain on the query in both databases, the main difference > > >>I > > >>noticed was in the opening of the database tables; in v3.2.0, it appears > > >>that > > >>the indices of both tables are being used... > > >> > > >>addr opcode p1 p2 p3 > > >> -- -- -- > > >>- > > >> > > >>37OpenRead1 4 > > >>38SetNumColumns 1 7 > > >>39Integer 0 0 > > >>40OpenRead3 6 keyinfo(1,BINARY) > > >>41Integer 2 0 > > >>42OpenRead0 2 > > >>43SetNumColumns 0 8 > > >>44Integer 2 0 > > >>45OpenRead4 3 keyinfo(1,BINARY) > > >>46Integer 3795633 0 > > >> > > >> > > >>In v3.3.4, it appears that the index of only one of the tables is being > > >>used... > > >> > > >>addr opcode p1 p2 p3 > > >> -- -- -- > > >>- > > >> > > >>37OpenRead0 2 > > >>38SetNumColumns 0 3 > > >>39Integer 0 0 > > >>40OpenRead1 4 > > >>41SetNumColumns 1 7 > > >>42Integer 0 0 > > >>43OpenRead3 6 keyinfo(1,BINARY) > > >>44Rewind 0 78 > > >>45Integer 3795633 0 > > >> > > >> > > >>It's quite possible that I'm misinterpreting the explain data as I'm > > >>fairly new > > >>with it... However, it seems to me that this might explain the huge > > >>difference > > >>in performance. > > >> > > >>Has anyone else seen this type of performance difference between v3.2.x > > >>and v3.3.x? > > >>Any ideas on how we can improve the performance using v3.3.x (we'd really > > >>like to > > >>use the most current version of the database, but the performance > > >>difference would > > >>be a killer)? > > >> > > >>Thanks, > > >> > > >>Steve > > > > > > > > > > > > > > > __ > > > Do You Yahoo!? > > > Tired of spam? Yahoo! Mail has the best spam protection around > > > http://mail.yahoo.com > > > > -- > > Steve Green > > SAVVIS > > Transforming Information Technology SM > > > > This message contains information which may be confidential and/or > > privileged. Unless you are the intended recipient (or authorized > > to receive for the intended recipient), you may not read, use, > > copy or disclose to anyone the message or any information contained > > in the message. If you have received the message in error, please > > advise the sender by reply e-mail at [EMAIL PROTECTED] and > > delete the message and any attachment(s) thereto without retaining > > any copies. > > > > >
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
Hard to say what's the problem if you don't post the schema of the tables and the indexes, and provide some sample data. Perhaps there is a bug in the join optimizer. Try using an explicit CROSS JOIN. select t1.a, t1.b, t2.c, t2.d from t1 cross join t2 where t1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d --- Steve Green <[EMAIL PROTECTED]> wrote: > The performance is exactly the same after running analyze on both tables. > > Steve > > Joe Wilson wrote: > > > Run an ANALYZE statement on your database and your > > queries will be fast once again. > > > > Perhaps SQLite should fall back to the old non-optimized > > join behavior when the sqlite_stat1 table is missing. > > This would cover 99% of the pre-SQLite3.2.3 legacy databases > > out there where the queries have already been hand-tuned. > > > > --- Steve Green <[EMAIL PROTECTED]> wrote: > > > > > >>Hi, > >> > >>I currently have a v3.2.0 database that contains two tables that I regularly > >>query with a join, e.g., > >> > >>selectt1.a, t1.b, t2.c, t2.d > >>from t1 > >>join t2 > >>ont1.x = t2.x and t1.a >= 100 and t1.a < 200 > >>group by t1.a, t1.b, t2.c, t2.d > >> > >>Table t1 has an index on a > >>Table t2 has an index on x > >> > >>I've created a new v3.3.4 database with the same schema and data. Table t1 > >>has > >>about 150 million rows and t2 has about 3000 rows. When I run the query > >>using > >>v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes > >>about > >>20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, > >>and > >>v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 > >>performed > >>the same as v3.3.4 (so, it appears that the change was made going into > >>v3.3.x). > >> > >>When I ran an explain on the query in both databases, the main difference I > >>noticed was in the opening of the database tables; in v3.2.0, it appears > >>that > >>the indices of both tables are being used... > >> > >>addr opcode p1 p2 p3 > >> -- -- -- > >>- > >> > >>37OpenRead1 4 > >>38SetNumColumns 1 7 > >>39Integer 0 0 > >>40OpenRead3 6 keyinfo(1,BINARY) > >>41Integer 2 0 > >>42OpenRead0 2 > >>43SetNumColumns 0 8 > >>44Integer 2 0 > >>45OpenRead4 3 keyinfo(1,BINARY) > >>46Integer 3795633 0 > >> > >> > >>In v3.3.4, it appears that the index of only one of the tables is being > >>used... > >> > >>addr opcode p1 p2 p3 > >> -- -- -- > >>- > >> > >>37OpenRead0 2 > >>38SetNumColumns 0 3 > >>39Integer 0 0 > >>40OpenRead1 4 > >>41SetNumColumns 1 7 > >>42Integer 0 0 > >>43OpenRead3 6 keyinfo(1,BINARY) > >>44Rewind 0 78 > >>45Integer 3795633 0 > >> > >> > >>It's quite possible that I'm misinterpreting the explain data as I'm fairly > >>new > >>with it... However, it seems to me that this might explain the huge > >>difference > >>in performance. > >> > >>Has anyone else seen this type of performance difference between v3.2.x and > >>v3.3.x? > >>Any ideas on how we can improve the performance using v3.3.x (we'd really > >>like to > >>use the most current version of the database, but the performance > >>difference would > >>be a killer)? > >> > >>Thanks, > >> > >>Steve > > > > > > > > > > __ > > Do You Yahoo!? > > Tired of spam? Yahoo! Mail has the best spam protection around > > http://mail.yahoo.com > > -- > Steve Green > SAVVIS > Transforming Information Technology SM > > This message contains information which may be confidential and/or > privileged. Unless you are the intended recipient (or authorized > to receive for the intended recipient), you may not read, use, > copy or disclose to anyone the message or any information contained > in the message. If you have received the message in error, please > advise the sender by reply e-mail at [EMAIL PROTECTED] and > delete the message and any attachment(s) thereto without retaining > any copies. > __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
The performance is exactly the same after running analyze on both tables. Steve Joe Wilson wrote: Run an ANALYZE statement on your database and your queries will be fast once again. Perhaps SQLite should fall back to the old non-optimized join behavior when the sqlite_stat1 table is missing. This would cover 99% of the pre-SQLite3.2.3 legacy databases out there where the queries have already been hand-tuned. --- Steve Green <[EMAIL PROTECTED]> wrote: Hi, I currently have a v3.2.0 database that contains two tables that I regularly query with a join, e.g., selectt1.a, t1.b, t2.c, t2.d from t1 join t2 ont1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d Table t1 has an index on a Table t2 has an index on x I've created a new v3.3.4 database with the same schema and data. Table t1 has about 150 million rows and t2 has about 3000 rows. When I run the query using v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes about 20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, and v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 performed the same as v3.3.4 (so, it appears that the change was made going into v3.3.x). When I ran an explain on the query in both databases, the main difference I noticed was in the opening of the database tables; in v3.2.0, it appears that the indices of both tables are being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead1 4 38SetNumColumns 1 7 39Integer 0 0 40OpenRead3 6 keyinfo(1,BINARY) 41Integer 2 0 42OpenRead0 2 43SetNumColumns 0 8 44Integer 2 0 45OpenRead4 3 keyinfo(1,BINARY) 46Integer 3795633 0 In v3.3.4, it appears that the index of only one of the tables is being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead0 2 38SetNumColumns 0 3 39Integer 0 0 40OpenRead1 4 41SetNumColumns 1 7 42Integer 0 0 43OpenRead3 6 keyinfo(1,BINARY) 44Rewind 0 78 45Integer 3795633 0 It's quite possible that I'm misinterpreting the explain data as I'm fairly new with it... However, it seems to me that this might explain the huge difference in performance. Has anyone else seen this type of performance difference between v3.2.x and v3.3.x? Any ideas on how we can improve the performance using v3.3.x (we'd really like to use the most current version of the database, but the performance difference would be a killer)? Thanks, Steve __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com -- Steve Green SAVVIS Transforming Information Technology SM This message contains information which may be confidential and/or privileged. Unless you are the intended recipient (or authorized to receive for the intended recipient), you may not read, use, copy or disclose to anyone the message or any information contained in the message. If you have received the message in error, please advise the sender by reply e-mail at [EMAIL PROTECTED] and delete the message and any attachment(s) thereto without retaining any copies.
Re: [sqlite] Query parsing differences between v3.2.x and v3.3.x
Run an ANALYZE statement on your database and your queries will be fast once again. Perhaps SQLite should fall back to the old non-optimized join behavior when the sqlite_stat1 table is missing. This would cover 99% of the pre-SQLite3.2.3 legacy databases out there where the queries have already been hand-tuned. --- Steve Green <[EMAIL PROTECTED]> wrote: > Hi, > > I currently have a v3.2.0 database that contains two tables that I regularly > query with a join, e.g., > > selectt1.a, t1.b, t2.c, t2.d > from t1 > join t2 > ont1.x = t2.x and t1.a >= 100 and t1.a < 200 > group by t1.a, t1.b, t2.c, t2.d > > Table t1 has an index on a > Table t2 has an index on x > > I've created a new v3.3.4 database with the same schema and data. Table t1 > has > about 150 million rows and t2 has about 3000 rows. When I run the query using > v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes > about > 20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, and > v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 > performed > the same as v3.3.4 (so, it appears that the change was made going into > v3.3.x). > > When I ran an explain on the query in both databases, the main difference I > noticed was in the opening of the database tables; in v3.2.0, it appears that > the indices of both tables are being used... > > addr opcode p1 p2 p3 > -- -- -- > - > > 37OpenRead1 4 > 38SetNumColumns 1 7 > 39Integer 0 0 > 40OpenRead3 6 keyinfo(1,BINARY) > 41Integer 2 0 > 42OpenRead0 2 > 43SetNumColumns 0 8 > 44Integer 2 0 > 45OpenRead4 3 keyinfo(1,BINARY) > 46Integer 3795633 0 > > > In v3.3.4, it appears that the index of only one of the tables is being > used... > > addr opcode p1 p2 p3 > -- -- -- > - > > 37OpenRead0 2 > 38SetNumColumns 0 3 > 39Integer 0 0 > 40OpenRead1 4 > 41SetNumColumns 1 7 > 42Integer 0 0 > 43OpenRead3 6 keyinfo(1,BINARY) > 44Rewind 0 78 > 45Integer 3795633 0 > > > It's quite possible that I'm misinterpreting the explain data as I'm fairly > new > with it... However, it seems to me that this might explain the huge difference > in performance. > > Has anyone else seen this type of performance difference between v3.2.x and > v3.3.x? > Any ideas on how we can improve the performance using v3.3.x (we'd really > like to > use the most current version of the database, but the performance difference > would > be a killer)? > > Thanks, > > Steve __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
[sqlite] Query parsing differences between v3.2.x and v3.3.x
Hi, I currently have a v3.2.0 database that contains two tables that I regularly query with a join, e.g., selectt1.a, t1.b, t2.c, t2.d from t1 join t2 ont1.x = t2.x and t1.a >= 100 and t1.a < 200 group by t1.a, t1.b, t2.c, t2.d Table t1 has an index on a Table t2 has an index on x I've created a new v3.3.4 database with the same schema and data. Table t1 has about 150 million rows and t2 has about 3000 rows. When I run the query using v3.2.0, it returns in about 0.1 seconds. When run using v3.3.4, it takes about 20 seconds. For completeness, I ran the same query using v3.2.8, v3.3.2, and v3.3.3. v3.2.8 performed the same (roughly) as v3.2.0; v3.3.2 and v3.3.3 performed the same as v3.3.4 (so, it appears that the change was made going into v3.3.x). When I ran an explain on the query in both databases, the main difference I noticed was in the opening of the database tables; in v3.2.0, it appears that the indices of both tables are being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead1 4 38SetNumColumns 1 7 39Integer 0 0 40OpenRead3 6 keyinfo(1,BINARY) 41Integer 2 0 42OpenRead0 2 43SetNumColumns 0 8 44Integer 2 0 45OpenRead4 3 keyinfo(1,BINARY) 46Integer 3795633 0 In v3.3.4, it appears that the index of only one of the tables is being used... addr opcode p1 p2 p3 -- -- -- - 37OpenRead0 2 38SetNumColumns 0 3 39Integer 0 0 40OpenRead1 4 41SetNumColumns 1 7 42Integer 0 0 43OpenRead3 6 keyinfo(1,BINARY) 44Rewind 0 78 45Integer 3795633 0 It's quite possible that I'm misinterpreting the explain data as I'm fairly new with it... However, it seems to me that this might explain the huge difference in performance. Has anyone else seen this type of performance difference between v3.2.x and v3.3.x? Any ideas on how we can improve the performance using v3.3.x (we'd really like to use the most current version of the database, but the performance difference would be a killer)? Thanks, Steve -- Steve Green SAVVIS Transforming Information Technology SM This message contains information which may be confidential and/or privileged. Unless you are the intended recipient (or authorized to receive for the intended recipient), you may not read, use, copy or disclose to anyone the message or any information contained in the message. If you have received the message in error, please advise the sender by reply e-mail at [EMAIL PROTECTED] and delete the message and any attachment(s) thereto without retaining any copies.