Re: [sqlite] Slow query, with correlated sub-sub-query
Keith, this definitely explains the observed time as it is relative to count(a)*count (ab)**2, thus non-linear. And a correlated sub-query is generally recalculated for each row. But I do not agree with everything. In my example it is correlated to the outermost query, and not to the sub-query in which it occurs. Theoretically the optimizer can take this into account and only recalculate for each row in the outermost query. And if I'm not mistaken Postgress does so. Below is a version modified for pgsql that runs fast no matter the number of rows. Thanks for the suggested change, where the minimum size is computed is a sub-query (not sub-sub) and joined to the other sub-query. This is so elegant. I still need to compare the timing to David's version and use the fastest. /* sudo -u postgres psql < issue2p.sql */ drop table if exists a ; drop table if exists ab ; create table a (a int, primary key (a)) ; create table ab (a int, b int, size int, primary key (a,b)) ; insert into a with recursive i as (select 1 as i union all select i+1 from i where i<3) select i from i ; insert into ab with recursive i as (select 1 as i union all select i+1 from i where i<1) select a, i as b, (a+i)%10 as size from a, i ; select a, ( select count(*) fromab where a=a.a and size=(select min(size) from ab where a=a.a) ) froma ; Keith Medcalf wrote: Well of course. You are aware that a correlated subquery means "for each candidate result execute the query"? So as you have formulated the query it means: for each row in a compute the result count which for each ab candidate row calculate whether it is the minimum which means that the you have requested that the same result be computed many times over. You have requested exampination of count(a) * count(ab) * count(ab) rows. Instead you should be computing the min(size) for each group of a once, and using that value in the correlated subquery select a.a, ( select count(*) from ab where a == a.a and size == b.size ) as acount from a, ( select a, min(size) as size from ab group by a ) as b where a.a == b.a; This will result in scanning count(ab) + count(a) * count(ab) rows. Which is significantly less. On my computer it reduces the execution time of the original query you posited from 400 ticks to less than 1 tick (ie, from 500 ms to <8 ms) I do not know if any optimizer can flatten you original query to any significant degree. Some optimizers may arrive at my fixed up query because they are capable of doing a hash table lookup on the result of the innermost correlate. SQLite does not do that, and without that capability I do not think there is a relational database query optimizer on the planet that can help you. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query, with correlated sub-sub-query
Well of course. You are aware that a correlated subquery means "for each candidate result execute the query"? So as you have formulated the query it means: for each row in a compute the result count which for each ab candidate row calculate whether it is the minimum which means that the you have requested that the same result be computed many times over. You have requested exampination of count(a) * count(ab) * count(ab) rows. Instead you should be computing the min(size) for each group of a once, and using that value in the correlated subquery select a.a, ( select count(*) from ab where a == a.a and size == b.size ) as acount from a, ( select a, min(size) as size from ab group by a ) as b where a.a == b.a; This will result in scanning count(ab) + count(a) * count(ab) rows. Which is significantly less. On my computer it reduces the execution time of the original query you posited from 400 ticks to less than 1 tick (ie, from 500 ms to <8 ms) I do not know if any optimizer can flatten you original query to any significant degree. Some optimizers may arrive at my fixed up query because they are capable of doing a hash table lookup on the result of the innermost correlate. SQLite does not do that, and without that capability I do not think there is a relational database query optimizer on the planet that can help you. -- ˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı > -Original Message- > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] > On Behalf Of E.Pasma > Sent: Friday, 7 July, 2017 07:47 > To: SQLite mailing list > Subject: [sqlite] Slow query, with correlated sub-sub-query > > Hello, below is a theoretical query that becomes slow when the number > of rows increases. What it does is: > - scan input cases in table a > - for each input case: > -- determine the smallest value of attribute size of elements in table > ab > -- count the number of elements having this smallest size > With 3 rows in table a and 3*1000 in ab this takes already several > seconds. > I'm not so much interested in an alternative solution, though > interesting, and merely want to show an inefficient construction. That > is a sub-sub-query correlated directly to the main query. > Thanks, E. Pasma > > .version > SQLite 3.19.3 2017-06-08 14:26:17 ... > > create table a (a, primary key (a)) > ; > create table ab (a, b, size, primary key (a,b)) > ; > insert into a > with i as (select 1 as i union all select i+1 from i where i<3) > select i from i > ; > insert into ab > with i as (select 1 as i union all select i+1 from i where i<1000) > select a, i as b, random()%10 as size from a, i > ; > .eqp on > .timer on > select a, > ( > select count(*) > fromab > where a=a.a > and size=(select min(size) from ab where a=a.a) > ) > froma > ; > --EQP-- 0,0,0,SCAN TABLE a > --EQP-- 0,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 1 > --EQP-- 1,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) > --EQP-- 1,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 2 > --EQP-- 2,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) > 1|56 > 2|53 > 3|49 > Run Time: real 2.678 user 2.597794 sys 0.008801 > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query, with correlated sub-sub-query
Also of note is that when you make an index on ab (size), your original query, unchanged, becomes about 3 times faster than my modification. I'm not sure on what it looks at to decide if a temporary autoindex is worth it, but in this case it would have been. When you move back to your more complex version I don't know if an expression index would be practical though, you'll have to play around. -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of E.Pasma Sent: Friday, July 07, 2017 1:12 PM To: SQLite mailing list Subject: [sqlite] Slow query, with correlated sub-sub-query Thanks David for the alernative solution, that is 500 times faster or any times faster depending on the number of rows. I hope this does not derive attention from my point that the original construction is not very well delt with. It was only a theoretical query derived from a more complex one. The atrribute "size" is in this complex query yet an other sub-query, counting rows in yet another table. I need to see if the alternative solution is still faster there. E. Pasma ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Slow query, with correlated sub-sub-query
Thanks David for the alernative solution, that is 500 times faster or any times faster depending on the number of rows. I hope this does not derive attention from my point that the original construction is not very well delt with. It was only a theoretical query derived from a more complex one. The atrribute "size" is in this complex query yet an other sub-query, counting rows in yet another table. I need to see if the alternative solution is still faster there. E. Pasma ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query, with correlated sub-sub-query
I acknowledge you said you weren't so much interested in an alternative solution, but... How about something like select a, min(size) as minSize, recCount from (select a, size, count(*) as recCount from a inner join ab using (a) group by a, size) group by a; The inner one will group by a and size, then the outer group by with the min() will pick the minimum and use that line to populate the bare column of recCount. With 10,000 here's your original: sqlite> select a, (select count(*) from ab where a = a.a and size = (select min(size) from ab where a = a.a)) from a; --EQP-- 0,0,0,SCAN TABLE a --EQP-- 0,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 1 --EQP-- 1,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) --EQP-- 1,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 2 --EQP-- 2,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) a|(select count(*) from ab where a = a.a and size = (select min(size) from ab where a = a.a)) 1|522 2|486 3|500 Memory Used: 975336 (max 1508448) bytes Number of Outstanding Allocations: 270 (max 326) Number of Pcache Overflow Bytes: 850880 (max 986272) bytes Number of Scratch Overflow Bytes:0 (max 12472) bytes Largest Allocation: 524288 bytes Largest Pcache Allocation: 4256 bytes Largest Scratch Allocation: 12472 bytes Lookaside Slots Used:35 (max 100) Successful lookaside attempts: 71296 Lookaside failures due to size: 19 Lookaside failures due to OOM: 119 Pager Heap Usage:844920 bytes Page cache hits: 2030205 Page cache misses: 0 Page cache writes: 0 Schema Heap Usage: 1472 bytes Statement Heap/Lookaside Usage: 32400 bytes Fullscan Steps: 2 Sort Operations: 0 Autoindex Inserts: 0 Virtual Machine Steps: 1800501558 Run Time: real 39.031 user 38.906649 sys 0.015600 And the alternative: sqlite> select a, min(size) as minSize, recCount from (select a, size, count(*) as recCount from a inner join ab using (a) group by a, size) group by a; --EQP-- 1,0,0,SCAN TABLE a USING COVERING INDEX sqlite_autoindex_a_1 --EQP-- 1,1,1,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) --EQP-- 1,0,0,USE TEMP B-TREE FOR GROUP BY --EQP-- 0,0,0,SCAN SUBQUERY 1 --EQP-- 0,0,0,USE TEMP B-TREE FOR GROUP BY a|minSize|recCount 1|-9|522 2|-9|486 3|-9|500 Memory Used: 984136 (max 1513008) bytes Number of Outstanding Allocations: 280 (max 332) Number of Pcache Overflow Bytes: 855136 (max 986272) bytes Number of Scratch Overflow Bytes:0 (max 12472) bytes Largest Allocation: 524288 bytes Largest Pcache Allocation: 4256 bytes Largest Scratch Allocation: 12472 bytes Lookaside Slots Used:55 (max 100) Successful lookaside attempts: 102118 Lookaside failures due to size: 26 Lookaside failures due to OOM: 258 Pager Heap Usage:849164 bytes Page cache hits: 199 Page cache misses: 0 Page cache writes: 0 Schema Heap Usage: 1736 bytes Statement Heap/Lookaside Usage: 58280 bytes Fullscan Steps: 2 Sort Operations: 2 Autoindex Inserts: 0 Virtual Machine Steps: 511684 Run Time: real 0.063 user 0.015600 sys 0.00 -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of E.Pasma Sent: Friday, July 07, 2017 9:47 AM To: SQLite mailing list Subject: [sqlite] Slow query, with correlated sub-sub-query Hello, below is a theoretical query that becomes slow when the number of rows increases. What it does is: - scan input cases in table a - for each input case: -- determine the smallest value of attribute size of elements in table ab -- count the number of elements having this smallest size With 3 rows in table a and 3*1000 in ab this takes already several seconds. I'm not so much interested in an alternative solution, though interesting, and merely want to show an inefficient construction. That is a sub-sub-query correlated directly to the main query. Thanks, E. Pasma .version SQLite 3.19.3 2017-06-08 14:26:17 ... create table a (a, primary key (a)) ; create table ab (a, b, size, primary key (a,b)) ; insert into a with i as (select 1 as i union all select i+1 from i where i<3) select i from i ; insert into ab with i as (select 1 as i union all select i+1 from i where i<1000) select a, i as b, random()%10 as size from a, i ; .eqp on .timer on select a, ( select count(*) fromab where a=a.a and size=(select min(size) from ab where a=a.a) ) froma ; --EQP-- 0,0,0,SCAN TABLE a --EQP-- 0,0,0,EXECUTE C
[sqlite] Slow query, with correlated sub-sub-query
Hello, below is a theoretical query that becomes slow when the number of rows increases. What it does is: - scan input cases in table a - for each input case: -- determine the smallest value of attribute size of elements in table ab -- count the number of elements having this smallest size With 3 rows in table a and 3*1000 in ab this takes already several seconds. I'm not so much interested in an alternative solution, though interesting, and merely want to show an inefficient construction. That is a sub-sub-query correlated directly to the main query. Thanks, E. Pasma .version SQLite 3.19.3 2017-06-08 14:26:17 ... create table a (a, primary key (a)) ; create table ab (a, b, size, primary key (a,b)) ; insert into a with i as (select 1 as i union all select i+1 from i where i<3) select i from i ; insert into ab with i as (select 1 as i union all select i+1 from i where i<1000) select a, i as b, random()%10 as size from a, i ; .eqp on .timer on select a, ( select count(*) fromab where a=a.a and size=(select min(size) from ab where a=a.a) ) froma ; --EQP-- 0,0,0,SCAN TABLE a --EQP-- 0,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 1 --EQP-- 1,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) --EQP-- 1,0,0,EXECUTE CORRELATED SCALAR SUBQUERY 2 --EQP-- 2,0,0,SEARCH TABLE ab USING INDEX sqlite_autoindex_ab_1 (a=?) 1|56 2|53 3|49 Run Time: real 2.678 user 2.597794 sys 0.008801 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On Aug 8, 2013, at 3:40 PM, Christopher W. Steenwyk wrote: > This database is generated once, and then queried and interrogated multiple > times. So I am most concerned with read speed and not with writing or > updating. Ohohoho… in that case… I have some snake oil to sell you, Dear Sir! If your problem domain fits, I would suggest dimensional modeling: http://en.wikipedia.org/wiki/Dimensional_modeling The Kimball variety (ignore the Inmon heresy and associated charlatans): http://en.wikipedia.org/wiki/Ralph_Kimball Specifically a star schema (ignore anything reeking of snow flakes): http://en.wikipedia.org/wiki/Star_schema Follow the Kimball Scriptures to the letter: http://www.amazon.com/The-Data-Warehouse-Toolkit-Dimensional/dp/0471200247 (Until you have enough practical experience to break all the rules as you see fit, once you know better) Free advise… avoid the 'caterpillar syndrome'… a rookie deathtrap featuring an overly fat and wide fact table, with a multitude of teeny-weeny dimensions… http://en.wikipedia.org/wiki/Fact_table http://en.wikipedia.org/wiki/Dimension_table Analyze and correlate what belongs where… there is no silver bullet, but to understand your data first and foremost. May the Truth be with you. Go in peace and be merry. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
I just uploaded the output from sqlite3_analyze to dropbox. On Thu, Aug 8, 2013 at 9:40 AM, Christopher W. Steenwyk wrote: > Here is the data from stat1: > > "tbl", "idx", "stat" > "metrics", "metrics_idx", "68682102 2 2 2" > "metrics", "metrics_frame_idx", "68682102 2" > "metrics", "sqlite_autoindex_mobileye_ldw_metrics_1", "68682102 230 1" > "object_characteristics", "object_characteristics_idx", "1148344 164050 > 31899 1" > "object_characteristics", "sqlite_autoindex_object_characteristics_1", > "1148344 4 1" > "metadata", "metadata_type_value_idx", "510520 11873 2716 1 1" > "metadata", "metadata_idx", "510520 4 3 1 1" > "metadata", "sqlite_autoindex_metadata_1", "510520 4 3 1" > "attribute_values", "sqlite_autoindex_attribute_values_1", "198 1" > "attribute_types", "sqlite_autoindex_attribute_types_1", "50 1" > "frames", "frames_idx", "51896158 11492 4052 1" > "frames", "sqlite_autoindex_frames_1", "51896158 11492 4052 1" > "objects", "objects_grade_idx", "350060 14" > > I am working on running analyze. > > So, looking into EAV, I understand the drawbacks, but how would you > recommend me restructuring my schema to be more efficient? > > Perhaps have an "Objects" table that has a column per attribute, and then > a "frames" table that also contains the contents of metadata as separate > columns for each value? That would reduce me down to three tables. Or just > dump everything into one, massive table? > > This database is generated once, and then queried and interrogated > multiple times. So I am most concerned with read speed and not with writing > or updating. > > > On Wed, Aug 7, 2013 at 6:44 PM, Richard Hipp wrote: > >> On Wed, Aug 7, 2013 at 2:54 PM, Christopher W. Steenwyk < >> csteen...@gmail.com >> > wrote: >> >> > Hi, >> > >> > I have been working on a large database and its queries now for several >> > weeks and just can't figure out why my query is so slow. I've attached >> the >> > schema, my query, and the results of EXPLAIN QUERY from sqliteman. >> > >> > A few notes about this database: >> > Its approximately 10GB in size, but I have it on a SSD on a linux >> machine >> > with 12 GB of RAM on a 24 core PC. >> > >> > As for data in the tables... >> > 'Objects' has 35 rows >> > 'frames' has 51896158 rows >> > 'attribute_types' has 50 rows >> > 'attribute_values' has 200 rows >> > 'metrics' has 68682102 rows >> > >> > For every object there are approximately 5 rows that relate to it in >> > 'object_characteristics' and another 20 rows in 'metadata'. >> > >> > The attached query takes over 6 days to run. >> > >> > Any help or suggestions would be greatly appreciated. >> > >> >> Can you please send the output of ".dump sqlite_stat1". >> >> Also, if you can run sqlite3_analyzer on the database file and send us >> that, so much the better. >> >> Thanks. >> >> >> > >> > I noticed that the part of the query for 'frames' is not using a >> covering >> > index, cut I can't figure out why. I Was wondering if that is why it was >> > slow. In the query I am only referencing items that are within an index >> but >> > it keeps using the PRIMARY KEY. So that was one thought I had. >> > >> > Thanks in advance! >> > Chris >> > >> > ___ >> > sqlite-users mailing list >> > sqlite-users@sqlite.org >> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > >> > >> >> >> -- >> D. Richard Hipp >> d...@sqlite.org >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
Here is the data from stat1: "tbl", "idx", "stat" "metrics", "metrics_idx", "68682102 2 2 2" "metrics", "metrics_frame_idx", "68682102 2" "metrics", "sqlite_autoindex_mobileye_ldw_metrics_1", "68682102 230 1" "object_characteristics", "object_characteristics_idx", "1148344 164050 31899 1" "object_characteristics", "sqlite_autoindex_object_characteristics_1", "1148344 4 1" "metadata", "metadata_type_value_idx", "510520 11873 2716 1 1" "metadata", "metadata_idx", "510520 4 3 1 1" "metadata", "sqlite_autoindex_metadata_1", "510520 4 3 1" "attribute_values", "sqlite_autoindex_attribute_values_1", "198 1" "attribute_types", "sqlite_autoindex_attribute_types_1", "50 1" "frames", "frames_idx", "51896158 11492 4052 1" "frames", "sqlite_autoindex_frames_1", "51896158 11492 4052 1" "objects", "objects_grade_idx", "350060 14" I am working on running analyze. So, looking into EAV, I understand the drawbacks, but how would you recommend me restructuring my schema to be more efficient? Perhaps have an "Objects" table that has a column per attribute, and then a "frames" table that also contains the contents of metadata as separate columns for each value? That would reduce me down to three tables. Or just dump everything into one, massive table? This database is generated once, and then queried and interrogated multiple times. So I am most concerned with read speed and not with writing or updating. On Wed, Aug 7, 2013 at 6:44 PM, Richard Hipp wrote: > On Wed, Aug 7, 2013 at 2:54 PM, Christopher W. Steenwyk < > csteen...@gmail.com > > wrote: > > > Hi, > > > > I have been working on a large database and its queries now for several > > weeks and just can't figure out why my query is so slow. I've attached > the > > schema, my query, and the results of EXPLAIN QUERY from sqliteman. > > > > A few notes about this database: > > Its approximately 10GB in size, but I have it on a SSD on a linux machine > > with 12 GB of RAM on a 24 core PC. > > > > As for data in the tables... > > 'Objects' has 35 rows > > 'frames' has 51896158 rows > > 'attribute_types' has 50 rows > > 'attribute_values' has 200 rows > > 'metrics' has 68682102 rows > > > > For every object there are approximately 5 rows that relate to it in > > 'object_characteristics' and another 20 rows in 'metadata'. > > > > The attached query takes over 6 days to run. > > > > Any help or suggestions would be greatly appreciated. > > > > Can you please send the output of ".dump sqlite_stat1". > > Also, if you can run sqlite3_analyzer on the database file and send us > that, so much the better. > > Thanks. > > > > > > I noticed that the part of the query for 'frames' is not using a covering > > index, cut I can't figure out why. I Was wondering if that is why it was > > slow. In the query I am only referencing items that are within an index > but > > it keeps using the PRIMARY KEY. So that was one thought I had. > > > > Thanks in advance! > > Chris > > > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > > > > -- > D. Richard Hipp > d...@sqlite.org > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On Wed, 7 Aug 2013 23:13:41 +0200 Petite Abeille wrote: > On Aug 7, 2013, at 9:25 PM, Christopher W. Steenwyk > wrote: > > > Ah, sorry about the attachments, you can find the files here: > > https://www.dropbox.com/l/fpFfcjwwcyhXZgduswCggb > > Ah, also, your schema has a very, hmmm, Entity?attribute?value (EAV) > smell to it (object, attribute, types, values, characteristics, > metadata, etc, etc). Most likely not helping. Most assuredly not helping, because cannot help. What it does is make your queries vague and complex, and prevent the DBMS from doing its job (because you haven't expressed the rules of the data). ANALYZE is the last step on the road to good performance. The first step is analysis! --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On Wed, Aug 7, 2013 at 2:54 PM, Christopher W. Steenwyk wrote: > Hi, > > I have been working on a large database and its queries now for several > weeks and just can't figure out why my query is so slow. I've attached the > schema, my query, and the results of EXPLAIN QUERY from sqliteman. > > A few notes about this database: > Its approximately 10GB in size, but I have it on a SSD on a linux machine > with 12 GB of RAM on a 24 core PC. > > As for data in the tables... > 'Objects' has 35 rows > 'frames' has 51896158 rows > 'attribute_types' has 50 rows > 'attribute_values' has 200 rows > 'metrics' has 68682102 rows > > For every object there are approximately 5 rows that relate to it in > 'object_characteristics' and another 20 rows in 'metadata'. > > The attached query takes over 6 days to run. > > Any help or suggestions would be greatly appreciated. > Can you please send the output of ".dump sqlite_stat1". Also, if you can run sqlite3_analyzer on the database file and send us that, so much the better. Thanks. > > I noticed that the part of the query for 'frames' is not using a covering > index, cut I can't figure out why. I Was wondering if that is why it was > slow. In the query I am only referencing items that are within an index but > it keeps using the PRIMARY KEY. So that was one thought I had. > > Thanks in advance! > Chris > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On Aug 7, 2013, at 9:25 PM, Christopher W. Steenwyk wrote: > Ah, sorry about the attachments, you can find the files here: > https://www.dropbox.com/l/fpFfcjwwcyhXZgduswCggb Ah, also, your schema has a very, hmmm, Entity–attribute–value (EAV) smell to it (object, attribute, types, values, characteristics, metadata, etc, etc). Most likely not helping. Anyway, to each their own... ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On Aug 7, 2013, at 9:25 PM, Christopher W. Steenwyk wrote: > Ah, sorry about the attachments, you can find the files here: > https://www.dropbox.com/l/fpFfcjwwcyhXZgduswCggb > > And yes, as the final part of the DB creation I do run ANALYZE. And I do > think the indexes are correct for the query. Wild, random stab in the dark… I suppose this is an equivalent query: selectmetadata.attribute_value_id as metadata_value_id, object_characteristics.attribute_value_id as object_type_value_id, sum( case when metrics.color = 1 and metrics.quality > 0 then 1 end ) as pass, count( metrics.id ) as total from metadata join frames onframes.id between metadata.start_frame_id and metadata.stop_frame_id join metrics onmetrics.frame_id = frames.id join object_characteristics onobject_characteristics.object_id = metrics.object_id where frames.session_frame_id > 12 and frames.ticks > 10 and exists ( select 1 fromattribute_types where attribute_types.id = metadata.attribute_type_id and attribute_types.type = 'Metadata Type' ) and exists ( select 1 fromattribute_types where attribute_types.id = object_characteristics.attribute_type_id and attribute_types.type = 'Object Type' ) group by metadata.attribute_value_id, object_characteristics.attribute_value_id explain query plan: 0|0|0|SCAN TABLE metadata USING COVERING INDEX metadata_type_value_idx (~50 rows) 0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1 1|0|0|SEARCH TABLE attribute_types USING INTEGER PRIMARY KEY (rowid=?) (~1 rows) 0|1|2|SEARCH TABLE metrics USING INDEX metrics_idx (frame_id>? AND frame_idhttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
Ah, sorry about the attachments, you can find the files here: https://www.dropbox.com/l/fpFfcjwwcyhXZgduswCggb And yes, as the final part of the DB creation I do run ANALYZE. And I do think the indexes are correct for the query. On Wed, Aug 7, 2013 at 3:07 PM, Petite Abeille wrote: > > On Aug 7, 2013, at 8:54 PM, "Christopher W. Steenwyk" > wrote: > > > The attached query takes over 6 days to run. > > “Patience – A minor form of despair, disguised as a virtue.” > > Also… attachments are stripped out by the mailing list. You may want to > try to inline them instead. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 07/08/13 11:54, Christopher W. Steenwyk wrote: > I have been working on a large database and its queries now for > several weeks and just can't figure out why my query is so slow. I've > attached the schema, my query, and the results of EXPLAIN QUERY from > sqliteman. Attachments get stripped from the mailing list. You can put them somewhere like Dropbox. Did you run analyze? Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (GNU/Linux) iEYEARECAAYFAlICmtMACgkQmOOfHg372QSRLwCgkb0vJAQ/cnH+nr85W2PUJJrY U0kAoOPseXJlXtSqJw95tNgq1RUMHp37 =AvMF -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On 7 Aug 2013, at 7:54pm, Christopher W. Steenwyk wrote: > I've attached the > schema, my query, and the results of EXPLAIN QUERY from sqliteman. Sorry, attachments don't work on this list (we don't want everyone sending us their homework). Could you instead just post your SELECT command and the EXPLAIN QUERY for it ? First, have you done an ANALYZE ? Second, do you have, or think you have, indexes appropriate to your query ? Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query on large database Help
On Aug 7, 2013, at 8:54 PM, "Christopher W. Steenwyk" wrote: > The attached query takes over 6 days to run. “Patience – A minor form of despair, disguised as a virtue.” Also… attachments are stripped out by the mailing list. You may want to try to inline them instead. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Slow Query on large database Help
Hi, I have been working on a large database and its queries now for several weeks and just can't figure out why my query is so slow. I've attached the schema, my query, and the results of EXPLAIN QUERY from sqliteman. A few notes about this database: Its approximately 10GB in size, but I have it on a SSD on a linux machine with 12 GB of RAM on a 24 core PC. As for data in the tables... 'Objects' has 35 rows 'frames' has 51896158 rows 'attribute_types' has 50 rows 'attribute_values' has 200 rows 'metrics' has 68682102 rows For every object there are approximately 5 rows that relate to it in 'object_characteristics' and another 20 rows in 'metadata'. The attached query takes over 6 days to run. Any help or suggestions would be greatly appreciated. I noticed that the part of the query for 'frames' is not using a covering index, cut I can't figure out why. I Was wondering if that is why it was slow. In the query I am only referencing items that are within an index but it keeps using the PRIMARY KEY. So that was one thought I had. Thanks in advance! Chris ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
If your table doesn't have a primary key, this look like your table aren't normalized, maybe you can try broke this table in 2 tables like: Your definition: CREATE TABLE table1(id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year INTEGER); indexes : index1( name ), index2( id2 ), index3( name2 ); Will be: CREATE TABLE t_names(id INTEGER PRIMARY KEY, name TEXT); CREATE INDEX t_name_idx_001 ON (name, id); CREATE TABLE t_relation(id1 INTEGER REFERENCES t_name(id), id2 INTEGER REFERENCES t_name(id), year INTEGER); CREATE INDEX t_relation_idx_001 ON (year, id1, id2); CREATE INDEX t_relation_idx_002 ON (id1, id2); Where t_relation make the relationship between the 'names' (t_name). And the select: SELECT id, name, id2, name2, max(year) y FROM table1 GROUP BY id, name2 ORDER BY name2, y DESC LIMIT 0, 15 Will be: SELECT t2.* , t3.* , MAX(t1.year) AS y FROM t_relation AS t1 JOIN t_name AS t2 ON (t2.id = t1.id1) JOIN t_name AS t3 ON (t3.id = t1.id1) GROUP BY t2.id, t3.name ORDER BY t3.name, y DESC LIMIT 0, 15 And: SELECT id, name, id2, name2 FROM table1 GROUP BY id ORDER BY name2, year DESC LIMIT 0,15 Will be: SELECT t2.* , t3.* FROM t_relation AS t1 JOIN t_name AS t2 ON (t2.id = t1.id1) JOIN t_name AS t3 ON (t3.id = t1.id1) GROUP BY t2.id ORDER BY t3.name, t1.year DESC LIMIT 0, 15 to export the existing data to the new tables you can do: INSERT OR REPLACE INTO t_name SELECT DISTINCT * FROM (SELECT id, name FROM table1 UNION ALL SELECT id2, name2 FROM table1); INSERT OR REPLACE INTO t_relation SELECT id, id2, year FROM table1; ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
> The year can also be different but Im interested only in the latest year. I > use the GROUP because I want unique id and if I add year in GROUP BY I can > get it repeated. If you're interested in the latest year then your query is totally wrong because it returns random year. If you saw what you want so far then you were just lucky. Try to change your query like this: SELECT id, name, id2, name2, max(year) y GROUP BY id, name2 ORDER BY name2, y DESC LIMIT 0, 15 Pavel On Wed, Jun 30, 2010 at 6:21 AM, J. Rios wrote: > On Wed, Jun 30, 2010 at 2:40 AM, Pavel Ivanov wrote: > >> > The query return the apropiate values as always the id -> id2 relation is >> > the same and id -> name and id2 -> name2. >> >> So your id maps uniquely to id2, name and name2. But what about year? >> What value of year do you want to be used in sorting? >> >> Anyway try to change query as "GROUP BY id, name2, year". If your id >> maps uniquely to the year it won't affect results and along with >> Simon's suggestion it should speed up your query. If your id doesn't >> map uniquely to year then you are sorting by random number, so you >> better remove that from ORDER BY clause. >> >> >> Pavel >> >> > There are many entries with the same id and id2 because there are other > fields with different data but I talked about the ones I need in this query. > The year can also be different but Im interested only in the latest year. I > use the GROUP because I want unique id and if I add year in GROUP BY I can > get it repeated. > > Guess my best option is to use a temp table. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
On Wed, Jun 30, 2010 at 2:40 AM, Pavel Ivanov wrote: > > The query return the apropiate values as always the id -> id2 relation is > > the same and id -> name and id2 -> name2. > > So your id maps uniquely to id2, name and name2. But what about year? > What value of year do you want to be used in sorting? > > Anyway try to change query as "GROUP BY id, name2, year". If your id > maps uniquely to the year it won't affect results and along with > Simon's suggestion it should speed up your query. If your id doesn't > map uniquely to year then you are sorting by random number, so you > better remove that from ORDER BY clause. > > > Pavel > > There are many entries with the same id and id2 because there are other fields with different data but I talked about the ones I need in this query. The year can also be different but Im interested only in the latest year. I use the GROUP because I want unique id and if I add year in GROUP BY I can get it repeated. Guess my best option is to use a temp table. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
> The query return the apropiate values as always the id -> id2 relation is > the same and id -> name and id2 -> name2. So your id maps uniquely to id2, name and name2. But what about year? What value of year do you want to be used in sorting? Anyway try to change query as "GROUP BY id, name2, year". If your id maps uniquely to the year it won't affect results and along with Simon's suggestion it should speed up your query. If your id doesn't map uniquely to year then you are sorting by random number, so you better remove that from ORDER BY clause. Pavel On Tue, Jun 29, 2010 at 7:02 PM, J. Rios wrote: > The query return the apropiate values as always the id -> id2 relation is > the same and id -> name and id2 -> name2. I keep them in the same table to > speed up other queries that are now very quick as uses indexes for the > ordering but in this SELECT the GROUP BY makes the difference and the SORT > is getting slow as its not using the index. I have read that sqlite only > uses one Index by query. > There must be a solution but I dont get it. > > Thanks in advance > > On Tue, Jun 29, 2010 at 9:49 AM, Pavel Ivanov wrote: > >> > SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT >> 0, >> > 15 >> > >> > How can I make it faster? >> >> First of all your query should return nonsense in any field except id. >> I bet it will also return different results (for the same ids) >> depending on what LIMIT clause you add or don't add it at all. >> >> But to make this particular query faster you should have an index on >> (name2, year). Note: index on both fields, not 2 different indexes on >> each field. >> >> >> Pavel >> >> On Tue, Jun 29, 2010 at 2:24 AM, J. Rios wrote: >> > I have the next table >> > >> > table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year >> INTEGER >> > ); >> > >> > I have created the next indexes : index1( name ), index2( id2 ), index3( >> > name2 ); >> > >> > The database have about 200,000 records. >> > The next query takes about 2 seconds and I think its too much. >> > >> > SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT >> 0, >> > 15 >> > >> > If I remove the sorting condition the query is instantaneous. >> > >> > How can I make it faster? >> > >> > Thanks in advance >> > J.Rios >> > ___ >> > sqlite-users mailing list >> > sqlite-users@sqlite.org >> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
On 30 Jun 2010, at 12:32am, J. Rios wrote: > On Wed, Jun 30, 2010 at 12:13 AM, Simon Slavin wrote: >> >> A good index would be >> >> name2, year, id >> >> That's all three columns in the same index, not three separate indexes one >> on each column. > > I did the test and EXPLAIN QUERY PLAN tells its going to use still the id > index and not the new created one and still its too slow. Ooh. Interesting. I wonder which part of your SELECT is causing that. Perhaps the GROUP clause is changing how SQLite is working out what to do. Sorry for giving you useless information. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
On Wed, Jun 30, 2010 at 12:13 AM, Simon Slavin wrote: > > None of your indexes are much use for this SELECT command. Imagine trying > to do the SELECT command yourself and you'll quickly see that whichever of > the supplied indexes you use you're left trying to sort a great deal of > records by hand. A good index would be > > name2, year, id > > That's all three columns in the same index, not three separate indexes one > on each column. > > Simon. > I did the test and EXPLAIN QUERY PLAN tells its going to use still the id index and not the new created one and still its too slow. J.Rios ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
On 30 Jun 2010, at 12:05am, J. Rios wrote: > Sorry for posting on top of the message, No problem. Adding new text at the bottom of a post, and trimming what you quote just enough that people understand your new text, make your own message clear and encourage people to reply to it. Look at what I did: there's no point in quoting your .sig so I didn't. There's no point in quoting the post you're replying to so I didn't. I chopped both those pieces out. > This is my original post with the query you request > > I have the next table > > table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year INTEGER > ); > > I have created the next indexes : index1( name ), index2( id2 ), index3( > name2 ); > > The database have about 200,000 records. > The next query takes about 2 seconds and I think its too much. Agreed. > SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT 0, > 15 None of your indexes are much use for this SELECT command. Imagine trying to do the SELECT command yourself and you'll quickly see that whichever of the supplied indexes you use you're left trying to sort a great deal of records by hand. A good index would be name2, year, id That's all three columns in the same index, not three separate indexes one on each column. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
On Wed, Jun 30, 2010 at 12:02 AM, Simon Slavin wrote: > > On 29 Jun 2010, at 11:57pm, J. Rios wrote: > > >>> I have created the next indexes : index1( name ), index2( id2 ), > index3( > >>> name2 ); > > Those are very unlikely to be of any use at all. They're probably a waste > of space. > > > Its not the primary Key. There are more fields but the index on id is > > created also. Sorry I missed it in the post. > > > > If I do a EXPLAIN QUERY PLAN sqlite tells me its going to use the id > INDEX. > > But the sorting is slow. > > Please post your SELECT command again. It seems like none of the indexes > you've added are being used. And please add your new text /below/ the parts > of other posts that you quote. English is read from top to bottom. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > Sorry for posting on top of the message, This is my original post with the query you request I have the next table table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year INTEGER ); I have created the next indexes : index1( name ), index2( id2 ), index3( name2 ); The database have about 200,000 records. The next query takes about 2 seconds and I think its too much. SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT 0, 15 If I remove the sorting condition the query is instantaneous. How can I make it faster? Thanks in advance J.Rios ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
The query return the apropiate values as always the id -> id2 relation is the same and id -> name and id2 -> name2. I keep them in the same table to speed up other queries that are now very quick as uses indexes for the ordering but in this SELECT the GROUP BY makes the difference and the SORT is getting slow as its not using the index. I have read that sqlite only uses one Index by query. There must be a solution but I dont get it. Thanks in advance On Tue, Jun 29, 2010 at 9:49 AM, Pavel Ivanov wrote: > > SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT > 0, > > 15 > > > > How can I make it faster? > > First of all your query should return nonsense in any field except id. > I bet it will also return different results (for the same ids) > depending on what LIMIT clause you add or don't add it at all. > > But to make this particular query faster you should have an index on > (name2, year). Note: index on both fields, not 2 different indexes on > each field. > > > Pavel > > On Tue, Jun 29, 2010 at 2:24 AM, J. Rios wrote: > > I have the next table > > > > table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year > INTEGER > > ); > > > > I have created the next indexes : index1( name ), index2( id2 ), index3( > > name2 ); > > > > The database have about 200,000 records. > > The next query takes about 2 seconds and I think its too much. > > > > SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT > 0, > > 15 > > > > If I remove the sorting condition the query is instantaneous. > > > > How can I make it faster? > > > > Thanks in advance > > J.Rios > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
On 29 Jun 2010, at 11:57pm, J. Rios wrote: >>> I have created the next indexes : index1( name ), index2( id2 ), index3( >>> name2 ); Those are very unlikely to be of any use at all. They're probably a waste of space. > Its not the primary Key. There are more fields but the index on id is > created also. Sorry I missed it in the post. > > If I do a EXPLAIN QUERY PLAN sqlite tells me its going to use the id INDEX. > But the sorting is slow. Please post your SELECT command again. It seems like none of the indexes you've added are being used. And please add your new text /below/ the parts of other posts that you quote. English is read from top to bottom. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
Its not the primary Key. There are more fields but the index on id is created also. Sorry I missed it in the post. If I do a EXPLAIN QUERY PLAN sqlite tells me its going to use the id INDEX. But the sorting is slow. Thanks in advance On Tue, Jun 29, 2010 at 5:32 PM, Jim Morris wrote: > You also haven't specified a primary key, i.e. on "id" > > On 6/28/2010 11:24 PM, J. Rios wrote: > > I have the next table > > > > table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year > INTEGER > > ); > > > > I have created the next indexes : index1( name ), index2( id2 ), index3( > > name2 ); > > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
You also haven't specified a primary key, i.e. on "id" On 6/28/2010 11:24 PM, J. Rios wrote: > I have the next table > > table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year INTEGER > ); > > I have created the next indexes : index1( name ), index2( id2 ), index3( > name2 ); > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow query
> SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT 0, > 15 > > How can I make it faster? First of all your query should return nonsense in any field except id. I bet it will also return different results (for the same ids) depending on what LIMIT clause you add or don't add it at all. But to make this particular query faster you should have an index on (name2, year). Note: index on both fields, not 2 different indexes on each field. Pavel On Tue, Jun 29, 2010 at 2:24 AM, J. Rios wrote: > I have the next table > > table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year INTEGER > ); > > I have created the next indexes : index1( name ), index2( id2 ), index3( > name2 ); > > The database have about 200,000 records. > The next query takes about 2 seconds and I think its too much. > > SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT 0, > 15 > > If I remove the sorting condition the query is instantaneous. > > How can I make it faster? > > Thanks in advance > J.Rios > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Slow query
I have the next table table1( id INTEGER, name VARCHAR, id2 INTEGER, name2 VARCHAR, year INTEGER ); I have created the next indexes : index1( name ), index2( id2 ), index3( name2 ); The database have about 200,000 records. The next query takes about 2 seconds and I think its too much. SELECT id, name, id2, name2 GROUP BY id ORDER BY name2, year DESC LIMIT 0, 15 If I remove the sorting condition the query is instantaneous. How can I make it faster? Thanks in advance J.Rios ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query with LEFT OUTER JOIN
On Tue, 23 Sep 2008 14:37:11 -0400, Enrique Ramirez wrote: >On Tue, Sep 23, 2008 at 1:16 PM, Kees Nuyt <[EMAIL PROTECTED]> wrote: >> Steps to take (you need all of them, except 1): >> >> 1) Use v6.2.3 >> > >Probably meant to say 1) Use v3.6.2? Oops, yes. Or even better: v3.6.3 -- ( Kees Nuyt ) c[_] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query with LEFT OUTER JOIN
On Tue, Sep 23, 2008 at 1:16 PM, Kees Nuyt <[EMAIL PROTECTED]> wrote: > Steps to take (you need all of them, except 1): > > 1) Use v6.2.3 > Probably meant to say 1) Use v3.6.2? -- // -- Enrique Ramirez Irizarry Lead Developer Indie Code Labs http://www.indiecodelabs.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slow Query with LEFT OUTER JOIN
On Mon, 22 Sep 2008 13:48:42 -0700, Jason wrote: >Hello everyone, > >Hoping that I could get some help with a performance problem. >Using version 3.5.2 > >Here are the tables: >CREATE TABLE Objects (ObjectId INTEGER PRIMARY KEY, Name TEXT) >CREATE TABLE Keywords4Objects (ObjectId INTEGER, KeywordId INTEGER) >CREATE TABLE Keywords (KeywordId INTEGER PRIMARY KEY, Keyword TEXT NOT NULL >COLLATE NOCASE UNIQUE) > >The Query: >SELECT DISTINCT o.ObjectId, o.Name >FROM Objects o >LEFT OUTER JOIN Keywords4Objects ok ON ok.ObjectId = >o.ObjectId >LEFT OUTER JOIN Keywords k ON k.KeywordId = ok.KeywordId >WHERE >k.Keyword LIKE 'abc' OR o.Name LIKE 'abc' > >Initially, I was just using a join, but objects without keywords were not >getting picked up. When I switched to Left Outer Joins. I got the objects >that don't have keywords, but the query is unbearably slow. It went from >being instantaneous to over a minute. > >Any ideas? The only reason you need to use LEFT OUTER JOINs here is the ``OR o.Name LIKE 'abc'`` construct. LEFT OUTER JOIN makes your intermediate result sets much too large. Also, you use LIKE with an search expression without any wildcard characters. LIKE doesn't use indexes, = does. Warning: I have no experience with COLLATE NOCASE, so I'll ignore that in my 'solution'. Steps to take (you need all of them, except 1): 1) Use v6.2.3 2) Load Keywords.Keyword with all lower case. Now you can use k.Keyword = 'abc'. 3) Load Objects.Name with all lower case, or add an extra column Objects.Keyword as a lower case version of Objects.Name if you need to keep the original case in Objects.Name. Now you can use o.Name = 'abc'. 4) Add an index on Objects.Keyword (not unique, of course). 5) Use INNER JOIN to collect matches via Keywords4Objects 6) Add the matches on Objects.Name (or Objects.Keyword) with a UNION. 7) Wrap the SELECT ... JOIN UNION SELECT ... in a SELECT DISTINCT to get unique results. 8) You probably don't use 'abc' but some variable in your host language. Due to 2) and 3) you will have to lower case your search expression as well. (untested). >Thanks in advance. HTH -- ( Kees Nuyt ) c[_] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Slow Query with LEFT OUTER JOIN
Hello everyone, Hoping that I could get some help with a performance problem. Using version 3.5.2 Here are the tables: CREATE TABLE Objects (ObjectId INTEGER PRIMARY KEY, Name TEXT) CREATE TABLE Keywords4Objects (ObjectId INTEGER, KeywordId INTEGER) CREATE TABLE Keywords (KeywordId INTEGER PRIMARY KEY, Keyword TEXT NOT NULL COLLATE NOCASE UNIQUE) The Query: SELECT DISTINCT o.ObjectId, o.Name FROM Objects o LEFT OUTER JOIN Keywords4Objects ok ON ok.ObjectId = o.ObjectId LEFT OUTER JOIN Keywords k ON k.KeywordId = ok.KeywordId WHERE k.Keyword LIKE 'abc' OR o.Name LIKE 'abc' Initially, I was just using a join, but objects without keywords were not getting picked up. When I switched to Left Outer Joins. I got the objects that don't have keywords, but the query is unbearably slow. It went from being instantaneous to over a minute. Any ideas? Thanks in advance. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
RE: [sqlite] Slow query on one machine
I can't do anything like that as it is a machine of a customer and I have no access to it. Just wondering what possibly could explain such a difference. The relevant thing is that are load of queries coming after this and they are all fine. So it specific to that particular table at that particular point on that particular machine, all else behaves normal. RBS -Original Message- From: Jay Sprenkle [mailto:[EMAIL PROTECTED] Sent: 19 January 2008 14:12 To: sqlite-users@sqlite.org Subject: Re: [sqlite] Slow query on one machine On Jan 18, 2008 3:32 PM, RB Smissaert <[EMAIL PROTECTED]> wrote: > The application > that runs this is exactly the same on both machines. The slow machine is > actually slightly slower specification wise, but that can't explain the huge > differences in timings. > Have you run spinrite ( a disk diagnostic/maintenance program ) on the slow machine? - To unsubscribe, send email to [EMAIL PROTECTED] - - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] Slow query on one machine
On Jan 18, 2008 3:32 PM, RB Smissaert <[EMAIL PROTECTED]> wrote: > The application > that runs this is exactly the same on both machines. The slow machine is > actually slightly slower specification wise, but that can't explain the huge > differences in timings. > Have you run spinrite ( a disk diagnostic/maintenance program ) on the slow machine? - To unsubscribe, send email to [EMAIL PROTECTED] -
[sqlite] Slow query on one machine
Trying to figure out why (with one particular customer) some queries have very different timings on one machine compared to another machine. It has to do with updating one particular SQLite table with more recent data from an Interbase table. I give the database (S for SQLite and I for Interbase), SQL query and time in seconds of slow machine and fast machine: DB|Query|slow PC|fast PC - S|SELECT MAX(ENTRY_ID) FROM ENTRY_ATTRIBUTES|0.26|8.46 I|SELECT ENTRY_ID, ATTRIBUTE_TYPE, TYPE_SPECIFIC_INFO, NUMERIC_VALUE FROM ENTRY_ATTRIBUTE WHERE ENTRY_ID > 15085882|0.05|0.07 S|INSERT OR IGNORE INTO ENTRY_ATTRIBUTES VALUES(?,?,?,?)|7.51|0.14 S|analyze ENTRY_ATTRIBUTES|431.96|0.03 All I can think of is that somehow there is something wrong with the SQLite table ENTRY_ATTRIBUTES on the slow machine and I suggested dropping that table and making a new one, but that made no difference. The application that runs this is exactly the same on both machines. The slow machine is actually slightly slower specification wise, but that can't explain the huge differences in timings. I think the only way for me to find is to get hold of that SQLite file, but for now having some trouble getting hold of this file. Any ideas? RBS - To unsubscribe, send email to [EMAIL PROTECTED] -
RE: [sqlite] slow query
Ignore this as I found out what the trouble was. I dropped 2 indices from the table ADDRESSLINK as they were on fields where the values were nearly all the same. Forgot now what the technical term for it is, but it solved it all and now very fast. Great software this SQLite! RBS -Original Message- From: RB Smissaert [mailto:[EMAIL PROTECTED] Sent: 26 November 2006 10:04 To: sqlite-users@sqlite.org Subject: [sqlite] slow query Although SQLite seems very fast, I now have come across a query that runs extremely slow and I would be interested why this is: There are 3 tables involved, all fairly small, some 25000 rows: PATIENT, ADDRESS and ADDRESSLINK All the relevant fields are indexed. This is the query: SELECT P.PATIENT_ID, P.OLD_EXTERNAL_NO, P.FORENAME_1, P.SURNAME, P.DATE_OF_BIRTH, P.GENDER_TYPE, A.ADDRESS_LINE_2, P.REGISTERED_GP FROM PATIENT P LEFT JOIN ADDRESSLINK AL ON (P.PATIENT_ID = AL.FOREIGN_ID AND AL.ADDRESS_TYPE = 1 AND AL.ADDRESS_LINK_TYPE = 2) LEFT JOIN ADDRESS A ON (A.ADDRESS_ID = AL.ADDRESS_ID) WHERE P.MAIN_REG_TYPE = 1 AND A.ADDRESS_LINE_1 LIKE '%ave%' ORDER BY P.PATIENT_ID ASC These are the table definitions, where the numbers are the rootpage: table ADDRESS ADDRESS 2 CREATE TABLE [ADDRESS] ([ADDRESS_ID] INTEGER, [ADDRESS_LINE_1] TEXT, [POSTCODE] TEXT, [UPDATED_DATE] INTEGER, [ADDRESS_LINE_2] TEXT, [ADDRESS_LINE_3] TEXT, [ADDRESS_LINE_4] TEXT, [ADDRESS_LINE_5] TEXT) index IDX0ADDRESS ADDRESS 312 CREATE UNIQUE INDEX IDX0ADDRESS ON ADDRESS(ADDRESS_ID) index IDX1ADDRESS ADDRESS 362 CREATE INDEX IDX1ADDRESS ON ADDRESS(ADDRESS_LINE_1) index IDX2ADDRESS ADDRESS 411 CREATE INDEX IDX2ADDRESS ON ADDRESS(POSTCODE) index IDX3ADDRESS ADDRESS 493 CREATE INDEX IDX3ADDRESS ON ADDRESS(UPDATED_DATE) index IDX4ADDRESS ADDRESS 551 CREATE INDEX IDX4ADDRESS ON ADDRESS(ADDRESS_LINE_2) index IDX5ADDRESS ADDRESS 665 CREATE INDEX IDX5ADDRESS ON ADDRESS(ADDRESS_LINE_3) index IDX6ADDRESS ADDRESS 753 CREATE INDEX IDX6ADDRESS ON ADDRESS(ADDRESS_LINE_4) index IDX7ADDRESS ADDRESS 856 CREATE INDEX IDX7ADDRESS ON ADDRESS(ADDRESS_LINE_5) table ADDRESSLINK ADDRESSLINK 899 CREATE TABLE [ADDRESSLINK] ([ADDRESS_LINK_ID] INTEGER, [ADDRESS_ID] INTEGER, [FOREIGN_ID] INTEGER, [ADDRESS_TYPE] INTEGER, [ADDRESS_LINK_TYPE] INTEGER) index IDX0ADDRESSLINK ADDRESSLINK 1033CREATE UNIQUE INDEX IDX0ADDRESSLINK ON ADDRESSLINK(ADDRESS_LINK_ID) index IDX1ADDRESSLINK ADDRESSLINK CREATE INDEX IDX1ADDRESSLINK ON ADDRESSLINK(ADDRESS_ID) index IDX2ADDRESSLINK ADDRESSLINK 1184CREATE INDEX IDX2ADDRESSLINK ON ADDRESSLINK(FOREIGN_ID) index IDX3ADDRESSLINK ADDRESSLINK 1253CREATE INDEX IDX3ADDRESSLINK ON ADDRESSLINK(ADDRESS_TYPE) index IDX4ADDRESSLINK ADDRESSLINK 1319CREATE INDEX IDX4ADDRESSLINK ON ADDRESSLINK(ADDRESS_LINK_TYPE) table PATIENT PATIENT 313541 CREATE TABLE [PATIENT] ([PATIENT_ID] INTEGER, [USUALLY_SEEN_GP] INTEGER, [ADDED_BY] INTEGER, [UPDATED_BY] INTEGER, [FHSA_ID] INTEGER, [DHA_ID] INTEGER, [REGISTERED_GP] INTEGER, [ROUTE_MARKER_TYPE] INTEGER, [MAIN_REG_TYPE] INTEGER, [GENDER_TYPE] INTEGER, [SURGERY_ID] INTEGER, [ACCEPTANCE_TYPE] INTEGER, [RESIDENTIAL_INSTITUTE] INTEGER, [QUALIFIER_REG_TYPE] INTEGER, [CHI_NUMBER] TEXT, [DATE_OF_BIRTH] INTEGER, [FORENAME_1] TEXT, [FORENAME_2] TEXT, [NHS_NUMBER] TEXT, [SURNAME] TEXT, [DISPENSING_FLAG] INTEGER, [ALIAS] TEXT, [RURAL_MILEAGE] INTEGER, [WALKING_UNITS] INTEGER, [PREVIOUS_NHS_NUMBER] TEXT, [PREVIOUS_SURNAME] TEXT, [UPDATED_DATE] INTEGER, [PREVIOUS_DATE_OF_BIRTH] INTEGER, [MARITAL_STATUS_TYPE] INTEGER, [DATE_OF_DEATH_ACCURACY] INTEGER, [DATE_OF_DEATH] INTEGER, [TITLE_TEXT] TEXT, [REGISTRATION_SOURCE] INTEGER, [REGISTRATION_START_DATE] INTEGER, [REGISTRATION_REMOVAL_SOURCE] INTEGER, [REGISTRATION_REMOVAL_REASON] INTEGER, [REGISTRATION_ACCEPTANCE_DATE] INTEGER, [REGISTRATION_REMOVAL_DATE] INTEGER, [REGISTRATION_CONFIRMED] INTEGER, [REGISTRATION_END_DATE] INTEGER, [SECURE_FLAG] INTEGER, [DORMANT_FLAG] INTEGER, [MEDICAL_RECORDS_FLAG] INTEGER, [MRF_DATE] INTEGER, [LINK_STATUS] INTEGER, [ACCEPTANCE_TRANS_NO] INTEGER, [DEDUCTION_TRANS_NO] INTEGER, [OLD_INTERNAL_NO] INTEGER, [OLD_EXTERNAL_NO] TEXT, [STAFF_MEMBER_FLAG] INTEGER, [PREVIOUS_SURNAME2] TEXT, [MOD_OLD_SERVICE_NO] TEXT, [PREVIOUS_CHI_NUMBER] TEXT, [FOOTPATH_MILES] TEXT, [WATER_MILES] TEXT, [BIRTH_SURNAME] TEXT, [OTHER_FORENAMES] TEXT, [SCRIPT_DISPOSAL_DEST] INTEGER, [FP69_STATUS] INTEGER, [FP69_DATE_AMENDED] INTEGER, [FP69_DATE_DEDUCTED] INTEGER, [MRF_FORWARDED_DATE] INTEGER, [DEDUCTION_REQUESTED] INTEGER, [FP69_EXPIRY_DATE] INTEGER, [FP69_REASON] INTEGER, [FP69_FHSA_NOTES] TEXT, [RESPONSIBLE_HA] INTEGER, [CARER_FLAG] INTEGER, [DEDUCTION_PENDING] INTEGER, [BULK_TRANSFER_DATE] INTEGER) index IDX0PATIENT PATIENT 314906 CREATE UNIQUE INDEX IDX0PATIENT ON PATIENT(PATIENT_ID) index IDX1PATIENT PATIENT 314979 CREATE INDEX IDX1PATIENT ON PATIENT(USUAL
[sqlite] slow query
Although SQLite seems very fast, I now have come across a query that runs extremely slow and I would be interested why this is: There are 3 tables involved, all fairly small, some 25000 rows: PATIENT, ADDRESS and ADDRESSLINK All the relevant fields are indexed. This is the query: SELECT P.PATIENT_ID, P.OLD_EXTERNAL_NO, P.FORENAME_1, P.SURNAME, P.DATE_OF_BIRTH, P.GENDER_TYPE, A.ADDRESS_LINE_2, P.REGISTERED_GP FROM PATIENT P LEFT JOIN ADDRESSLINK AL ON (P.PATIENT_ID = AL.FOREIGN_ID AND AL.ADDRESS_TYPE = 1 AND AL.ADDRESS_LINK_TYPE = 2) LEFT JOIN ADDRESS A ON (A.ADDRESS_ID = AL.ADDRESS_ID) WHERE P.MAIN_REG_TYPE = 1 AND A.ADDRESS_LINE_1 LIKE '%ave%' ORDER BY P.PATIENT_ID ASC These are the table definitions, where the numbers are the rootpage: table ADDRESS ADDRESS 2 CREATE TABLE [ADDRESS] ([ADDRESS_ID] INTEGER, [ADDRESS_LINE_1] TEXT, [POSTCODE] TEXT, [UPDATED_DATE] INTEGER, [ADDRESS_LINE_2] TEXT, [ADDRESS_LINE_3] TEXT, [ADDRESS_LINE_4] TEXT, [ADDRESS_LINE_5] TEXT) index IDX0ADDRESS ADDRESS 312 CREATE UNIQUE INDEX IDX0ADDRESS ON ADDRESS(ADDRESS_ID) index IDX1ADDRESS ADDRESS 362 CREATE INDEX IDX1ADDRESS ON ADDRESS(ADDRESS_LINE_1) index IDX2ADDRESS ADDRESS 411 CREATE INDEX IDX2ADDRESS ON ADDRESS(POSTCODE) index IDX3ADDRESS ADDRESS 493 CREATE INDEX IDX3ADDRESS ON ADDRESS(UPDATED_DATE) index IDX4ADDRESS ADDRESS 551 CREATE INDEX IDX4ADDRESS ON ADDRESS(ADDRESS_LINE_2) index IDX5ADDRESS ADDRESS 665 CREATE INDEX IDX5ADDRESS ON ADDRESS(ADDRESS_LINE_3) index IDX6ADDRESS ADDRESS 753 CREATE INDEX IDX6ADDRESS ON ADDRESS(ADDRESS_LINE_4) index IDX7ADDRESS ADDRESS 856 CREATE INDEX IDX7ADDRESS ON ADDRESS(ADDRESS_LINE_5) table ADDRESSLINK ADDRESSLINK 899 CREATE TABLE [ADDRESSLINK] ([ADDRESS_LINK_ID] INTEGER, [ADDRESS_ID] INTEGER, [FOREIGN_ID] INTEGER, [ADDRESS_TYPE] INTEGER, [ADDRESS_LINK_TYPE] INTEGER) index IDX0ADDRESSLINK ADDRESSLINK 1033CREATE UNIQUE INDEX IDX0ADDRESSLINK ON ADDRESSLINK(ADDRESS_LINK_ID) index IDX1ADDRESSLINK ADDRESSLINK CREATE INDEX IDX1ADDRESSLINK ON ADDRESSLINK(ADDRESS_ID) index IDX2ADDRESSLINK ADDRESSLINK 1184CREATE INDEX IDX2ADDRESSLINK ON ADDRESSLINK(FOREIGN_ID) index IDX3ADDRESSLINK ADDRESSLINK 1253CREATE INDEX IDX3ADDRESSLINK ON ADDRESSLINK(ADDRESS_TYPE) index IDX4ADDRESSLINK ADDRESSLINK 1319CREATE INDEX IDX4ADDRESSLINK ON ADDRESSLINK(ADDRESS_LINK_TYPE) table PATIENT PATIENT 313541 CREATE TABLE [PATIENT] ([PATIENT_ID] INTEGER, [USUALLY_SEEN_GP] INTEGER, [ADDED_BY] INTEGER, [UPDATED_BY] INTEGER, [FHSA_ID] INTEGER, [DHA_ID] INTEGER, [REGISTERED_GP] INTEGER, [ROUTE_MARKER_TYPE] INTEGER, [MAIN_REG_TYPE] INTEGER, [GENDER_TYPE] INTEGER, [SURGERY_ID] INTEGER, [ACCEPTANCE_TYPE] INTEGER, [RESIDENTIAL_INSTITUTE] INTEGER, [QUALIFIER_REG_TYPE] INTEGER, [CHI_NUMBER] TEXT, [DATE_OF_BIRTH] INTEGER, [FORENAME_1] TEXT, [FORENAME_2] TEXT, [NHS_NUMBER] TEXT, [SURNAME] TEXT, [DISPENSING_FLAG] INTEGER, [ALIAS] TEXT, [RURAL_MILEAGE] INTEGER, [WALKING_UNITS] INTEGER, [PREVIOUS_NHS_NUMBER] TEXT, [PREVIOUS_SURNAME] TEXT, [UPDATED_DATE] INTEGER, [PREVIOUS_DATE_OF_BIRTH] INTEGER, [MARITAL_STATUS_TYPE] INTEGER, [DATE_OF_DEATH_ACCURACY] INTEGER, [DATE_OF_DEATH] INTEGER, [TITLE_TEXT] TEXT, [REGISTRATION_SOURCE] INTEGER, [REGISTRATION_START_DATE] INTEGER, [REGISTRATION_REMOVAL_SOURCE] INTEGER, [REGISTRATION_REMOVAL_REASON] INTEGER, [REGISTRATION_ACCEPTANCE_DATE] INTEGER, [REGISTRATION_REMOVAL_DATE] INTEGER, [REGISTRATION_CONFIRMED] INTEGER, [REGISTRATION_END_DATE] INTEGER, [SECURE_FLAG] INTEGER, [DORMANT_FLAG] INTEGER, [MEDICAL_RECORDS_FLAG] INTEGER, [MRF_DATE] INTEGER, [LINK_STATUS] INTEGER, [ACCEPTANCE_TRANS_NO] INTEGER, [DEDUCTION_TRANS_NO] INTEGER, [OLD_INTERNAL_NO] INTEGER, [OLD_EXTERNAL_NO] TEXT, [STAFF_MEMBER_FLAG] INTEGER, [PREVIOUS_SURNAME2] TEXT, [MOD_OLD_SERVICE_NO] TEXT, [PREVIOUS_CHI_NUMBER] TEXT, [FOOTPATH_MILES] TEXT, [WATER_MILES] TEXT, [BIRTH_SURNAME] TEXT, [OTHER_FORENAMES] TEXT, [SCRIPT_DISPOSAL_DEST] INTEGER, [FP69_STATUS] INTEGER, [FP69_DATE_AMENDED] INTEGER, [FP69_DATE_DEDUCTED] INTEGER, [MRF_FORWARDED_DATE] INTEGER, [DEDUCTION_REQUESTED] INTEGER, [FP69_EXPIRY_DATE] INTEGER, [FP69_REASON] INTEGER, [FP69_FHSA_NOTES] TEXT, [RESPONSIBLE_HA] INTEGER, [CARER_FLAG] INTEGER, [DEDUCTION_PENDING] INTEGER, [BULK_TRANSFER_DATE] INTEGER) index IDX0PATIENT PATIENT 314906 CREATE UNIQUE INDEX IDX0PATIENT ON PATIENT(PATIENT_ID) index IDX1PATIENT PATIENT 314979 CREATE INDEX IDX1PATIENT ON PATIENT(USUALLY_SEEN_GP) index IDX2PATIENT PATIENT 315044 CREATE INDEX IDX2PATIENT ON PATIENT(REGISTERED_GP) index IDX3PATIENT PATIENT 315109 CREATE INDEX IDX3PATIENT ON PATIENT(MAIN_REG_TYPE) index IDX4PATIENT PATIENT 315173 CREATE INDEX IDX4PATIENT ON PATIENT(DATE_OF_BIRTH) index IDX5PATIENT PATIENT 315255 CREATE INDEX IDX5PATIENT ON PATIENT(NHS_NUMBER) index IDX6PATIENT PATIENT 315370 CREATE INDEX IDX6PATIENT ON PATIENT(UPDATED_D
Re: [sqlite] Slow query after reboot
> What is the name of your database (with extension please) ? > There are chances that such extension is registered into Windows XP > crap called System Restore and each time on start it detect that this > file is changed and create a restore snapshot for it. List given at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sr/sr/mon itored_file_extensions.asp Hugh
Re: [sqlite] Slow query after reboot
Geoff Simonds wrote: Thanks to everyone for all the help on this problem. I am going to try creating a new thread to touch the tables at startup. Chris Schirlinger wrote: We have the same issue, to get around it we fire a thread when the program starts, intelligently "touching" every table that the user is likely to access (As Michael Sizaki already mentioned a select count(last_column) from big_table; will do it) Since a user is very unlikely to run a program and start doing anything complex within the first 10 to 20 seconds, by the time they DO try and run a query, it is the "second time" the query has been run and this we get the benefit of the Windows XP cache and thus the fast speed I have created a client application that is always running on a users desktop. The application accepts user input and then uses SQLite to perform a few simple queries against a single db file that contains 4 tables. The performance is fantastic after the initial install and normal usage. When the user leaves for the night and tries a query in the morning, the app hangs for 20 seconds and then finally comes back with the results. If the user then duplicates the query immediately afterward, the query is almost instantaneous. In addition, if at any point the user reboots the machine and then retries the query, the same delay happens. The time is spent in the SQLiteDataReader.read() method. Does anybody have any thoughts on why this is happening? Thanks for any help. What is the name of your database (with extension please) ? There are chances that such extension is registered into Windows XP crap called System Restore and each time on start it detect that this file is changed and create a restore snapshot for it. Regards Boguslaw Brandys
Re: [sqlite] Slow query after reboot
Thanks to everyone for all the help on this problem. I am going to try creating a new thread to touch the tables at startup. Chris Schirlinger wrote: We have the same issue, to get around it we fire a thread when the program starts, intelligently "touching" every table that the user is likely to access (As Michael Sizaki already mentioned a select count(last_column) from big_table; will do it) Since a user is very unlikely to run a program and start doing anything complex within the first 10 to 20 seconds, by the time they DO try and run a query, it is the "second time" the query has been run and this we get the benefit of the Windows XP cache and thus the fast speed I have created a client application that is always running on a users desktop. The application accepts user input and then uses SQLite to perform a few simple queries against a single db file that contains 4 tables. The performance is fantastic after the initial install and normal usage. When the user leaves for the night and tries a query in the morning, the app hangs for 20 seconds and then finally comes back with the results. If the user then duplicates the query immediately afterward, the query is almost instantaneous. In addition, if at any point the user reboots the machine and then retries the query, the same delay happens. The time is spent in the SQLiteDataReader.read() method. Does anybody have any thoughts on why this is happening? Thanks for any help.
Re: [sqlite] Slow query after reboot
We have the same issue, to get around it we fire a thread when the program starts, intelligently "touching" every table that the user is likely to access (As Michael Sizaki already mentioned a select count(last_column) from big_table; will do it) Since a user is very unlikely to run a program and start doing anything complex within the first 10 to 20 seconds, by the time they DO try and run a query, it is the "second time" the query has been run and this we get the benefit of the Windows XP cache and thus the fast speed > I have created a client application that is always running on a users > desktop. The application accepts user input and then uses SQLite to > perform a few simple queries against a single db file that contains 4 > tables. The performance is fantastic after the initial install and > normal usage. When the user leaves for the night and tries a query in > the morning, the app hangs for 20 seconds and then finally comes back > with the results. If the user then duplicates the query immediately > afterward, the query is almost instantaneous. In addition, if at any > point the user reboots the machine and then retries the query, the same > delay happens. The time is spent in the SQLiteDataReader.read() > method. Does anybody have any thoughts on why this is happening? > Thanks for any help. > >
Re: [sqlite] Slow query after reboot
Geoff Simonds wrote: The app is running on Windows XP machines Is it possible that indexing services are enabled and XP is trying to index the database file?
Re[2]: [sqlite] Slow query after reboot
Hello Clark, I don't use a virus scanner and observe the same slow initial performance. I do believe it's possible it's the disk cache though, my disks are quite fast (15K SCSI). It acts more like some initialization phase in the DB than it does disk IO delay. C Thursday, January 19, 2006, 1:57:57 PM, you wrote: CC> Also possible, maybe even likely, is the user's anti-virus CC> software is scanning the DB file when the app is first opened. CC> 35MB is a big file for A-V to have to scan. CC> You or they may be able to configure the A-V to ignore the DB file. CC> -Clark CC> - Original Message CC> From: Geoff Simonds <[EMAIL PROTECTED]> CC> To: sqlite-users@sqlite.org CC> Sent: Thursday, January 19, 2006 7:52:55 AM CC> Subject: Re: [sqlite] Slow query after reboot CC> My table contains about 500,000 rows and 4 columns, not all that much CC> data. The overall size of the db file is 35 mb. Does 15 - 20 seconds CC> sound right to load from disk into memory? CC> Robert Simpson wrote: >> - Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> >> >>> >> >>> The app is running on Windows XP machines and I assume that disk >>> files are cached. The strange thing is that the time it takes for >>> the initial read into RAM after install and first use is >>> significantly shorter than after a reboot. For example, if you just >>> installed the app and start it, the first time you do a query you see >>> results in about 2 seconds. Subsequent queries come back much almost >>> instantaneously. If the user reboots the machine or waits until the >>> next day and performs the same query, it now takes about 15 seconds. >>> After the 15 seconds, results come back and subsequent queries are >>> instantaneous. I am not sure if this has anything to do with it but >>> the app is a Deskband that lives in the taskbar on windows. >> >> >> >> That's not so strange, really. When the app is installed (along with >> the database), the Windows disk cache probably has at least part of >> the database file cached -- afterall it just got finished writing it. >> >> Robert >> >> >> >> >> -- Best regards, Tegmailto:[EMAIL PROTECTED]
Re: [sqlite] Slow query after reboot
Also possible, maybe even likely, is the user's anti-virus software is scanning the DB file when the app is first opened. 35MB is a big file for A-V to have to scan. You or they may be able to configure the A-V to ignore the DB file. -Clark - Original Message From: Geoff Simonds <[EMAIL PROTECTED]> To: sqlite-users@sqlite.org Sent: Thursday, January 19, 2006 7:52:55 AM Subject: Re: [sqlite] Slow query after reboot My table contains about 500,000 rows and 4 columns, not all that much data. The overall size of the db file is 35 mb. Does 15 - 20 seconds sound right to load from disk into memory? Robert Simpson wrote: > - Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> > >> > >> The app is running on Windows XP machines and I assume that disk >> files are cached. The strange thing is that the time it takes for >> the initial read into RAM after install and first use is >> significantly shorter than after a reboot. For example, if you just >> installed the app and start it, the first time you do a query you see >> results in about 2 seconds. Subsequent queries come back much almost >> instantaneously. If the user reboots the machine or waits until the >> next day and performs the same query, it now takes about 15 seconds. >> After the 15 seconds, results come back and subsequent queries are >> instantaneous. I am not sure if this has anything to do with it but >> the app is a Deskband that lives in the taskbar on windows. > > > > That's not so strange, really. When the app is installed (along with > the database), the Windows disk cache probably has at least part of > the database file cached -- afterall it just got finished writing it. > > Robert > > > > >
Re: [sqlite] Slow query after reboot
Thanks for the info and suggestions Michael. I will give this a try. Michael Sizaki wrote: Geoff Simonds wrote: My table contains about 500,000 rows and 4 columns, not all that much data. The overall size of the db file is 35 mb. Does 15 - 20 seconds sound right to load from disk into memory? Yes it does. The problem is, that your query is probably not reading sequentially from disk. Therefore the disk head has to jump forth and back. Once the entire database in in the OS disk cache, queries are fast, because it's only CPU bound and not disk bound anymore. To speedup the initial access, you can: - read the entire file once before you start your query - run the following query (once) select count(last_column) from big_table; this will touch each record in a kind of optimal order - if that is still slow, try VACUUM on your database. This brings the records in a natural order. I have an application that deals also with about 500,000 and the database size is about 100mb. Queries on a "cold" database are extremely slow Michael Robert Simpson wrote: - Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> The app is running on Windows XP machines and I assume that disk files are cached. The strange thing is that the time it takes for the initial read into RAM after install and first use is significantly shorter than after a reboot. For example, if you just installed the app and start it, the first time you do a query you see results in about 2 seconds. Subsequent queries come back much almost instantaneously. If the user reboots the machine or waits until the next day and performs the same query, it now takes about 15 seconds. After the 15 seconds, results come back and subsequent queries are instantaneous. I am not sure if this has anything to do with it but the app is a Deskband that lives in the taskbar on windows. That's not so strange, really. When the app is installed (along with the database), the Windows disk cache probably has at least part of the database file cached -- afterall it just got finished writing it. Robert
Re: [sqlite] Slow query after reboot
Geoff Simonds wrote: My table contains about 500,000 rows and 4 columns, not all that much data. The overall size of the db file is 35 mb. Does 15 - 20 seconds sound right to load from disk into memory? Yes it does. The problem is, that your query is probably not reading sequentially from disk. Therefore the disk head has to jump forth and back. Once the entire database in in the OS disk cache, queries are fast, because it's only CPU bound and not disk bound anymore. To speedup the initial access, you can: - read the entire file once before you start your query - run the following query (once) select count(last_column) from big_table; this will touch each record in a kind of optimal order - if that is still slow, try VACUUM on your database. This brings the records in a natural order. I have an application that deals also with about 500,000 and the database size is about 100mb. Queries on a "cold" database are extremely slow Michael Robert Simpson wrote: - Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> The app is running on Windows XP machines and I assume that disk files are cached. The strange thing is that the time it takes for the initial read into RAM after install and first use is significantly shorter than after a reboot. For example, if you just installed the app and start it, the first time you do a query you see results in about 2 seconds. Subsequent queries come back much almost instantaneously. If the user reboots the machine or waits until the next day and performs the same query, it now takes about 15 seconds. After the 15 seconds, results come back and subsequent queries are instantaneous. I am not sure if this has anything to do with it but the app is a Deskband that lives in the taskbar on windows. That's not so strange, really. When the app is installed (along with the database), the Windows disk cache probably has at least part of the database file cached -- afterall it just got finished writing it. Robert
Re: [sqlite] Slow query after reboot
- Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> My table contains about 500,000 rows and 4 columns, not all that much data. The overall size of the db file is 35 mb. Does 15 - 20 seconds sound right to load from disk into memory? I can't tell you that until the following are answered: 1. What's the query? 2. What's the database schema? Indexes? 3. Are you calling prepare/step/finalize to load the data or using some other method? 4. How are you storing the data in memory? linked lists? Pre-allocated array? The first two are the most important factors. Robert
Re: [sqlite] Slow query after reboot
My table contains about 500,000 rows and 4 columns, not all that much data. The overall size of the db file is 35 mb. Does 15 - 20 seconds sound right to load from disk into memory? Robert Simpson wrote: - Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> The app is running on Windows XP machines and I assume that disk files are cached. The strange thing is that the time it takes for the initial read into RAM after install and first use is significantly shorter than after a reboot. For example, if you just installed the app and start it, the first time you do a query you see results in about 2 seconds. Subsequent queries come back much almost instantaneously. If the user reboots the machine or waits until the next day and performs the same query, it now takes about 15 seconds. After the 15 seconds, results come back and subsequent queries are instantaneous. I am not sure if this has anything to do with it but the app is a Deskband that lives in the taskbar on windows. That's not so strange, really. When the app is installed (along with the database), the Windows disk cache probably has at least part of the database file cached -- afterall it just got finished writing it. Robert
Re: [sqlite] Slow query after reboot
- Original Message - From: "Geoff Simonds" <[EMAIL PROTECTED]> The app is running on Windows XP machines and I assume that disk files are cached. The strange thing is that the time it takes for the initial read into RAM after install and first use is significantly shorter than after a reboot. For example, if you just installed the app and start it, the first time you do a query you see results in about 2 seconds. Subsequent queries come back much almost instantaneously. If the user reboots the machine or waits until the next day and performs the same query, it now takes about 15 seconds. After the 15 seconds, results come back and subsequent queries are instantaneous. I am not sure if this has anything to do with it but the app is a Deskband that lives in the taskbar on windows. That's not so strange, really. When the app is installed (along with the database), the Windows disk cache probably has at least part of the database file cached -- afterall it just got finished writing it. Robert
Re: [sqlite] Slow query after reboot
The app is running on Windows XP machines and I assume that disk files are cached. The strange thing is that the time it takes for the initial read into RAM after install and first use is significantly shorter than after a reboot. For example, if you just installed the app and start it, the first time you do a query you see results in about 2 seconds. Subsequent queries come back much almost instantaneously. If the user reboots the machine or waits until the next day and performs the same query, it now takes about 15 seconds. After the 15 seconds, results come back and subsequent queries are instantaneous. I am not sure if this has anything to do with it but the app is a Deskband that lives in the taskbar on windows. Jay Sprenkle wrote: On 1/19/06, Geoff Simonds <[EMAIL PROTECTED]> wrote: I have created a client application that is always running on a users desktop. The application accepts user input and then uses SQLite to perform a few simple queries against a single db file that contains 4 tables. The performance is fantastic after the initial install and normal usage. When the user leaves for the night and tries a query in the morning, the app hangs for 20 seconds and then finally comes back with the results. If the user then duplicates the query immediately afterward, the query is almost instantaneous. In addition, if at any point the user reboots the machine and then retries the query, the same delay happens. The time is spent in the SQLiteDataReader.read() method. Does anybody have any thoughts on why this is happening? Thanks for any help. Does your system cache disk files? The first time it reads the file into cache and after that it's read from RAM.
Re: [sqlite] Slow query after reboot
On 1/19/06, Geoff Simonds <[EMAIL PROTECTED]> wrote: > I have created a client application that is always running on a users > desktop. The application accepts user input and then uses SQLite to > perform a few simple queries against a single db file that contains 4 > tables. The performance is fantastic after the initial install and > normal usage. When the user leaves for the night and tries a query in > the morning, the app hangs for 20 seconds and then finally comes back > with the results. If the user then duplicates the query immediately > afterward, the query is almost instantaneous. In addition, if at any > point the user reboots the machine and then retries the query, the same > delay happens. The time is spent in the SQLiteDataReader.read() > method. Does anybody have any thoughts on why this is happening? > Thanks for any help. Does your system cache disk files? The first time it reads the file into cache and after that it's read from RAM.
[sqlite] Slow query after reboot
I have created a client application that is always running on a users desktop. The application accepts user input and then uses SQLite to perform a few simple queries against a single db file that contains 4 tables. The performance is fantastic after the initial install and normal usage. When the user leaves for the night and tries a query in the morning, the app hangs for 20 seconds and then finally comes back with the results. If the user then duplicates the query immediately afterward, the query is almost instantaneous. In addition, if at any point the user reboots the machine and then retries the query, the same delay happens. The time is spent in the SQLiteDataReader.read() method. Does anybody have any thoughts on why this is happening? Thanks for any help.