Re: [sqlite] Questions about "analyze"
On Oct 20, 2008, at 2:57 AM, Clodo wrote: > Thanks for you answer MikeW. >> Could you put the update records into their own separate table > I already solved my problem using the newest "INDEXED BY". > I'm trying to understand if is a SqLite limit or bug. It's an unfortunate edge case alright. SQLite's query optimizer is very simple. Running ANALYZE sometimes improves the decisions it makes, but also sometimes not. If you must use ANALYZE, overriding it's decision using INDEXED BY is probably your best solution at this point. As you are aware, this is brand new in 3.6.4, so your queries will be syntax errors > And i'm trying to understand your answer about the hashes :( SQLite doesn't use hash indexes. And I'm not sure that it would make any difference to this kind of case if it did. (it's not a crazy suggestion though - a lot of database systems do use hash indexes). > For the moment, here you can found the comparison between SqLite, > MySql > and Oracle9 with the minimal full test-case. > Only sqlite do a full-table-scan after an "analyze". But i'm not 100% > sure about the right syntax for MySql and Oracle. > --- > SqLite (from new empty database) > > SQL> CREATE TABLE mytest > ( > mycodeint NOT NULL, > mymarkint NOT NULL > ); > SQL> insert into mytest select 0,0; > SQL> insert into mytest select * from mytest; > -- rerun the lastest 21 times, until i reach 2 millions of records. > SQL> create index myindex on mytest (mymark); > SQL> explain query plan select mycode from mytest where mymark=1; > detail: TABLE mytest WITH INDEX myindex > SQL> analyze > SQL> explain query plan select mycode from mytest where mymark=1; > detail: TABLE mytest > --- > Oracle: > > SQL> CREATE TABLE mytest > ( > mycodeint NOT NULL, > mymarkint NOT NULL > ); > SQL> insert into mytest select 0,0 from dual; > SQL> insert into mytest select * from mytest; > -- rerun the lastest 21 times, until i reach 2 millions of records. > SQL> create index myindex on mytest (mymark) tablespace users; > -- with sqlite-autotrace enabled > SQL> select mycode from mytest where mymark=1; > Execution Plan > 0 SELECT STATEMENT Optimizer=CHOOSE > 10 TABLE ACCESS (BY INDEX ROWID) OF 'MYTEST' > 21 INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) > SQL> analyze table mytest compute statistics; > SQL> explain query plan select mycode from mytest where mymark=1; > Execution Plan > 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=2) > 10 TABLE ACCESS (BY INDEX ROWID) OF 'MYTEST' (Cost=4 Card=1 > Bytes=2) > 21 INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) (Cost=3 > Card=1) > > --- > MySql: > > SQL> CREATE TABLE mytest > ( > mycodeint NOT NULL, > mymarkint NOT NULL > ); > SQL> insert into mytest select 0,0; > SQL> insert into mytest select * from mytest; > -- rerun the lastest 21 times, until i reach 2 millions of records. > SQL> create index myindex on mytest (mymark); > SQL> explain select mycode from mytest where mymark=1; > id select_type table type possible_keys key > key_len ref rows Extra > 1SIMPLEmytestrefmyindexmyindex4const1 > SQL> analyze table mytest; > SQL> explain select mycode from mytest where mymark=1; > id select_type table type possible_keys key > key_len ref rows Extra > 1SIMPLEmytestrefmyindexmyindex4const1 > > >> Don't have the source to hand, but it's possible that SQLite does >> indexes >> by hash, in which case it wouldn't "know" that the field values are >> all >> the same, just that they had the same hash-value, hence it will have >> to do a linear search, which it will always have to do if hashes >> are the same. >> >> Could you put the update records into their own separate table, >> then move them into the main one later - that would save all the >> slow accesses. >> >> Regards, >> MikeW >> >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> >> > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] string values are messed up with ODBC and prepared statements
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Guenther Schmidt wrote: > when I use inserts through prepared statements on windows with ODBC > all text field values are messed up. Either the values you supply to the ODBC driver, or the ODBC driver itself is at fault (most likely the former). (SQLite fundamentally does work correctly with all the documented types including strings.) Start with http://article.gmane.org/gmane.comp.db.sqlite.general/41828 You can try simple tests such as executing these queries and seeing what you get back: SELECT 3 SELECT 'abc' SELECT 'abcæ' SELECT 'abcædef' SELECT 'ரறல' Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkj8ApQACgkQmOOfHg372QS+NACdE0eIDqUUm67KZ/0nxYhEEOSe oz0AnAz5/0pdlcA8kc5AEoOfUNrxx0M8 =lxAN -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] How to get the total row number o f a table by Sqlite more efficient?
HI all, I want to get the total rows of a table, here are my methods: 1、 int total_row; sqlite3_prepare(db,"select count(ColName) from tablename",-1,,0); sqlite3_step(stmt); total_row = sqlite3_column_int(stmt, 0); 2、 sqlite3_gettable((db, "select * from tablename", , , , ); I can get the total rows of a table with this tow way, but the time use does not meet the requirement. Test in my embedded Linux on Coldfire MCF5407 with a table of 3 records, It spends about 50ms to get the rownum with first way, and the second way is more poor. Is there some more efficient methods or API interface with Sqlite?Because my program invoke this interface very frequency. Thanks very much! yoky ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Questions about "analyze"
> Hi Clodo, > > I believe Sqlite uses B-trees, unless one links in the R-tree module or > uses the FTS feature (Full Text Search). > > I don't know almost nothing about the internal implementation of a database engine, i need to study to understand your reply :) > I believe you wrote earlier that the new "INDEXED BY" feature solved > your problem, but you saw something in the documentation that dissuaded > you. Do you mind saying just what it was that seemed problematic? > > Sorry but i have some difficult to understand your english... if i understand your message correctly, i actually use "INDEXED BY" to force the use of the index, but this page: http://www.sqlite.org/lang_indexedby.html say: "Some SQL database engines provide non-standard "hint" mechanisms which can be used to give the query optimizer clues about what indices it should use for a particular statement. The INDEX BY clause of SQLite is /not/ a hinting mechanism and it should not be used as such." [...] "The INDEXED BY clause is /not/ intended for use in tuning the preformance of a query. The intent of the INDEXED BY clause is to raise a run-time error if a schema change". For that i'm looking for another better solution. > To mention a variation of MikeW's suggestion, since you tend to have > millions of records, with often only a few requiring an update, you > might want to keep a small table containing only the record numbers of > the rows that need updating. The increase in disk space sounds like it > would be minimal. > Yes, - an additional table is a solution; - using timestamp "last_check" and not a boolean is another solution, because with a "variation" of values the index are used also after an analyze; - and maybe exists other solutions. But... i'm optimizing/tuning my software, and i discover this problem "random" during hundred of "explain query"... i'm trying to understand the "logic" (if exists) behind this problem, to understand if other query with a similar problem can be also affected. And also for giving feedback, for improving SqLite, -IF- this is a bug :) Thanks for you reply! Bye ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Best tutorial on sqlite/relational databases?
On 10/19/08, circ ular <[EMAIL PROTECTED]> wrote: > I have googled a lot but haven't found any really good tutorials on > sqlite and/or relational databases in general. You haven't Googled (TM) well enough then. There are lots and lots of tutorials on rdbms, probably as many as those for building cgi web counters. Whether they are "really" good or not, well, you will have to peruse them and decide. The SQLite website and wiki are good sources of SQLite-specific stuff as well. This mailing list's archive is a fantastic resource. There is a SQLite book out there somewhere that many talk highly about (search on your favorite book store's website). -- Puneet Kishor http://punkish.eidesis.org/ Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/ Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Help with design of mini-search-engine-db!?
On 10/19/08, circ ular <[EMAIL PROTECTED]> wrote: > I am writing a mini-search engine. > > I have a bunch of posts from a forum that I want to index so I can search it. > > So I want to be able to do lookups on words and then get every post it > has been seen in(I have parsed all this already). > > So when I do : > quickQuery' conn "SELECT * from proglangs where id == 'java'" [] > it should return a list of the posts. > > or should i have one table for each keyword!? seems expensive. > > is it possible to have table-elements that point to other tables? I would use the right tool for the right job. For the kind of job described above, full-text search would be appropriated, and wait, it has already been built by folks from that small search company out West. Implement FTS3. -- Puneet Kishor http://punkish.eidesis.org/ Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/ Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Best tutorial on sqlite/relational databases?
I have googled a lot but haven't found any really good tutorials on sqlite and/or relational databases in general. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Help with design of mini-search-engine-db!?
I am writing a mini-search engine. I have a bunch of posts from a forum that I want to index so I can search it. So I want to be able to do lookups on words and then get every post it has been seen in(I have parsed all this already). So when I do : quickQuery' conn "SELECT * from proglangs where id == 'java'" [] it should return a list of the posts. or should i have one table for each keyword!? seems expensive. is it possible to have table-elements that point to other tables? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to speed up read-only databases?
The sqlite3.exe program is set up as a utility and maintenance tool, not a production environment and is designed to that end. If you want maximum performance it is not the way to go; instead embed the Sqlite calls inside your application and optimize access. If you are performing ad-hoc DB tasks then it or one of the many similar function Ssqlite tools are appropriate. One is the Firefox plug in. You can imagine that having to compile the SQL for over and over instead of storing and re-using the compiled code adds considerably to overhead on frequently run jobs.. JS Christophe Leske wrote: > John, > > thanks for your suggestions, but i am wondering if any of your > suggestions can be applied to the sqlite3.exe command line application? > > > >>Prepare your statements only once and then use bind. Do not use >> > > How is this done? Can the command line executable be modified in such a way? > > >>sqlite3_exec. Do not open and close the DB for each read, instead open >>once and let the cache work. > > That´s already done. > >>Avoid row scans by defining indices. > > Already there. > >> Use >>the new index selection functionality to force the use of the best >>index. > > Would you care to provide more information about this? A simple link > would be enough. > > >>Place large and less frequently accessed columns at the end of >>the Sqlite rows. >> > > Ok, thanks. Will do. However, i read out the whole row all the time. > BTW, does it help to specify which coloumns i would like instead of all? > I would like to get all minus one coloumn in general. > >>When you obey these rules you will get very good read perfprmance from >>Sqlite. The cacheing is important if you are using a slow disk or flash >>memory. Look at shared cache mode if you have multiple users > > Nope, just one from DVD. > > Thanks again, > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to speed up read-only databases?
John, thanks for your suggestions, but i am wondering if any of your suggestions can be applied to the sqlite3.exe command line application? > Prepare your statements only once and then use bind. Do not use > How is this done? Can the command line executable be modified in such a way? > sqlite3_exec. Do not open and close the DB for each read, instead open > once and let the cache work. That´s already done. > Avoid row scans by defining indices. Already there. > Use > the new index selection functionality to force the use of the best > index. Would you care to provide more information about this? A simple link would be enough. > Place large and less frequently accessed columns at the end of > the Sqlite rows. > Ok, thanks. Will do. However, i read out the whole row all the time. BTW, does it help to specify which coloumns i would like instead of all? I would like to get all minus one coloumn in general. > When you obey these rules you will get very good read perfprmance from > Sqlite. The cacheing is important if you are using a slow disk or flash > memory. Look at shared cache mode if you have multiple users Nope, just one from DVD. Thanks again, -- Christophe Leske www.multimedial.de - [EMAIL PROTECTED] http://www.linkedin.com/in/multimedial Lessingstr. 5 - 40227 Duesseldorf - Germany 0211 261 32 12 - 0177 249 70 31 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Questions about "analyze"
Hi Clodo, I believe Sqlite uses B-trees, unless one links in the R-tree module or uses the FTS feature (Full Text Search). I believe you wrote earlier that the new "INDEXED BY" feature solved your problem, but you saw something in the documentation that dissuaded you. Do you mind saying just what it was that seemed problematic? To mention a variation of MikeW's suggestion, since you tend to have millions of records, with often only a few requiring an update, you might want to keep a small table containing only the record numbers of the rows that need updating. The increase in disk space sounds like it would be minimal. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Questions about "analyze"
Thanks for you answer MikeW. >Could you put the update records into their own separate table I already solved my problem using the newest "INDEXED BY". I'm trying to understand if is a SqLite limit or bug. And i'm trying to understand your answer about the hashes :( For the moment, here you can found the comparison between SqLite, MySql and Oracle9 with the minimal full test-case. Only sqlite do a full-table-scan after an "analyze". But i'm not 100% sure about the right syntax for MySql and Oracle. --- SqLite (from new empty database) SQL> CREATE TABLE mytest ( mycodeint NOT NULL, mymarkint NOT NULL ); SQL> insert into mytest select 0,0; SQL> insert into mytest select * from mytest; -- rerun the lastest 21 times, until i reach 2 millions of records. SQL> create index myindex on mytest (mymark); SQL> explain query plan select mycode from mytest where mymark=1; detail: TABLE mytest WITH INDEX myindex SQL> analyze SQL> explain query plan select mycode from mytest where mymark=1; detail: TABLE mytest --- Oracle: SQL> CREATE TABLE mytest ( mycodeint NOT NULL, mymarkint NOT NULL ); SQL> insert into mytest select 0,0 from dual; SQL> insert into mytest select * from mytest; -- rerun the lastest 21 times, until i reach 2 millions of records. SQL> create index myindex on mytest (mymark) tablespace users; -- with sqlite-autotrace enabled SQL> select mycode from mytest where mymark=1; Execution Plan 0 SELECT STATEMENT Optimizer=CHOOSE 10 TABLE ACCESS (BY INDEX ROWID) OF 'MYTEST' 21 INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) SQL> analyze table mytest compute statistics; SQL> explain query plan select mycode from mytest where mymark=1; Execution Plan 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=2) 10 TABLE ACCESS (BY INDEX ROWID) OF 'MYTEST' (Cost=4 Card=1 Bytes=2) 21 INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) (Cost=3 Card=1) --- MySql: SQL> CREATE TABLE mytest ( mycodeint NOT NULL, mymarkint NOT NULL ); SQL> insert into mytest select 0,0; SQL> insert into mytest select * from mytest; -- rerun the lastest 21 times, until i reach 2 millions of records. SQL> create index myindex on mytest (mymark); SQL> explain select mycode from mytest where mymark=1; id select_type table type possible_keys key key_len ref rows Extra 1SIMPLEmytestrefmyindexmyindex4const1 SQL> analyze table mytest; SQL> explain select mycode from mytest where mymark=1; id select_type table type possible_keys key key_len ref rows Extra 1SIMPLEmytestrefmyindexmyindex4const1 > Don't have the source to hand, but it's possible that SQLite does indexes > by hash, in which case it wouldn't "know" that the field values are all > the same, just that they had the same hash-value, hence it will have > to do a linear search, which it will always have to do if hashes are the same. > > Could you put the update records into their own separate table, > then move them into the main one later - that would save all the slow > accesses. > > Regards, > MikeW > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multiple Selects
MikeW wrote: > Andrew Gatt <[EMAIL PROTECTED]> writes: > > >> I'm not sure if i'm missing something, but is there an efficient way of >> retrieving multiple rows based on different conditions in order. For >> example i have a table with rows of ids, i want to select multiple rows >> at a time. At present i am doing a "SELECT name FROM table WHERE id = x" >> for each row i want and then stitching it all together. But i'm finding >> this is quite slow even on a moderately small database (2000 entries). >> >> I'm guessing my SQL is the worst way of doing things so i've been trying >> to find a better method. I stumbled across "SELECT name FROM table WHERE >> id IN (x,y,z) however this doesn't allow me to specify the order the >> rows are returned, which i must have. >> >> The only other option i can find is using UNION ALL in between multiple >> SELECT statements, but would this give me a large performance increase >> over doing this progammatically as i've got it? >> >> Unless i've missed something obvious which could well be the case! >> >> Andrew >> > > Since performance is the issue, wouldn't it just be better to get all the > rows 'as they come' - into a hash or similar lookup-optimised structure, > and then get them from that in the required order ? > > I would have thought that all those DB queries would be slow > when you can get a smaller subset into an efficiently accessed > container in memory ... > > Incidentally, I trust your query is > "SELECT name FROM table WHERE id = ?" and you are doing a sqlite3_bind() > each time, no need to re-prepare the statement every time !! > > Regards, > MikeW > > > Thanks for everyones suggestions. The order of the ids will be chosen by the user at run time and so will be truly arbitrary. As this could be run in several threads at the same time on the database i'm not sure i'd like to add temporary tables or extra columns - i guess i would have to create temporary tables with different names to avoid conflicts? The bind api is not something i'd come across and will look into. With the index in place accessing the database in the way described is an order of magnitude faster (which is probably obvious to most people) and will suffice for now. Andrew ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Questions about "analyze"
Fabrizio Carimati <[EMAIL PROTECTED]> writes: > > Hi to all, > I have a table with many record, that have a field normally with value '0'. > Occasionally i update some records by setting the field to '1' to mark > it, and after in a background job, i localized them for working on it. > For that, i have an index on that field. > My problem: if i run an "analyze" when all records have the field equal > to '0', the localization (select .. where field != 0) don't use the > index, and do a full-table-scan. > If at least one record is set to '1', the index are used. > > It's seem that "analyze" discover that all records have same field > value, so mark the index "useless". > > Solutions that i know: > - run "analyze" after updating to '1' some records. :( > - maintain at least one fake record with '1' :( > - never launch "analyze" :( > > Or... (please help!) > - exists some options to control this behiavour of analyze? > - it's possible to force the use of the index? > > Thanks for feedback! > Don't have the source to hand, but it's possible that SQLite does indexes by hash, in which case it wouldn't "know" that the field values are all the same, just that they had the same hash-value, hence it will have to do a linear search, which it will always have to do if hashes are the same. Could you put the update records into their own separate table, then move them into the main one later - that would save all the slow accesses. Regards, MikeW ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multiple Selects
Andrew Gatt <[EMAIL PROTECTED]> writes: > > I'm not sure if i'm missing something, but is there an efficient way of > retrieving multiple rows based on different conditions in order. For > example i have a table with rows of ids, i want to select multiple rows > at a time. At present i am doing a "SELECT name FROM table WHERE id = x" > for each row i want and then stitching it all together. But i'm finding > this is quite slow even on a moderately small database (2000 entries). > > I'm guessing my SQL is the worst way of doing things so i've been trying > to find a better method. I stumbled across "SELECT name FROM table WHERE > id IN (x,y,z) however this doesn't allow me to specify the order the > rows are returned, which i must have. > > The only other option i can find is using UNION ALL in between multiple > SELECT statements, but would this give me a large performance increase > over doing this progammatically as i've got it? > > Unless i've missed something obvious which could well be the case! > > Andrew Since performance is the issue, wouldn't it just be better to get all the rows 'as they come' - into a hash or similar lookup-optimised structure, and then get them from that in the required order ? I would have thought that all those DB queries would be slow when you can get a smaller subset into an efficiently accessed container in memory ... Incidentally, I trust your query is "SELECT name FROM table WHERE id = ?" and you are doing a sqlite3_bind() each time, no need to re-prepare the statement every time !! Regards, MikeW ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users