Re: [sqlite] Optimizing an insert/update
Michael Miller wrote:
> "If the string doesn't exist in the table, create a new row with
> the string in the first column and 1 in the second column. If
> the string does exist in the table, increment the second column
> by 1"

One way you can do it is to insert everything into a temporary
table which doesn't require the string to be unique, and then
aggregate the duplicates with a group by:

  create table mystrings (string text primary key, count integer);
  create table mytemp (string text, count integer);
  .import mydata.txt mytemp
  insert into mystrings (string, count)
    select string, sum(count) from mytemp group by string;

As a complement to "insert or replace" it would be nice to have
"insert or accumulate" for this kind of job. However, that would
go outside the SQL standard.

-- 
Lars Aronsson ([EMAIL PROTECTED])
Aronsson Datateknik - http://aronsson.se

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
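The temp-table approach above can be sketched end to end with Python's standard sqlite3 module (the shell's .import command has no direct Python equivalent, so a small hypothetical data set stands in for mydata.txt):

```python
import sqlite3

# In-memory database for the sketch; the original post loads
# mydata.txt into mytemp with the sqlite3 shell's .import instead.
con = sqlite3.connect(":memory:")
con.execute("create table mystrings (string text primary key, count integer)")
con.execute("create table mytemp (string text, count integer)")

# Hypothetical sample rows standing in for mydata.txt.
rows = [("apple", 1), ("banana", 1), ("apple", 1), ("apple", 1)]
con.executemany("insert into mytemp (string, count) values (?, ?)", rows)

# Aggregate duplicates once, then insert into the unique-keyed table;
# this sidesteps the per-row "insert or accumulate" the post wishes for.
con.execute(
    """insert into mystrings (string, count)
       select string, sum(count) from mytemp group by string"""
)

print(dict(con.execute("select string, count from mystrings")))
# apple maps to 3, banana to 1
```

Because mytemp has no uniqueness constraint, the bulk load never has to check for conflicts; the single group-by pass does all the accumulation at the end.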
Re: [sqlite] How long time to index this table
Gilles Ganault wrote:
> But then, how many people use SQLite to handle 68 million rows
> in a table?

That's a good question. I don't know, and I don't know if there
is a recommended size for SQLite databases. But I was able to
create the index in 12 minutes after I set the right cache_size.

The usage I foresee for SQLite is that structured data is
provided for download and personal analysis, rather than as XML
dumps or Excel spreadsheets. And today I can easily download a 3
gigabyte database file, which is only 600 megabytes compressed.
Over on the unison mailing list (Unison is a tool for two-way
file synchronization, similar to rsync) we're constantly amazed
that people complain they are unable to transfer terabytes in a
single transfer. Who could have imagined. Maybe Gordon Moore.

My 68 million rows come from the Swedish Wikipedia, which is my
small experimentation base before I try this on the full-size
German or English Wikipedia. But I might have to back down to
the Estonian or Faroese Wikipedia to get an even smaller dataset.

-- 
Lars Aronsson ([EMAIL PROTECTED])
Aronsson Datateknik - http://aronsson.se
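The cache_size fix mentioned above can be sketched with Python's sqlite3 module. The value 200000 pages and the column layout of the counts table are assumptions for illustration, not figures from the post:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stands in for the 3.1 GB database file

# Hypothetical schema matching the post's description:
# 4 columns, 3 integers and one short text string.
con.execute(
    "create table counts (rev integer, page integer, name text, total integer)"
)

# SQLite's default page cache is small (about 2000 pages); raising it
# lets far more of the index b-tree stay in RAM while it is built,
# which is what turned hours of iowait into a 12-minute index build.
con.execute("pragma cache_size = 200000")

con.execute("create index counts_2 on counts(name)")
print(con.execute("pragma cache_size").fetchone()[0])
```

The pragma only lasts for the current connection, so it has to be issued each time before the expensive operation.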
[sqlite] How long time to index this table
I have a database table with 68 million rows and 4 columns, of
which 3 are integers and one is a short text string. I'm now
trying to create an index on one of the integer columns that I
forgot to create before I populated the table. But it takes
forever and doesn't seem to stop.

I estimate that each row takes 20 bytes, so the 68 million rows
should be a little over a gigabyte. Together with some other
data, the database file is 3.1 gigabytes. You can download a
compressed version (638 MB using bzip2) from
http://mirabell.runeberg.lysator.liu.se/sv-counts-20080219.db.bz2
The schema (767 bytes) is available at
http://mirabell.runeberg.lysator.liu.se/sv-counts-20080219.schema

The operation I'm trying is

  create index counts_2 on counts(name);

I'm using SQLite 3.4.2, which is the standard package in Ubuntu
Linux 7.10. I'm running this on an otherwise idle Intel Core 2
Duo CPU with 4 GB of RAM. The operation takes several hours and
is still running. Performance graphs indicate lots of iowait,
with many thousands of blocks being written to disk each second,
but very little CPU power is being used.

Is that reasonable? Do I need to use PRAGMA cache_size? Is there
a good way to find out what sqlite3 is doing? I know the
"explain" command lists the byte code, but can it predict how
long an operation will take?

-- 
Lars Aronsson ([EMAIL PROTECTED])
Aronsson Datateknik - http://aronsson.se
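On the "explain" question above: EXPLAIN can be run on any statement, including CREATE INDEX, and returns the VDBE byte-code program one opcode per row. It shows what SQLite will do, but gives no duration estimate. A minimal sketch, using a hypothetical stand-in schema for the counts table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical stand-in for the 68-million-row counts table.
con.execute(
    "create table counts (rev integer, page integer, name text, total integer)"
)

# EXPLAIN prefixed to a statement returns the byte-code program
# instead of executing it; each row is one VDBE instruction.
cur = con.execute("explain create index counts_2 on counts(name)")
plan = cur.fetchall()
for addr, opcode, *rest in plan[:5]:
    print(addr, opcode)
```

The opcode listing is useful for seeing which tables and b-trees a statement touches, but predicting wall-clock time still comes down to I/O behavior and cache sizing, not the byte code itself.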
[sqlite] Innovative examples / user stories
Is there any documentation of how people use SQLite in odd ways
in their everyday activities? For example, do you e-mail SQLite
DB files between you, as if they were Excel spreadsheets? Or do
you distribute SQLite database files via BitTorrent? Even with
multi-table databases? That would be a kind of database use that
was unheard of in the Oracle/DB2 era, but it certainly should be
possible with SQLite.

SQLite database files could be used in a "SETI@home" kind of
application, where a screensaver downloads an existing DB file,
processes it for some hours, and then uploads the resulting DB
file again. But are any such applications or user stories known?
Is there a list already?

I've read the Wikipedia article and its list of Google Gears,
Android, Mac OS X Tiger, etc. I've also seen the "well-known
users of SQLite" page at sqlite.org/famous.html and the "When to
use SQLite" page.

I've used Oracle since version 7 and MySQL since 3.23 (which
isn't very long at all, but anyway). The switch from Oracle to
MySQL was made possible for many people because their database
was embedded behind a web application anyway (look, no DBA!), so
full transaction handling wasn't really needed. That shift in
usage pattern opened the door to a simpler and more affordable
solution. Many other such technology shifts are described in an
old book titled "The Innovator's Dilemma".

-- 
Lars Aronsson ([EMAIL PROTECTED])
Aronsson Datateknik - http://aronsson.se