Re: [sqlite] sqlite-users Digest, Vol 80, Issue 19
On 08/19/2014 11:00 AM, sqlite-users-requ...@sqlite.org wrote:
>   10. Re: Long time to drop tables. (Jonathan Moules)
>   12. Re: Long time to drop tables. (Simon Slavin)

Long time for me too! This was run on an 11.7GB database where the dropped
table is about 50% of the total size. CPU is an AMD AM3+ X4 at 4.2GHz.

sqlite> select count(*) from fvall;
count(*)
--------
6524250
CPU Time: user 0.248015 sys 13.084817
sqlite> pragma auto_vacuum;
auto_vacuum
-----------
0
CPU Time: user 0.00 sys 0.00
sqlite> drop table fvall;

While running...

top - 20:51:16 up 11 days, 10:58,  2 users,  load average: 1.97, 1.10, 0.62
Tasks: 178 total,   2 running, 175 sleeping,   0 stopped,   1 zombie
Cpu0  :  0.7%us,  5.3%sy,  0.0%ni, 90.1%id,  3.8%wa,  0.0%hi,  0.1%si,  0.0%st
Cpu1  :  0.8%us,  2.1%sy,  0.0%ni, 93.8%id,  3.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.8%us,  3.2%sy,  0.0%ni, 90.9%id,  4.9%wa,  0.0%hi,  0.2%si,  0.0%st
Cpu3  :  2.2%us,  8.7%sy,  0.0%ni, 50.8%id, 36.7%wa,  0.0%hi,  1.6%si,  0.0%st
Mem:   4031760k total,  3911436k used,   120324k free,   160704k buffers
Swap:  3664892k total,    57932k used,  3606960k free,  2657068k cached

  PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
10400 gelmjw  20   0 41064 6580 1876 D   22  0.2  0:34.91 sqlite3
10430 root    20   0     0    0    0 S    1  0.0  0:00.26 kworker/0:0

Sure looks I/O bound. Why so much I/O?

CPU Time: user 13.744859 sys 79.016938

What is the above in milliseconds of real time?

voyager finviz # hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 582 MB in 3.01 seconds = 193.48 MB/sec
voyager finviz # hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 584 MB in 3.00 seconds = 194.49 MB/sec
voyager finviz # hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 582 MB in 3.01 seconds = 193.53 MB/sec
voyager finviz #

Is my disk slow?

John Gelm
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Importing ~1000 CSV files faster
Joe Fisher wrote on Tuesday, August 19, 2014 5:11 PM:

> I use the temp table because every CSV file has a header with the
> column names.

Can you just import the files, header row and all, into your destination
table, and delete the 2000 header rows at the end?
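A minimal sketch of this approach in Python (the `assay_raw` table, column names, and file contents are hypothetical stand-ins, not Joe's actual schema): import every row, headers included, then remove all the header rows in a single DELETE at the end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assay_raw (an TEXT, value TEXT)")

# Two hypothetical CSV files, each already parsed, each with a header row.
files = {
    "plate1.csv": [["an", "value"], ["1", "foo"], ["2", "bar"]],
    "plate2.csv": [["an", "value"], ["3", "baz"]],
}

# Import everything, header rows included.
for rows in files.values():
    conn.executemany("INSERT INTO assay_raw VALUES (?, ?)", rows)

# One pass at the end removes all the imported header rows at once.
conn.execute("DELETE FROM assay_raw WHERE an = 'an'")
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM assay_raw").fetchone()[0])  # 3
```

One DELETE over the whole table is far cheaper than one temp-table create/drop per file.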
Re: [sqlite] Importing ~1000 CSV files faster
On Aug 19, 2014, at 11:11 PM, joe.fis...@tanguaylab.com wrote:

> Is there something better I can do to improve this process?

PRAGMA journal_mode = off;

http://www.sqlite.org/pragma.html#pragma_journal_mode

> Perhaps one transaction? Perhaps turn something off? It took about 1.5 hours
> to run. I use the temp table because every CSV file has a header with the
> column names. I have to drop the table each time because of the header issue.

Preprocess your files by dropping the first line beforehand, e.g.:

sed -i '1d' file.csv

That will save half of your time already.
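The same idea can also be done in-process instead of with sed — a sketch under an assumed `assay_raw` schema and made-up file contents: skip the header row while reading, so no temp table and no DELETE are needed at all.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assay_raw (an TEXT, value TEXT, filename TEXT)")

# Stand-in for one of the real CSV files (contents are hypothetical).
filename = "TP962-A1-P1_TP962-A1-P2_01.CSV"
data = "an,value\n1,foo\n2,bar\n"

reader = csv.reader(io.StringIO(data))
next(reader)  # drop the header row instead of importing it

# Tag each row with its source file, as the original script does.
conn.executemany(
    "INSERT INTO assay_raw VALUES (?, ?, ?)",
    ((an, value, filename) for an, value in reader),
)
conn.commit()
```

For real files, `io.StringIO(data)` would be replaced by `open(filename, newline="")`.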
Re: [sqlite] Importing ~1000 CSV files faster
On 19 Aug 2014, at 10:13pm, John Drescher wrote:

>> I'm running the following script on more than 1000 2MB CSV files and I'd
>> like to speed it up if possible. I noticed that a 'WAL' is running. Is
>> there something better I can do to improve this process?
>
> Use an SSD.

Agree. There's nothing wrong with your script: it's well designed to do what
you need with the maximum of elegance. You might get a faster result by
writing your own C program to read the CSV file and do the minimum of SQLite
commands, but probably not much.

1.5 hours for 2000 CSV files is 2.7 seconds per file. About 0.2 seconds of
that is file handling. The rest indicates that SQLite is running extremely
quickly and efficiently for you. If you want anything faster, use faster
hardware.

Simon.
Re: [sqlite] Importing ~1000 CSV files faster
Directly after opening your db:

PRAGMA synchronous = OFF;
PRAGMA journal_mode = WAL;

It should fly then, at the cost of risking db corruption in case of a crash.

Ben(jamin Stadin)

On 19.08.14 23:11, "joe.fis...@tanguaylab.com" wrote:

>I'm running the following script on more than 1000 2MB CSV files and I'd
>like to speed it up if possible. I noticed that a 'WAL' is running. Is
>there something better I can do to improve this process? Perhaps one
>transaction? Perhaps turn something off? It took about 1.5 hours to run.
>I use the temp table because every CSV file has a header with the
>column names. I have to drop the table each time because of the header
>issue. I'm using the latest version of SQLite on a fast notebook.
>
>.import 'TP962-A1-P1_TP962-A1-P2_01.CSV' temp_table
>delete from temp_table where an = 1;  -- using .read del_rec.sql
>insert into external_lpr_assay_raw
>select *, 'TP962-A1-P1_TP962-A1-P2_01.CSV' as filename from temp_table;
>drop table if exists temp_table;  -- using .read drop_table.sql
>
>Joe Fisher
>Oregon State University
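A sketch of how those pragmas would be issued from code (the file path and connection handling here are illustrative): `synchronous = OFF` is what removes the fsync() calls, and `PRAGMA journal_mode` echoes back the mode actually in effect, so it can be checked.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "import.db")
conn = sqlite3.connect(path)

# Issue the pragmas right after opening, before any bulk work.
# synchronous = OFF skips fsync()s: fast, but a crash can corrupt the db.
conn.execute("PRAGMA synchronous = OFF")

# journal_mode returns the mode now in effect ('wal' on a file-backed db).
mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
print(mode)  # wal
```

Note that WAL is persistent: once set, the database stays in WAL mode for later connections too.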
Re: [sqlite] Importing ~1000 CSV files faster
> I'm running the following script on more than 1000 2MB CSV files and I'd
> like to speed it up if possible. I noticed that a 'WAL' is running. Is there
> something better I can do to improve this process?

Use an SSD.

John
[sqlite] Importing ~1000 CSV files faster
I'm running the following script on more than 1000 2MB CSV files and I'd
like to speed it up if possible. I noticed that a 'WAL' is running. Is there
something better I can do to improve this process? Perhaps one transaction?
Perhaps turn something off? It took about 1.5 hours to run. I use the temp
table because every CSV file has a header with the column names. I have to
drop the table each time because of the header issue. I'm using the latest
version of SQLite on a fast notebook.

.import 'TP962-A1-P1_TP962-A1-P2_01.CSV' temp_table
delete from temp_table where an = 1;  -- using .read del_rec.sql
insert into external_lpr_assay_raw
select *, 'TP962-A1-P1_TP962-A1-P2_01.CSV' as filename from temp_table;
drop table if exists temp_table;  -- using .read drop_table.sql

Joe Fisher
Oregon State University
Re: [sqlite] SQlite on flash filesystem
On Tue, Aug 19, 2014 at 10:38 AM, Richard Hipp wrote:

> On Tue, Aug 19, 2014 at 10:34 AM, Levente Kovacs wrote:
>
>> I started to worry about this issue, because I am in the middle of an
>> application development, and yesterday, it started to work, and I only
>> SELECT a few times, and it makes a noticeable disk access. I'm still on
>> magnetic HDD, but the application will be running on SSD or Flash drive.
>>
>> Let me start again by a stupid question. If I do a SELECT... does this
>> perform any disk write operation?
>
> No. SQLite doesn't. But your filesystem might decide to update the mtime
> on an inode.

Correction: If you do a sufficiently complex SELECT, SQLite might need to
manifest a view or do a large sort that won't fit in RAM. In those cases it
will write to disk. You can disable that using PRAGMA temp_store = MEMORY.

http://www.sqlite.org/pragma.html#pragma_temp_store

Additional information: http://www.sqlite.org/tempfiles.html

>> I know that sqlite can store the database in RAM. Is there any way to
>> (periodically) write the database to a regular sqlite file?
>
> http://www.sqlite.org/backup.html

>> Does sqlite call 'sync()' after an UPDATE or INSERT?
>
> http://www.sqlite.org/atomiccommit.html

-- 
D. Richard Hipp
d...@sqlite.org
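For illustration, a minimal sketch of setting `temp_store` (the table here is a toy stand-in; on a real workload the ORDER BY is the kind of operation that could otherwise spill into a temp file on disk):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file-backed connection works the same

# temp_store = MEMORY keeps materialized views and large sorts out of
# on-disk temp files (PRAGMA temp_store then reports 2).
conn.execute("PRAGMA temp_store = MEMORY")

conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(1000)))

# A sort like this is what could otherwise write to a temp file.
top = conn.execute("SELECT x FROM t ORDER BY x DESC LIMIT 1").fetchone()[0]
print(top)  # 999
```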
Re: [sqlite] SQlite on flash filesystem
On Tue, Aug 19, 2014 at 10:34 AM, Levente Kovacs wrote:

> I started to worry about this issue, because I am in the middle of an
> application development, and yesterday, it started to work, and I only
> SELECT a few times, and it makes a noticeable disk access. I'm still on
> magnetic HDD, but the application will be running on SSD or Flash drive.
>
> Let me start again by a stupid question. If I do a SELECT... does this
> perform any disk write operation?

No. SQLite doesn't. But your filesystem might decide to update the mtime
on an inode.

> I know that sqlite can store the database in RAM. Is there any way to
> (periodically) write the database to a regular sqlite file?

http://www.sqlite.org/backup.html

> Does sqlite call 'sync()' after an UPDATE or INSERT?

http://www.sqlite.org/atomiccommit.html

-- 
D. Richard Hipp
d...@sqlite.org
Re: [sqlite] SQlite on flash filesystem
I started to worry about this issue, because I am in the middle of an
application development, and yesterday, it started to work, and I only
SELECT a few times, and it makes a noticeable disk access. I'm still on
magnetic HDD, but the application will be running on SSD or Flash drive.

Let me start again by a stupid question. If I do a SELECT... does this
perform any disk write operation?

I know that sqlite can store the database in RAM. Is there any way to
(periodically) write the database to a regular sqlite file?

Does sqlite call 'sync()' after an UPDATE or INSERT?

Thanks,
Levente

On Sat, 16 Aug 2014 22:22:56 +0100 Simon Slavin wrote:

> A good question. By which I mean it's one people should worry about,
> not one which we're not sure about.
>
> No. Physical writes to Flash drives are optimized for best use of
> hardware (minimum writes, promoting maximum lifetime for the device
> and fastest execution of programs). Worse still, there is no
> guarantee that any writes are done to the device at all until you
> have issued 'unmount' to it (shown to the user as 'Eject' or 'Make
> safe for removal' on popular operating systems). Until then,
> as far as you're concerned, all the changes may be made only to a
> cache held by the device driver.
>
> So if you pull your Flash drive without telling the OS to unmount it
> you can get all sorts of problems, including no files at all if they
> were created during that session. It's rare to see this actually
> happen (few systems create a cache that reflects the whole drive then
> never bother to flush it) but the specs of the hardware involved do
> allow it.

-- 
Levente Kovacs
CTO, CSO
http://levente.logonex.eu
Re: [sqlite] Long time to drop tables.
On 19 Aug 2014, at 11:41am, Jonathan Moules wrote:

> There are some foreign keys.

Put all your DROP TABLE commands into one big transaction. Make sure when
you DROP your TABLEs that you are dropping them in an order which won't
trigger any foreign key checks. Normally this just means you have to drop
the offspring tables first. Alternatively do

PRAGMA foreign_keys = OFF;
BEGIN TRANSACTION;
DROP TABLE a;
DROP TABLE b;
...
END TRANSACTION;
PRAGMA foreign_keys = ON;

Simon.
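A sketch of that recipe driven from Python (toy parent/child schema assumed): the sqlite3 module is put in autocommit mode so the BEGIN/COMMIT pair is issued explicitly, and `PRAGMA foreign_keys` is set before BEGIN because that pragma is a no-op inside a transaction.

```python
import sqlite3

# Autocommit mode, so we control the transaction boundaries ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (pid INTEGER REFERENCES parent(id));
""")

# Must be issued outside any transaction to take effect.
conn.execute("PRAGMA foreign_keys = OFF")

conn.execute("BEGIN")
# With enforcement off, drop order no longer matters; the whole batch
# commits with a single journal sync instead of one per DROP.
conn.execute("DROP TABLE parent")
conn.execute("DROP TABLE child")
conn.execute("COMMIT")

conn.execute("PRAGMA foreign_keys = ON")

names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(names)  # []
```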
Re: [sqlite] Improve query performance using a join
On 19 Aug 2014, at 8:25am, Paul Dillon wrote:

> I was using count(first_field) instead of selecting all the fields, can't
> imagine that could be the problem.

There's an optimization in SQLite which means you can do COUNT(*) and it
will fetch no data at all. It's faster than doing any COUNT(specific_field).

Simon.
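A toy illustration of the difference — note the two forms also differ in meaning, since COUNT(column) skips NULLs while COUNT(*) counts every row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (first_field INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (3,)])

# COUNT(*) can be answered without reading any column data...
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# ...while COUNT(column) must inspect each value to skip NULLs.
non_null = conn.execute("SELECT COUNT(first_field) FROM t").fetchone()[0]

print(total, non_null)  # 3 2
```

So the two are only interchangeable when the counted column is NOT NULL.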
Re: [sqlite] Long time to drop tables.
Hi,
Thanks for the replies.

The entire thing is happening on a single desktop machine. The database is
on a standard HDD. Using SQLiteStudio; sqlite version 3.7.16.1. There are
some foreign keys. Autovacuum is off I think (don't know how to check, but
the table size never automatically shrinks of its own accord).

You can download the database here (compressed down to ~30mb):
https://drive.google.com/file/d/0B22cYd3gwE-6MTdwUzczUlptdk0/edit?usp=sharing

I was actually trying to delete all of these tables (this version of the
database doesn't have the extra tables I created; the extra ones had no
foreign keys at all).

Cheers,
Jonathan

On 16 August 2014 20:04, Richard Hipp wrote:

> On Sat, Aug 16, 2014 at 2:41 PM, Jonathan Moules <
> jonathanmou...@warwickshire.gov.uk> wrote:
>
> > Hi List,
> > More of a curiosity.
> > I'm doing some general data munging and set off a query that consists
> > entirely of 37 DROP TABLEs in it. The database it's running against is
> > a bit less than 1GB made of about 5 million rows, and the tables being
> > dropped constitute about 99% of the content.
> >
> > My question is - why does it take so long? The total time required
> > to create this dataset (most of which was processing on the Python
> > side) was about 11 minutes.
> >
> > The total time required to perform these drops is ... well I
> > cancelled it at 20mins - it had deleted 20 of the 37. For that entire
> > period SQLite has been reading at a rate of 170MB/s - by my maths it
> > had read about 200GB!
> >
> > The tables don't have indexes, the settings are all whatever the
> > defaults are.
> >
> > Any suggestions what's going on? Is this normal behavior?
>
> I made a copy of a 2.3GB database that contained two tables that comprised
> 90% of the database space. I did a DROP TABLE on each. Each DROP TABLE
> took about 100 milliseconds, real-time.
>
> DROP TABLE bmdTileTable;
> Run Time: real 0.109 user 0.052133 sys 0.043098
> DROP TABLE bmdTileTable_with_out;
> Run Time: real 0.102 user 0.052688 sys 0.029648
>
> Maybe you have autovacuum turned on in your database? Autovacuum makes
> doing things like DROP TABLE much slower because it has to rearrange
> content in order to move it all to the front of the file then truncate
> the file.
>
> --
> D. Richard Hipp
> d...@sqlite.org
Re: [sqlite] Problem with Update Statement
Wycliff Mabwai wrote:

> syntax near "Where".
>
> SQLITEcmd2.CommandText = "UPDATE RecordGrid SET
> LineNumber='" & reade20.GetInt32(11) & "',self_empty_info_gender_PRect= '"
> & IIf(reade20.IsDBNull(22), String.Empty, reade20.GetString(22)) & "," &
> IIf(reade20.IsDBNull(23), String.Empty, reade20.GetString(23)) & "," &
> IIf(reade20.IsDBNull(24), String.Empty, reade20.GetString(24)) & "," &
> IIf(reade20.IsDBNull(25), String.Empty, reade20.GetString(25)) & "' WHERE
> RecordGridID='"chombo"'"

This is unreadable. Show the value of CommandText.

Regards,
Clemens
Re: [sqlite] Improve query performance using a join
Jim Callahan <jim.callahan.orlando at gmail.com> wrote:

> 1. My condolences with those dimensions you are heading for "big
> data"/hadoop land.

Heheh thanks, I am so determined not to go there! SQLite has been such a
nice simple database to use, I will do anything to avoid all the complexity
of those map-reduce solutions.

> 2. Worry about the number of rows; that's what feeds into the big-oh: O(n).
> Assuming your 150 columns translate into a 1.5k to 2k record length, that
> means your 300 gigabyte file must have 150 (300 gig / 2k) to 200
> (300 gig / 1.5k) million records. That's a lot of n for O(n).

I've got about 200 million rows of 1,100 bytes each. At least it's linear.
I would love to partition the rows into 4 databases and query them in
parallel. But that's just a fantasy, I can live with it for now.

> So, I would recommend double checking the specs of your SSD and not
> necessarily making the reasonable, but not yet true, assumption that
> solid-state has to be faster than mechanical.

I use an SSD for other reasons. However, I've tested sustained sequential
read and I am getting 550MB/s. I'm doing full table scans so it's
sequential. This is on a laptop, and I am fairly sure there are no
mechanical 2.5" HDDs that can transfer 200GB at 550MB/s.

> One strategy that might work is to have an entirely separate (not joined)
> 8-column table; develop queries (targets) on that database and then write
> out the primary key of the rows you are interested in to a separate table
> (CREATE TABLE AS SELECT primarykey FROM 8columntable WHERE yourquery;) and
> then JOIN the row-reduced table to the main table. If your row-reduced
> table has millions of fewer records (primary keys), that's millions of
> rows (in the main 150-column table) where the precompiled SQL query
> doesn't have to be executed.

Cheers for that, I'll do some testing and see how I go!

Thanks,
Paul
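A small sketch of Jim's row-reduction strategy (table names and the predicate are made up; `narrow` stands in for the 8-column table, `wide` for the 150-column one): the filter runs against the narrow table only, and the resulting key set is then joined back to the wide table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "wide" stands in for the 150-column table, "narrow" for the 8 query columns.
conn.execute("CREATE TABLE wide (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE narrow (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany("INSERT INTO wide VALUES (?, ?)",
                 [(i, "row%d" % i) for i in range(100)])
conn.executemany("INSERT INTO narrow VALUES (?, ?)",
                 [(i, i % 10) for i in range(100)])

# Step 1: evaluate the predicate against the narrow table only.
conn.execute("CREATE TABLE hits AS SELECT id FROM narrow WHERE score = 7")

# Step 2: join the reduced key set back to the wide table.
rows = conn.execute(
    "SELECT wide.payload FROM hits JOIN wide ON wide.id = hits.id").fetchall()
print(len(rows))  # 10
```

The payoff is that the wide rows are only touched for keys that survived the filter.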
Re: [sqlite] Improve query performance using a join
Thanks to all that replied to my post and sorry for the delayed response. I
had trouble joining the list and had not realised that my post went through
until I tried to join the list again.

Simon Slavin <slavins at bigfraud.org> wrote:

> What you might find increases your speed is to make sure that those 8
> columns are the first 8 columns listed in your table definition (after
> the primary key, if any). Each time you have SQLite read the columns for
> a row it has to read the whole row up to the last column it needs. So if
> all the unwanted

Thanks, intriguing idea. Unfortunately it made no difference for my
situation. I made sure the table was too large to be cached, and I was
using count(first_field) instead of selecting all the fields, can't imagine
that could be the problem.

> Lastly, do not fall into the trap of premature optimization. You should
> not be looking for your program to run "as fast as possible". You should
> be looking for it to run "acceptably fast". You can spend 20 hours of
> programming to improve runtime by 1 minute -- a minute that your users
> wouldn't care about because they always go make a cup of coffee during a
> run anyway.

Yeah, I am wary of that. The queries take about 20 minutes, and that really
disrupts my workflow. If I spend a few hours on this and it works, it'll
save me hundreds of hours in the long run. I like to collect anecdotal
performance advice, and implement it when the cost of doing so is not much
more than the cost of not doing it. I have some evidence that the approach
has worked for me in the past.

Thanks,
Paul