Are you wrapping your data dump into a "BEGIN" "COMMIT"?
Also...you mention using a select...I assume you are testing first to see if you need to update or insert? You may want to look at the REPLACE clause and see if you can use that. Here's a little benchmark program I wrote to test a super-simple one-integer insert to test sql speed and commit interval behavior. Running it on my machine (Windows XP64 8-core 3Ghz gcc (tdm64-1) 4.5.1) I go from 320M inserts per second to 361K inserts per second when no begin/commit occurs. With WAL mode turned on it only drops to 5.9M inserts per second. D:\SQLite>batch 5000 5000 320000000.0 inserts per sec D:\SQLite>batch 5000 2500 320000000.0 inserts per sec D:\SQLite>batch 5000 1200 160000000.0 inserts per sec D:\SQLite>batch 5000 600 160000000.0 inserts per sec D:\SQLite>batch 5000 300 106666666.7 inserts per sec D:\SQLite>batch 5000 150 53333333.3 inserts per sec D:\SQLite>batch 5000 75 32000000.0 inserts per sec D:\SQLite>batch 5000 40 17777777.8 inserts per sec D:\SQLite>batch 5000 20 10000000.0 inserts per sec D:\SQLite>batch 5000 10 3333333.3 inserts per sec D:\SQLite>batch 5000 5 2406015.0 inserts per sec D:\SQLite>batch 5000 2 526315.8 inserts per sec D:\SQLite>batch 5000 1 360766.6 inserts per sec #include <stdio.h> #include <windows.h> #include <winbase.h> #include <time.h> #include "sqlite3.h" double elapsed() { double t1; struct timeval tv; static long base_seconds; gettimeofday(&tv,NULL); if (base_seconds==0) base_seconds=tv.tv_sec; return (tv.tv_sec-base_seconds)+tv.tv_usec/(double)1000000; } int main(int argc, char *argv[]) { sqlite3 *db; sqlite3_stmt *stmt=NULL; int rc; int n=0; int nrec=0; int interval=0; double t1; if (argc != 3) { fprintf(stderr,"Usage: %s nrecords commit_interval\n",argv[0]); exit(-1); } nrec=atoi(argv[1]); interval=atoi(argv[2]); remove("batch.db"); rc=sqlite3_open("batch.db",&db); if (rc != SQLITE_OK) { puts(sqlite3_errmsg(db)); } sqlite3_exec(db,"create table t (i integer)",NULL,NULL,NULL) // turn on WAL mode if you want to test it //sqlite3_exec(db,"pragma journal_mode=WAL",NULL,NULL,NULL); sqlite3_prepare_v2(db,"insert into t values(?)",-1,&stmt,NULL); t1=elapsed(); if (interval != 1) rc=sqlite3_exec(db,"begin",NULL,NULL,NULL); while(n < nrec) { ++n; if (interval != 1 && (n% interval)==0) { sqlite3_exec(db,"commit",NULL,NULL,NULL); sqlite3_exec(db,"begin",NULL,NULL,NULL); } sqlite3_bind_int(stmt,n,1); rc = sqlite3_step(stmt); if (rc != SQLITE_DONE) { puts(sqlite3_errmsg(db)); } } sqlite3_exec(db,"commit",NULL,NULL,NULL); printf("%.1f inserts per sec\n",nrec/((elapsed()-t1)/CLOCKS_PER_SEC)); sqlite3_exec(db,"commit",NULL,NULL,NULL); sqlite3_close(db); } Michael D. Black Senior Scientist NG Information Systems Advanced Analytics Directorate ________________________________________ From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Thomas Fjellstrom [tfjellst...@strangesoft.net] Sent: Friday, February 11, 2011 9:50 PM To: Teg Cc: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected On February 11, 2011, Teg wrote: > Hello Thomas, > > I download off the net at between 400Mbps to 1 Gbps and have to pack > the data away into a Sqlite DB. I use an intermediate buffer on disk > and a background thread to write the data to the DB. Obviously, I > couldn't keep up if the data was continuous but, the lulls in between > the peak data rate times give me time to pack it away. Even if I get > an extended burst of data, it'll eventually make it in. If I was you, > I'd start by having something service the interface and buffer the > packet data and something else pulling the packet data from the buffer > and feeding it to Sqlite. Then you won't be dropping packets. The program is split into two separate threads, one is dedicated to just capturing packets off the network, and storing them in a circular buffer in memory. The second thread is dedicated to parsing the packets, calculating stats, and occasionally flushing data to storage. What normally happens, is as hosts haven't been seen for a while, they get saved to the db, and are removed from the in memory hash table. But that doesn't happen for local hosts, the fake host used to capture traffic totals, or any external hosts that keep showing up. And temporarily I've added some code to auto save all in memory hosts to the database, as I'm seeing some instability in the interface with sqlite, causing the process to loose information once in a while. I've seen numbers on the internet stating that sqlite is super fast, should be able to handle tens of thousands of inserts or updates in a second or two, even in sync mode. So I'm a bit stumped as to why its performance is so low in my case. One time, I saw it take 5 seconds to flush almost 3k rows back to the db. No hosts are actually ever removed from the database, which has about 120k hosts in it by now, totaling up to a file size of around 7-8MB total. So its not a lot of data, and the queries are pretty simple. So I really am stumped. > > T > > Friday, February 11, 2011, 6:49:16 PM, you wrote: > > TF> Hi, I have a small problem with a program I've developed. > > TF> It basically captures packets from a network device on linux and stores > TF> transfer stats on each host seen. To store the stats long term I > decided to TF> use sqlite, and when the program saves the stats every few > minutes, it takes TF> about 4-5 seconds, and if I have the synchronous > pragma turned off, it takes TF> 1-2 seconds. > > TF> These are the relevant sql commands: > > TF> CREATE TABLE hosts (id INTEGER PRIMARY KEY,\ > TF> address INTEGER UNIQUE, \ > TF> first_seen INTEGER DEFAULT CURRENT_DATETIME, \ > TF> last_on INTEGER DEFAULT CURRENT_DATETIME, \ > TF> last_off INTEGER, \ > TF> rx_bytes INTEGER, \ > TF> tx_bytes INTEGER); > > TF> UPDATE hosts SET last_on = ?, last_off = ?, rx_bytes = ?, tx_bytes = ? > WHERE id TF> = ? > TF> INSERT INTO hosts ( address, first_seen, last_on, rx_bytes, tx_bytes ) > VALUES ( TF> ?, ?, ?, ?, ? ) > > TF> The stats saving code will interleave a bunch of UPDATEs and INSERTs, > inside a TF> single transaction, maybe that has something to do with it, > I'm not sure. > > TF> 1s to update them all isn't too bad, but it still can mean I'm > potentially TF> dropping packets, which I'd really rather not do. > > TF> Thanks. -- Thomas Fjellstrom tfjellst...@strangesoft.net _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users