Yes. I agree. What about second approach.
create table if not exists emp (id integer primary key autoincrement, col1 text, col2 text); //without unique. I tried with insert or ignore into emp (col1, col2) values ('a', 'b'); I noticed, this is fast. just to know, what might be the reason? On Tue, Jan 10, 2012 at 2:36 PM, Simon Slavin <slav...@bigfraud.org> wrote: > > On 10 Jan 2012, at 8:37am, Durga D wrote: > > > I noticed, with first approach, huge performance hit when database grows. > > for ex: database has 500,000 records. now insert 25,000 within a > > transaction. It takes lot of time when compared to insert 25,000 records > > with empty database. > > > > approximately 15 times slower. because of unique(col1, col2). > > This makes some sense. Every INSERT command involves SQLite looking > through the records which are already in the table. So a new insert when > there are only 100 rows in the table is going to be pretty quick, but an > insert when there are 500,000 rows in the table will involve lots of work. > > SQLite has to look up the data of the new record in the index it makes of > (col1, col2) so it can find out whether the new row is a duplicate of an > existing row. So it has to do 25,000 searches of a balanced tree. And if > the tree has 500,000 records in it that means it's doing around ... hmm, > it's over 3/4 of 1020, call it 765 ... tests for each search of the tree, > 765 test for each INSERT command. > > So inserting 25,000 new rows when you already have 500,000 rows involves > more than 25,000 * 765 string comparisons. Call it 19 million string > comparisons. Plus, of course, all the other work involved in messing with > the database. Since you know how long your strings are you can get an idea > of how much data SQLite is having to handle to do this. > > > I did with transactions for every 25,000 records. > > That is a reasonable way to do it. People may be able to recommend > PRAGMAs which will speed this up a bit but what you're doing really does > involve a lot of work for the computer to do. > > Simon. > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users