Yes. I agree.

What about second approach.

create table if not exists emp (id integer primary key autoincrement, col1
text, col2 text); //without unique.

I tried with insert or ignore into emp (col1, col2) values ('a', 'b');

I noticed, this is fast.

just to know, what might be the reason?


On Tue, Jan 10, 2012 at 2:36 PM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 10 Jan 2012, at 8:37am, Durga D wrote:
>
> > I noticed, with first approach, huge performance hit when database grows.
> > for ex: database has 500,000 records. now insert 25,000 within a
> > transaction. It takes lot of time when compared to insert 25,000 records
> > with empty database.
> >
> > approximately 15 times slower. because of unique(col1, col2).
>
> This makes some sense.  Every INSERT command involves SQLite looking
> through the records which are already in the table.  So a new insert when
> there are only 100 rows in the table is going to be pretty quick, but an
> insert when there are 500,000 rows in the table will involve lots of work.
>
> SQLite has to look up the data of the new record in the index it makes of
> (col1, col2) so it can find out whether the new row is a duplicate of an
> existing row.  So it has to do 25,000 searches of a balanced tree.  And if
> the tree has 500,000 records in it that means it's doing around ... hmm,
> it's over 3/4 of 1020, call it 765 ... tests for each search of the tree,
> 765 test for each INSERT command.
>
> So inserting 25,000 new rows when you already have 500,000 rows involves
> more than 25,000 * 765 string comparisons.  Call it 19 million string
> comparisons.  Plus, of course, all the other work involved in messing with
> the database.  Since you know how long your strings are you can get an idea
> of how much data SQLite is having to handle to do this.
>
> > I did with transactions for every 25,000 records.
>
> That is a reasonable way to do it.  People may be able to recommend
> PRAGMAs which will speed this up a bit but what you're doing really does
> involve a lot of work for the computer to do.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to