Follow-up on this issue.

This morning I used a JetBrains product called dotTrace to analyze my
application's resource usage.  I loaded 7 million rows into a table.  When
I simply loaded the table, it took 12 minutes.  Pretty impressive.  When I
added 'Unique' to one of the field definitions, the load time jumped to 90
minutes.

Naturally, almost all of the CPU consumption was caused by the various stream
and data readers in the application.  It's also easy to understand that
applying unique is going to cause additional hits against the database
(ensuring the unique field value does not already exist).  The attached
image is the dotTrace output for this run.
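
To make the extra work concrete, here is a minimal Python sketch (not my
application's code; the table and column names are made up for
illustration).  It shows that SQLite backs a UNIQUE column with an
automatic index, which has to be searched and updated on every insert:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: identical except for the UNIQUE constraint.
conn.execute("CREATE TABLE plain (id INTEGER, code TEXT)")
conn.execute("CREATE TABLE checked (id INTEGER, code TEXT UNIQUE)")


def index_names(conn, table):
    """Return the names of all indexes SQLite maintains for a table."""
    rows = conn.execute("PRAGMA index_list(%s)" % table).fetchall()
    return [row[1] for row in rows]  # column 1 of index_list is the name


plain_indexes = index_names(conn, "plain")
checked_indexes = index_names(conn, "checked")
```

The table without the constraint has no index at all, while the one with
UNIQUE carries an index that SQLite created on its own.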

What I did next was employ an internal hash in my application.  I attempt
to insert the unique value into the hash and catch the exception if it
exists.  The run time is now 14 minutes for the 10 million records.


On Thu, Feb 9, 2012 at 8:46 AM, Black, Michael (IS)
<michael.bla...@ngc.com> wrote:

> I think you may find you're running into buffer cache limits (not sqlite
> but OS limits).
>
> So the 1st third all fits into buffer cache.  Once it starts committing to
> disk things slow down a LOT.
>
> Since you're not showing any real times it's hard to say you are any slower
> than anybody else.
>
> I saw similar behavior on a project I was doing and it all boiled down to
> disk write speed once things started going to disk.
>
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
>
> ________________________________
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org]
> on behalf of Don V Nielsen [donvniel...@gmail.com]
> Sent: Thursday, February 09, 2012 8:14 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite] Inserts get slower and slower
>
> I've noticed a similar thing happening.  The first 1/3rd loads quickly; the
> remaining 2/3rds stagnate.  It appears that there is some kind of bottleneck
> happening.  I thought it was the SAN.
>
> My application begins a transaction, does all its inserts, and then
> commits.  There could be millions in the transaction.  Would it be better
> processing to commit in batches, say 250m or 500m?
>
> Now's the time for me to make these changes, as the application is being
> prep'd for production.
>
> dvn
>
> On Wed, Feb 8, 2012 at 4:29 PM, Simon Slavin <slav...@bigfraud.org> wrote:
>
> >
> > On 8 Feb 2012, at 10:22pm, Oliver Peters wrote:
> >
> > > It's the Primary Key that you're using, because for every INSERT it is
> > > checked if unix_time is already present in a record.
> > >
> > > So the question is if you really need unix_time as a PK
> >
> > If you're batching your INSERTs up into transactions, try doing a VACUUM
> > after each COMMIT.
> >
> > Simon.
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >