Jens Miltner wrote:
> Am 19.11.2008 um 13:05 schrieb D. Richard Hipp:
> 
>> On Nov 19, 2008, at 3:08 AM, Jens Miltner wrote:
>>
>>> Hi,
>>>
>>> we're seeing terrible performance problems when fetching data from  
>>> one
>>> of our tables:
>>> The table contains roughly 1.2 Million rows and a plain "SELECT
>>> COUNT(*) FROM t" query takes 8 minutes to finish.
>> That's because "SELECT count(*) FROM t" has to read and count every
>> row in the table - all 1.2 million rows.  And this involves reading
>> all 1.2 million rows from disk.  If each row is 100 bytes in size,
>> that involves reading 120MB.
>>
>> An SQLite database does *not* maintain the number of rows in a table
>> as separate metadata.  Many client/server database engines do maintain
>> such metadata.  When I was designing the SQLite file format, I
>> deliberately choose to omit such metadata because including it would
>> slow down inserts and deletes.
> 
> Yes, I know about this and it's usually not a problem.
> It only turned into a problem for this specific table.
> 
> As I mentioned in my original e-mail, after vacuuming the database,  
> the time to run the COUNT(*) query went down to about 5 seconds, so it  
> looks like somehow the database became fragmented enough to seriously  
> hit the performance of the serial row access...
> 
> BTW: we don't actually run the COUNT(*) query, but we see major  
> performance hits after a while with this table, so I figured I'd run  
> the most simple query first and found that the time needed for  
> COUNT(*) goes through the ceiling (from something like 5 seconds to  
> almost 9 minutes -  roughly a 100x time increase) after having deleted  
> and re-inserted rows in that table for a while.
> 
> Any ideas why there would be such a _huge_ performance hit after  
> deleting and re-inserting records for a while?
> Anything we can do to avoid this kind of performance-penalty- 
> fragmentation (other than regularly vacuuming)?
> 
> Thanks,
> </jum>
>
Fragmentation is inherent when there are deletions and insertions, and 
involves orders of magnitude increases in access times.  Just look at 
the example of before and after when a badly fragmented disk is 
defragged to see the effect.

Many years ago we learned from practical experience to avoid such 
situations when designing applications.  High availability applications 
cannot tolerate an increase in chaos during operation.

> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to