Hi,

do you have an autoincrement primary key in that table?
If not, try adding one, giving SQLite the chance to
query an internal index rather than the table itself.
I don't see why SQLite should have to read all the data from that table.
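Something like this, maybe (completely untested sketch against the C API;
the table name "t" and the columns a, b, c are just placeholders for your
real schema - note that ALTER TABLE can't add a PRIMARY KEY column, so the
table has to be rebuilt):

#include <stdio.h>
#include <sqlite3.h>

/* Rebuild hypothetical table "t" (columns a, b, c) with an explicit
 * autoincrement primary key.  SQLite's ALTER TABLE cannot add a
 * PRIMARY KEY column, so copy the data into a new table and rename it. */
static int add_autoinc_key(sqlite3 *db)
{
    const char *sql =
        "BEGIN;"
        "CREATE TABLE t_new ("
        "  id INTEGER PRIMARY KEY AUTOINCREMENT,"
        "  a, b, c);"
        "INSERT INTO t_new (a, b, c) SELECT a, b, c FROM t;"
        "DROP TABLE t;"
        "ALTER TABLE t_new RENAME TO t;"
        "COMMIT;";
    char *err = 0;
    int rc = sqlite3_exec(db, sql, 0, 0, &err);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "rebuild failed: %s\n", err);
        sqlite3_free(err);
    }
    return rc;
}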

I've read somewhere that COUNT(*) may scan the whole table;
if you already have an (autoincrement) primary key, you may
try "SELECT COUNT(Primarykey) FROM t" instead.

Or you could try "SELECT ID FROM t", step through the results,
and count manually.
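
Roughly like this (again untested; assumes an already open sqlite3 *db
handle, the hypothetical table name "t" and an "id" column):

#include <stdio.h>
#include <sqlite3.h>

/* Count the rows of table "t" by stepping through a plain SELECT
 * instead of relying on COUNT(*).  "db" is an already-open handle,
 * "id" the primary key column. */
static long long count_rows(sqlite3 *db)
{
    sqlite3_stmt *stmt = 0;
    long long n = 0;
    int rc = sqlite3_prepare_v2(db, "SELECT id FROM t", -1, &stmt, 0);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "prepare failed: %s\n", sqlite3_errmsg(db));
        return -1;
    }
    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW) {
        n++;   /* one more row fetched */
    }
    sqlite3_finalize(stmt);
    return (rc == SQLITE_DONE) ? n : -1;
}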

I'm interested in the results.

Hope this helps.

Marcus

Jens Miltner wrote:
> Hi,
> 
> we're seeing terrible performance problems when fetching data from one  
> of our tables:
> The table contains roughly 1.2 Million rows and a plain "SELECT  
> COUNT(*) FROM t" query takes 8 minutes to finish.
> The table contains 10 fields, but the records average to about 100  
> Bytes of data total. The largest records are ~1k of data, but there  
> are only  a few hundred records with more than 256 Bytes of data.
> 
> Another table in the same database contains over 5 Million records  
> (and probably an average of a few hundred Bytes per record). Counting  
> rows in this table only takes about 40 seconds.
> 
> (All times taken from first run and they're reproducible on several  
> machines).
> 
> 
> When looking at a sample taken while sqlite is executing the COUNT(*)  
> query, almost all the time is spent inside read() calls.
> The average I/O rate is about 500 kB - 800 kB / second read  
> performance. When reading from other databases / other tables, I can  
> easily get 20-30 MB/s overall read performance.
> So, I can only conclude that for some reason, there seems to be some  
> terrible I/O behavior for the data in this specific table.
> 
> Running fs_usage shows that there are almost no consecutive pages  
> being read (i.e. the file offsets are not consecutive), so I suspect  
> that for some reason, this table is fragmented all across the 3 GB  
> database file, which may explain the slow overall read performance.
> 
> Now, there are two things that happened to this table, which didn't  
> happen to the other big table:
> 
> (1) we've updated the schema by appending a couple of columns
> (2) when data is updated, it's updated by first deleting a bunch of  
> rows and then re-inserting the new data. This happens once or twice a  
> day for almost all entries, thus the
> 
> Would either (1) or (2) lead to heavy fragmentation of this kind?
> 
> Vacuuming the database does resolve the performance issue, but  
> vacuuming that specific database took about 75 minutes (MacPro @  
> 3GHz), so it's not something we can do frequently (especially, since  
> the database is used in a server product that has 24/7 uptime) :(
> 
> Is there anything we could do to avoid the table fragmentation in the  
> first place (write queries in a different way, avoid deletions/re-insertions, etc.)?
> (Note that I'm not talking about reclaiming free space here - that's  
> not really an issue for us. We need to avoid the performance hit we've  
> seen which IMHO is caused by the pages for a single table being  
> scattered all over the database file, thus requiring the paging code  
> to jump a lot in the file).
> 
> 
> Thanks,
> -jens
> 
> 
> BTW: this is SQLite 3.6.3, but at least running the COUNT queries is  
> not any faster with 3.5.x. Tests run on Mac OS X.
> 
> 
> 

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
