Hi, we're seeing terrible performance problems when fetching data from one of our tables: the table contains roughly 1.2 million rows, and a plain "SELECT COUNT(*) FROM t" query takes 8 minutes to finish. The table has 10 fields, and the records average about 100 bytes of data each. The largest records hold ~1 KB, but only a few hundred records exceed 256 bytes.
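For anyone who wants to reproduce the comparison, a minimal timing sketch (table name `t` stands in for whichever table you're checking; a large gap between two tables of similar size is a quick hint that one of them is scattered across the file):

```python
import sqlite3
import time

def time_count(db_path, table):
    """Time a plain COUNT(*) against one table and return (row_count, seconds).

    COUNT(*) walks the whole table b-tree, so its wall-clock time is
    dominated by how the table's pages are laid out in the file.
    """
    conn = sqlite3.connect(db_path)
    try:
        start = time.perf_counter()
        (rows,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        return rows, time.perf_counter() - start
    finally:
        conn.close()
```

Run it once per table right after dropping the OS file cache so both measurements start cold, as in the numbers above.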
Another table in the same database contains over 5 million records (averaging probably a few hundred bytes per record), yet counting its rows takes only about 40 seconds. (All times are from first runs and are reproducible on several machines.)

A sample taken while SQLite executes the COUNT(*) query shows almost all the time being spent inside read() calls, at an average I/O rate of only 500-800 KB/s. When reading from other databases or other tables, I can easily get 20-30 MB/s overall read performance. So I can only conclude that, for some reason, there is some terrible I/O behavior for the data in this specific table. Running fs_usage shows that almost no consecutive pages are being read (i.e. the file offsets are not consecutive), so I suspect that this table is fragmented all across the 3 GB database file, which may explain the slow overall read performance.

Now, there are two things that happened to this table which didn't happen to the other big table: (1) we've updated the schema by appending a couple of columns; (2) when data is updated, it's updated by first deleting a bunch of rows and then re-inserting the new data. This happens once or twice a day for almost all entries. Would either (1) or (2) lead to heavy fragmentation of this kind?

Vacuuming the database does resolve the performance issue, but vacuuming that specific database took about 75 minutes (Mac Pro @ 3 GHz), so it's not something we can do frequently (especially since the database is used in a server product that has 24/7 uptime) :(

Is there anything we could do to avoid the table fragmentation in the first place (write queries in a different way, avoid deletions/re-insertions, etc.)? (Note that I'm not talking about reclaiming free space here - that's not really an issue for us.
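On point (2): one approach worth trying is to update rows in place instead of deleting and re-inserting them, so existing rows stay on the pages they already occupy. This is a sketch under assumed names (a table `entries(key, value)` that is not in the original post), using UPDATE-then-INSERT rather than the ON CONFLICT upsert syntax, which did not exist yet in the 3.6.x era:

```python
import sqlite3

def upsert_entry(conn, key, value):
    """Update in place when the row exists; insert only when it doesn't.

    An UPDATE whose new value still fits on the page rewrites the row
    where it sits, whereas DELETE + INSERT frees the old slot and places
    the new row wherever free space happens to be, which can scatter a
    frequently rewritten table across the file over time.
    """
    cur = conn.execute(
        "UPDATE entries SET value = ? WHERE key = ?", (value, key)
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO entries (key, value) VALUES (?, ?)", (key, value)
        )
```

Whether this actually avoids the fragmentation in practice would need measuring against your workload, but it removes the wholesale delete/re-insert churn described above.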
We need to avoid the performance hit we've seen, which IMHO is caused by the pages for a single table being scattered all over the database file, forcing the paging code to jump around in the file.)

Thanks,
-jens

BTW: this is SQLite 3.6.3, but at least running the COUNT queries is not any faster with 3.5.x. Tests were run on Mac OS X.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users