One approach might be to split the big, monolithic table into some number of hash buckets, where each 'bucket' is a separate table. When doing a search, the program calculates the hash of the key and reads only the bucket that is needed.
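A minimal sketch of this bucketing idea, assuming 256 bucket tables named chunks_000 through chunks_255 with the same schema as the chunks table from the original post (the bucket count, table names, hash choice, and use of Python's sqlite3 module are illustrative assumptions, not anything from the thread):

import hashlib
import sqlite3

NUM_BUCKETS = 256  # illustrative; tune to the data set

def bucket_table(name: str) -> str:
    # Hash the 88-byte key and map it to one of the bucket tables.
    h = hashlib.sha1(name.encode("utf-8")).digest()
    return "chunks_%03d" % (int.from_bytes(h[:2], "big") % NUM_BUCKETS)

def create_buckets(db: sqlite3.Connection) -> None:
    # Many small tables instead of one monolithic table.
    for i in range(NUM_BUCKETS):
        db.execute(
            "CREATE TABLE IF NOT EXISTS chunks_%03d "
            "(name text primary key, pid integer not null)" % i
        )

def insert_chunk(db: sqlite3.Connection, name: str, pid: int) -> None:
    db.execute("INSERT INTO %s VALUES (?, ?)" % bucket_table(name), (name, pid))

def find_chunk(db: sqlite3.Connection, name: str):
    # Only the one bucket's (much smaller) B-tree is searched.
    cur = db.execute(
        "SELECT pid FROM %s WHERE name = ?" % bucket_table(name), (name,)
    )
    row = cur.fetchone()
    return row[0] if row else None

Each bucket's B-tree stays much smaller than a single table holding every key, so a lookup touches fewer index pages and those pages are more likely to remain cached.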
This approach also opens the door to using multiple databases, with the bucket tables spread across them. The databases could be placed on separate drives to improve performance.

*** Doug Fajardo

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Matthew O'Keefe
Sent: Wednesday, June 24, 2009 12:21 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] very large SQLite tables

We are using SQLite to index a huge number (i.e., 100 million to 1 billion) of key pairs, each identified by an 88-byte key. We are using a single table with a very large number of rows (one for each data chunk) and two columns: one of type "text" and the other of type "integer".

The table is created with:

CREATE TABLE chunks
(
    name text primary key,
    pid integer not null
);

As expected, as the table grows, the underlying B-tree implementation in SQLite means that the number of disk accesses needed to (a) find and (b) add a chunk grows larger and larger. We've tested up to 20 million chunks represented in the table: as expected, performance degrades as the number of table entries grows.

We wanted to post to the mailing list to see if there are any obvious, first-order things we can try to improve performance for such a large table.

We really appreciate the efforts of the SQLite developer community!

Matt O'Keefe
sqlite-users@sqlite.org
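Doug's multi-database suggestion maps naturally onto SQLite's ATTACH DATABASE command, which makes several database files visible through a single connection. A minimal sketch, assuming four database files on four separate drives (the file paths, "shard" schema names, and round-robin placement are illustrative assumptions, not part of the thread):

import sqlite3

# Hypothetical layout: one database file per physical drive.
DB_PATHS = [
    "/mnt/disk0/chunks0.db",
    "/mnt/disk1/chunks1.db",
    "/mnt/disk2/chunks2.db",
    "/mnt/disk3/chunks3.db",
]

def open_sharded() -> sqlite3.Connection:
    # Open the first file and ATTACH the rest as additional schemas.
    db = sqlite3.connect(DB_PATHS[0])
    for i, path in enumerate(DB_PATHS[1:], start=1):
        db.execute("ATTACH DATABASE ? AS shard%d" % i, (path,))
    return db

def bucket_schema(bucket: int) -> str:
    # Place bucket tables round-robin across the attached databases,
    # e.g. with four files, bucket 5 lives in shard1.chunks_005.
    shard = bucket % len(DB_PATHS)
    return "main" if shard == 0 else "shard%d" % shard

Queries would then name tables as shard1.chunks_005 and so on, so each lookup still touches only one small bucket table, and the I/O is spread over the drives holding the attached files.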