Joe Wilson <[EMAIL PROTECTED]> wrote:
--- [EMAIL PROTECTED] wrote:
Joe Wilson <[EMAIL PROTECTED]> wrote:
The reason why I asked is that I haven't had much luck with sqlite3 performance for databases larger than the size of RAM on my machine
regardless of PRAGMA settings.
This is probably do to the cache locality problem.  We know how
to fix this, Joe.  Would you like to have a go at it?
I think setting aside contiguous pages in the file for exclusive use by each btree would help improve locality of reference on disk.

For example, let A, B and C represent in-use pages of 3 btrees and a, b and c represent free pages corresponding to the same btrees:


Is this what you had in mind in your post?


The problem is when inserting into large database that is
indexed, the values being indexed are randomly distributed.
So with each insert, SQLite has to seek to a new random
place in the file to insert the new index entry there.
It does not matter that pages of the index are not in
consecutive order.  What matters is that each insertion
is into a different place and that the places are randomly
distributed over a large file - larger than will fit in cache.
In this situation, each new entry probably needs to go on a page that is not in cache, and hence a real disk
seek and read is required (as opposed to simply reading
the value out of cache) when inserting each new entry.  Disk
seeks and reads are much, much slower than disk cache hits,
which really slows down insertion performance.

If you do many inserts such that the indexed values are
in sorted order, then most inserts will go on the same page
as the previous.  The previous page is already in cache
so it doesn't need to be read from disk.  It has also
already been journaled, so no excess writing is required.
Disk I/O is greatly reduced. Things go much, much faster.
The problem is that you really have the luxury of being
able to insert entries in sorted order.  And if you are
indexing multiple columns, it is impossible to sort
the entries on both columns at once.

The usual way to work around this problem is to only do random
inserts into indices which are small enough to fit into
your disk cache.  Suppose you start inserting into index A.
Once A gets too big to fit entirely in cache, stop inserting
into it and start inserting into B.  Once B gets to be the
cache size, merge A and B together into a new index C.
The merge operation requires reading A and B straight through
from beginning to end once.  This is a lot of disk I/O but
it is still much faster than jumping around within the file
reading bits here and there.  After creating C, reset A and
B back to being empty (since all records have been transfered
into C).  Start inserting into A again until it fills up.
Then fill up B again.  Merge A and B into D, then merge C and D
into E.  Reset A, B, C, and D.  Keep doing this, merging
smaller indices into larger indices, until you insert all
the records you need to insert.  Then make a single pass
through all of the smaller indices and merge them all
together into a single big index Z.  Z becomes the new
index used for search operations on the database.

The algorithm above should probably go into SQLite to construct an index as part of the CREATE INDEX
statement.  When populating a database will a large
amount of data, first put all of the data into an
unindexed table.  This will be very fast because each
new entry goes at the end of the table (and also at the
end of the file.)  After all data is in place, issue
the CREATE INDEX statements to index any fields you
think need indexing.  The CREATE INDEX statement has
to populate the index.  The current algorithm is to
scan through the original table and create and insert
index entries one by one as they are encountered.  I am
proposing that you substitute the algorithm outlined in
the previous paragraph in place of the current algorithm.

D. Richard Hipp <[EMAIL PROTECTED]>

I experimented with that problem of building B-Tree indices on tables and discovered that to first build the table then build a list of keys into a file, sort the file using an iterative combination of quicksorts and merges then build the index in an optimal manner since the space needed for non-leaf nodes can be calculated and reserved, produced good performance on large tables even in a restricted memory environment.

Using this approach the data table and temporary files are only ever accessed sequentially and the "locality of reference" situation sidestepped. If there are multiple disk drives available the intermediate files can be on another drive to further limit head movement.

Ensuring that interior nodes were contiguous seemed to be a winning strategy. Only filling nodes to say 80% also delivered an index which would accept quite a lot of inserts before fragmenting with node splits and implementing node merging if possible instead of a split reduced fragmentation further.

With Sqlite to drop indices, insert a large number of rows into a table then rebuild the indices using a fast method like that above should be much faster than performing a large number of index insertions and have the added advantage of delivering an optimally organized index.

To unsubscribe, send email to [EMAIL PROTECTED]

To unsubscribe, send email to [EMAIL PROTECTED]

Reply via email to