----- Original Message -----
From: "John Stanton" <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Wednesday, March 28, 2007 7:42 AM
Subject: Re: [sqlite] CREATE INDEX performance
I retract the overflow page theory on your compelling evidence and now
understand better what it is doing after looking at the VDBE. By building
an index by successive insertions the tree is splitting and balancing as
it grows, and that is expensive. Double the size of the key and you get
twice as many leaf nodes and quite a few more interior nodes.
If the keys order is very random the keys are being inserted all over the
tree which is slow. Presenting the keys in sorted sequence should cut
back on the fragmentation and will very likely build a more compact tree
by ensuring that each leaf node is filled.
An optimization for building such a tree would be to extract the keys,
sort them and build the tree bottom up. By avoiding all splitting and
jumping around the tree it should be an order of magnitude faster or
better. I took a quick look at the code and got the impression that a
fast index option could be built by a motivated user as a seperate program
and might be a handy tool for people managing very large Sqlite databases.
Cutting back a 20 hour run to 1-2 hours can be a big win.
Sqlite is something of a victim of its success. The embedded lite
database is being asked to perform enterprise level tasks which stretch
its envelope.
How do we do sorting prior to indexing? If it is the initial table it's OK,
we can sort it before insert. But for existing table, how do we do that?
regards,
Radzi.
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------