----- Original Message ----- From: "John Stanton" <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Wednesday, March 28, 2007 7:42 AM
Subject: Re: [sqlite] CREATE INDEX performance


I retract the overflow page theory on your compelling evidence and now understand better what it is doing after looking at the VDBE. By building an index by successive insertions the tree is splitting and balancing as it grows, and that is expensive. Double the size of the key and you get twice as many leaf nodes and quite a few more interior nodes.

If the keys order is very random the keys are being inserted all over the tree which is slow. Presenting the keys in sorted sequence should cut back on the fragmentation and will very likely build a more compact tree by ensuring that each leaf node is filled.

An optimization for building such a tree would be to extract the keys, sort them and build the tree bottom up. By avoiding all splitting and jumping around the tree it should be an order of magnitude faster or better. I took a quick look at the code and got the impression that a fast index option could be built by a motivated user as a seperate program and might be a handy tool for people managing very large Sqlite databases. Cutting back a 20 hour run to 1-2 hours can be a big win.

Sqlite is something of a victim of its success. The embedded lite database is being asked to perform enterprise level tasks which stretch its envelope.


How do we do sorting prior to indexing? If it is the initial table it's OK, we can sort it before insert. But for existing table, how do we do that?

regards,
Radzi.


-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to