Jay A. Kreibich wrote:
> On Wed, Jun 17, 2009 at 11:52:45AM +1000, John Machin scratched on the wall:
>   
>> On 17/06/2009 6:17 AM, Hoover, Jeffrey wrote:
>>
>>     
>>> One other note, if you have a primary key whose value is continually
>>> increasing your pk index can become imbalanced and therefore
>>> inefficient.
>>>       
>> A B-tree becomes imbalanced? How so?
>>
>> http://www.sqlite.org/fileformat.html#btree_structures says: "The tree 
>> is always of uniform height, meaning the number of intermediate levels 
>> between each leaf node page and the root page is the same."
>>
>> Do you have any evidence to the contrary?
>>     
>
>   It won't become imbalanced, but if you're inserting rows with an
>   explicit INTEGER PRIMARY KEY value in a non-increasing order, the
>   tree will require sorting and re-balancing.  That takes time and
>   requires additional disk writes (and, as others have pointed out,
>   disk writes are VERY expensive due to their transactional nature).
>
>   Also, depending on just how mixed up the pattern is, you can get into
>   situations where a very large index will over-flow the default 1500
>   page cache-size.  It is well known that if you want to build an index
>   on a large table, increasing the cache size will help make that
>   process faster.  It might be true here as well.  Try setting the page
>   cache to something nice and huge, like 10x or 100x the default, and
>   see if that helps.
>
>    -j
>
>   
You are correct that a split in a B-Tree is expensive, more so if there 
are many levels.  Also the B-Tree algorithm keeps the index balanced but 
it does not prevent fragmentation.  An addition to the B-Tree logic to 
perform node merges where possible will limit fragmentation and limit 
the creation of unnecessary levels.

B-Trees which are expected to grow substantially can achieve a speed 
increase by partially filling the nodes so that  insertions can occur 
without forcing a split.

My guess is if you have a large number of Sqlite insertions you might 
find that to drop the index and re-raise it when the insertions are 
complete will be faster.

Note that the Sqlite rowids are organized as a B-Tree, but because they 
are consecutive numbers they make a well organized index.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to