Bulk copyPage added by Emmanuel LécharnyWe want to be able to inject a lots of data into a BTree without to have to inster the data one by one, because it's extremely expensive, as it will leads to many tree reorganization. There is a simple algorithm that allows to inject a load of data without having to spend time in spurious reorganization. Creating a new treeLet's suppose we have an empty tree first. We have a sorted set of M elements (if the data aren't sorted, then this won't work). What we just have to do is to consider that all those data are part of the leaves. If each page can contain N element, then we create as many leaves as we need to store the M elements (so it's M/N). One special case is when the last leaf does not contain at least N/2 element, in this case we 'borrow' some elements from the previous leaf. This is for the leaves. Now, we pick the left element from each of the leaves but the first one. All those keys will be store din the parent pages. Same here, we create as many pages as necessary to store the M/N keys. Once done, we do the same thing until we just have enough element to fit in a single page, which will be the root page. The index is now ready to be created, we just have to store the pages with the links. Adding a bulk of data in an existing treeIn some case, it might be less expensive to create a new tree by gathering the existing data with the new ones. We just merge both sets of data and apply the very same algorithm than seen above. Doing so has one extra advantage : we will compact the btree which might otherwise be full of empty slots in both nodes and leaves. The only problem is that it's a slow operation, as we have to duplicate the complete existing tree.
Change Notification Preferences
View Online
|
Add Comment
|
