Let's describe the algorithms we will implement.
BTree methods
insert() method
When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed.
BTree.insert(key, value) : (K, V) -> V
get new revision
return insert(revision, key, value)
This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or to create a new root page containing the pivot if the old root page has been split
BTree.insert(revision, key, value) : (long, K, V) -> V
currentRoot <- get the current root page
result <- currentRoot.insert(revision, key, value)
oldValue <- null
newRoot <- null
if result is a modified page
then
newRoot <- result.modifiedPage
oldValue <- result.oldValue
else newRoot <- new Node(revision)
newRoot.children[0] <- result.getLeftPage
newRoot.children[1] <- result.getRightPage
newRoot.keys[0] <- result.getPivot
roots.put(<revision, newRoot>)
currentRoot <- newRoot
return oldValue
delete() method
Deleting an element from a leaf is pretty simple : we just have to find its position, and if found, to copy the page, and to remove the element from it. The problem is that doing so may result in a leaf with less than N/2 elements, which is not allowed for a leaf, except if it's the root page. So we have to either add an element into the page to make it contains N/2 elements, or to merge it with a page that contains N/2 elements. Then if we merge two pages, we will have to remove an element in the parent page, with the exact same consequences. At the end, the whole tree may be impacted.
Here, we can't really distinguish the algorithm used to delete an element from a Leaf from the one applied on a Node. We will study the global algorithm, and then describe the specific algorithms for a Leaf and for a Node.
General algorithm
There are a few cases we should consider :
- The page is a Leaf but is also the root page : simply remove the element
- The page has more than N/2 elements :
- The element is on the left : remove it and return the next element to the parent, as the parent's key will change.
- The element is on the middle of the page : simply remove it
- The page has exactly N/2 elements
- There is a sibling with more than N/2 elements
- The sibling has the same parent : borrow the element from it (and update the parent accordingly if we get the element from the right sibling
- The sibling has a different parent : update the parent's sibling up to the page which does not have the moved key on its left
- Both siblings have N/2 elements : select the sibling which has the same parent page, and merge it with the current page. This is the most complex case, as the resulting parent's page may contain less than N/2 elements, and we have to recursively apply the same algorithm. If we reach the root, which has only one element, then we will have to remove the root page and use the merged pages as the new root.
Let's see how it translates using some pseudo-code
currentRoot <- get the current root page
if the currenRoot is a leaf
then remove the key from it and return
result <- currentRoot.delete( key )
if the result is a mergedPage
then
if the currentRoot has more than one element
then remove the old pivot and update the root page references to the child
else
the mergedPage is the new root
else
if the result has a modified pivot
then
change the pivot in the root page
find() method
This method is used to find a specific value, given a key.
We will have to go down the tree up to the leaf that contains the value, if the key is present in the tree. We will work on the latest revision available.
BTree.find(key) : K -> V
return currentRoot.find( key )
browse() method