Let's describe the algorithms we will implement.
h1 BTree methods
Insert
When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed.
BTree.insert(key, value) : (K, V) -> V
get new revision
return insert(revision, key, value)
This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or o create a new root page containing the pivot if the old root page has been split
BTree.insert(revision, key, value) : (long, K, V) -> V
currentRoot <- get the current root page
result <- currentRoot.insert(revision, key, value)
oldValue <- null
newRoot <- null
if result is a modified page
then
newRoot = result.modifiedPage
oldValue <- result.oldValue
else newRoot <- new Node(revision)
newRoot.children[0] <- result.getLeftPage
newRoot.children[1] <- result.getRightPage
newRoot.keys[0] <- result.getPivot
roots.put(<revision, newRoot>)
currentRoot <- newRoot
return oldValue
Delete
Find
This method is used to find a specific value, given a key.
We will have to go down the tree up to the leaf that contains the value, if the key is present in the tree. We will work on the latest revision available.
Browse
Page methods
findPos
This method is used to find the position of a key in the keys. If the key is found, we will return its position as a negative value. Otherwise, we will return the position of the key which is immediately above the key we are looking for, or the latest position in the page if the new key is higher than any existing keys.
We try to make the minimal number of comparisons while searching for the key, as it's a costly operation. If we have a tree with 1 million of elements, and pages containing 16 elements, with pages 66% full, then we will have to go down 6 levels, with around 4 comparisons per page, so around 24 comparisons for each find.
Page.findPos(key) : (K) -> int
min <- 0
max <- nbElems - 1
while max > min
do
middle <- ( max + min + 1) / 2
if key == keys[middle]
then
return - (middle + 1)
if key > keys[middle]
then
min <- middle + 1
else
max <- middle - 1
done
if key < keys[max]
then
return max
else
if key == keys[max]
then
return - (max + 1)
else
return max + 1
There is a small improvment done in this algorithm : as we may find the keys before we have gone down to atomic elements, we can save some comparaison by excluding the element we just compared.