[CONF] Apache Labs > Algorithms

confluence Fri, 15 Jun 2012 06:28:28 -0700

     Algorithms
    Page edited by             Emmanuel Lécharny
    
        

                         Changes (4)
                                 
    

                    
    
            ...

            h1. Insert 
 

            When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed. 
 

            {code:title=void insert(K, V)|borderStyle=solid} 

            BTree.insert(key, value) : (K, V) -> V 
  get new revision 
  return insert(revision, key, value) 

            {code} 
 

            This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or o create a new root page containing the pivot if the old root page has been split 

             

            {code:title=void insert(revision, K, V)|borderStyle=solid} 
BTree.insert(revision, key, value) : (long, K, V) -> V 
  currentRoot <- get the current root page 
  result <- currentRoot.insert(revision, key, value) 
  oldValue <- null 
  newRoot <- null 
 
  if result is a modified page 
    then  
      newRoot = result.modifiedPage 
      oldValue <- result.oldValue 
    else // It's a split page 
      newRoot <- new Node(revision) 
      newRoot.children[0] <- result.getLeftPage 
      newRoot.children[1] <- result.getRightPage 
      newRoot.keys[0] <- result.getPivot 
 
  roots.put(<revision, newRoot>) 
  currentRoot <- newRoot 
  return oldValue 
{code} 
 

            h1. Delete 
 

            ...

    
            
    
                            Full Content
                    
        Let's describe the algorithms we will implement.

h1 BTree methods
Insert

When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed.

void insert(K, V)

BTree.insert(key, value) : (K, V) -> V
  get new revision
  return insert(revision, key, value)


This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or o create a new root page containing the pivot if the old root page has been split

void insert(revision, K, V)

BTree.insert(revision, key, value) : (long, K, V) -> V
  currentRoot <- get the current root page
  result <- currentRoot.insert(revision, key, value)
  oldValue <- null
  newRoot <- null

  if result is a modified page
    then 
      newRoot = result.modifiedPage
      oldValue <- result.oldValue
    else // It's a split page
      newRoot <- new Node(revision)
      newRoot.children[0] <- result.getLeftPage
      newRoot.children[1] <- result.getRightPage
      newRoot.keys[0] <- result.getPivot

  roots.put(<revision, newRoot>)
  currentRoot <- newRoot
  return oldValue



Delete

V delete(K)





Find

This method is used to find a specific value, given a key. 

We will have to go down the tree up to the leaf that contains the value, if the key is present in the tree. We will work on the latest revision available.

V find(K)




Browse

cursor<V> browse(K)




Page methods

findPos

This method is used to find the position of a key in the keys. If the key is found, we will return its position as a negative value. Otherwise, we will return the position of the key which is immediately above the key we are looking for, or the latest position in the page if the new key is higher than any existing keys.

We try to make the minimal number of comparisons while searching for the key, as it's a costly operation. If we have a tree with 1 million of elements, and pages containing 16 elements, with pages 66% full, then we will have to go down 6 levels, with around 4 comparisons per page, so around 24 comparisons for each find.

int findPos(K)

Page.findPos(key) : (K) -> int
  min <- 0
  max <- nbElems - 1

  while max > min
    do
      middle <- ( max + min + 1) / 2

      if key == keys[middle]
        then
          return - (middle + 1)

      if key > keys[middle]
        then
          min <- middle + 1
        else
          max <- middle - 1
    done

  if key < keys[max]
    then 
      return max
    else 
      if key == keys[max]
        then 
          return - (max + 1)
        else 
          return max + 1



There is a small improvment done in this algorithm : as we may find the keys before we have gone down to atomic elements, we can save some comparaison by excluding the element we just compared.
    
        
        
            Change Notification Preferences
        
        View Online
        |
        View Changes
                |
        Add Comment
            







---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
 ...
 h1. Insert
+When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed.
 {code:title=void insert(K, V)|borderStyle=solid}
+BTree.insert(key, value) : (K, V) -> V 
  get new revision 
  return insert(revision, key, value)
 {code}
+This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or o create a new root page containing the pivot if the old root page has been split
+{code:title=void insert(revision, K, V)|borderStyle=solid} 
BTree.insert(revision, key, value) : (long, K, V) -> V 
  currentRoot <- get the current root page 
  result <- currentRoot.insert(revision, key, value) 
  oldValue <- null 
  newRoot <- null 
 
  if result is a modified page 
    then  
      newRoot = result.modifiedPage 
      oldValue <- result.oldValue 
    else // It's a split page 
      newRoot <- new Node(revision) 
      newRoot.children[0] <- result.getLeftPage 
      newRoot.children[1] <- result.getRightPage 
      newRoot.keys[0] <- result.getPivot 
 
  roots.put(<revision, newRoot>) 
  currentRoot <- newRoot 
  return oldValue 
{code}
 h1. Delete
 ...

[CONF] Apache Labs > Algorithms

Algorithms

Page edited by Emmanuel Lécharny

Changes (4)

Full Content

Insert

Delete

Find

Browse

Page methods

findPos

Reply via email to