[CONF] Apache Labs > Algorithms

confluence Fri, 22 Jun 2012 01:04:27 -0700

     Algorithms
    Page edited by             Emmanuel Lécharny
    
        

                         Changes (4)
                                 
    

                    
    
            ...

            h2. delete() method 
 

            TO BE DONE 

            Deleting an element from a leaf is pretty simple : we just have to find its position, and if found, to copy the page, and to remove the element from it. The problem is that doing so may result in a leaf with less than N/2 elements, which is not allowed for a leaf, except if it's the root page. So we have to either add an element into the page to make it contains N/2 elements, or to merge it with a page that contains N/2 elements. Then if we merge two pages, we will have to remove an element in the parent page, with the exact same consequences. At the end, the whole tree may be impacted. 

             

            Here, we can't really distinguish the algorithm used to delete an element from a _Leaf_ from the one applied on a _Node_. We will study the global algorithm, and then describe the specific algorithms for a _Leaf_ and for a _Node_. 
 
h3. General algorithm 
 
There are a few cases we should consider : 
# The page is a _Leaf_ but is also the root page : simply remove the element 
# The page has more than N/2 elements :  
## The element is on the left : remove it and return the next element to the parent, as the parent's key will change.  
## The element is on the middle of the page : simply remove it 
# The page has exactly N/2 elements  
## There is a sibling with more than N/2 elements  
### The sibling has the same parent : borrow the element from it (and update the parent accordingly if we get the element from the right sibling 
### The sibling has a different parent : update the parent's sibling up to the page which does not have the moved key on its left 
## Both siblings have N/2 elements : select the sibling which has the same parent page, and merge it with the current page. This is the most complex case, as the resulting parent's page may contain less than N/2 elements, and we have to recursively apply the same algorithm. If we reach the root, which has only one element, then we will have to remove the root page and use the merged pages as the new root. 
 
Let's see how it translates using some pseudo-code 
{code} 
delete( key ) : K 
{code} 
 

            {code:title=V delete(K)|borderStyle=solid} 

              currentRoot <- get the current root page 
 
  if the currenRoot is a leaf 
    then remove the key from it and return 
 
  result <- currentRoot.delete( key ) 
 
  if the result is a mergedPage 
    then 
      if the currentRoot has more than one element 
        then remove the old pivot and update the root page references to the child 
      else 
        the mergedPage is the new root 
    else 
      if the result has a modified pivot 
        then  
          change the pivot in the root page 

            {code} 
 

            ...

    
            
    
                            Full Content
                    
        Let's describe the algorithms we will implement.

BTree methods
insert() method

When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed.

void insert(K, V)

BTree.insert(key, value) : (K, V) -> V
  get new revision
  return insert(revision, key, value)


This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or to create a new root page containing the pivot if the old root page has been split

void insert(revision, K, V)

BTree.insert(revision, key, value) : (long, K, V) -> V
  currentRoot <- get the current root page
  result <- currentRoot.insert(revision, key, value)
  oldValue <- null
  newRoot <- null

  if result is a modified page
    then 
      newRoot <- result.modifiedPage
      oldValue <- result.oldValue
    else // It's a split page
      newRoot <- new Node(revision)
      newRoot.children[0] <- result.getLeftPage
      newRoot.children[1] <- result.getRightPage
      newRoot.keys[0] <- result.getPivot

  roots.put(<revision, newRoot>)
  currentRoot <- newRoot
  return oldValue



delete() method

Deleting an element from a leaf is pretty simple : we just have to find its position, and if found, to copy the page, and to remove the element from it. The problem is that doing so may result in a leaf with less than N/2 elements, which is not allowed for a leaf, except if it's the root page. So we have to either add an element into the page to make it contains N/2 elements, or to merge it with a page that contains N/2 elements. Then if we merge two pages, we will have to remove an element in the parent page, with the exact same consequences. At the end, the whole tree may be impacted.

Here, we can't really distinguish the algorithm used to delete an element from a Leaf from the one applied on a Node. We will study the global algorithm, and then describe the specific algorithms for a Leaf and for a Node.

General algorithm

There are a few cases we should consider :

	The page is a Leaf but is also the root page : simply remove the element
	The page has more than N/2 elements :
	
		The element is on the left : remove it and return the next element to the parent, as the parent's key will change.
		The element is on the middle of the page : simply remove it
	
	
	The page has exactly N/2 elements
	
		There is a sibling with more than N/2 elements
		
			The sibling has the same parent : borrow the element from it (and update the parent accordingly if we get the element from the right sibling
			The sibling has a different parent : update the parent's sibling up to the page which does not have the moved key on its left
		
		
		Both siblings have N/2 elements : select the sibling which has the same parent page, and merge it with the current page. This is the most complex case, as the resulting parent's page may contain less than N/2 elements, and we have to recursively apply the same algorithm. If we reach the root, which has only one element, then we will have to remove the root page and use the merged pages as the new root.
	
	



Let's see how it translates using some pseudo-code

delete( key ) : K



V delete(K)

  currentRoot <- get the current root page
  
  if the currenRoot is a leaf
    then remove the key from it and return

  result <- currentRoot.delete( key )

  if the result is a mergedPage
    then
      if the currentRoot has more than one element
        then remove the old pivot and update the root page references to the child
      else
        the mergedPage is the new root
    else
      if the result has a modified pivot
        then 
          change the pivot in the root page




find() method

This method is used to find a specific value, given a key. 

We will have to go down the tree up to the leaf that contains the value, if the key is present in the tree. We will work on the latest revision available.

V find(K)

BTree.find(key) : K -> V
  return currentRoot.find( key )



browse() method

cursor<V> browse(K)




    
        
        
            Change Notification Preferences
        
        View Online
        |
        View Changes
                |
        Add Comment
            







---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
 ...
 h2. delete() method
-TO BE DONE
+Deleting an element from a leaf is pretty simple : we just have to find its position, and if found, to copy the page, and to remove the element from it. The problem is that doing so may result in a leaf with less than N/2 elements, which is not allowed for a leaf, except if it's the root page. So we have to either add an element into the page to make it contains N/2 elements, or to merge it with a page that contains N/2 elements. Then if we merge two pages, we will have to remove an element in the parent page, with the exact same consequences. At the end, the whole tree may be impacted.
+Here, we can't really distinguish the algorithm used to delete an element from a _Leaf_ from the one applied on a _Node_. We will study the global algorithm, and then describe the specific algorithms for a _Leaf_ and for a _Node_. 
 
h3. General algorithm 
 
There are a few cases we should consider : 
# The page is a _Leaf_ but is also the root page : simply remove the element 
# The page has more than N/2 elements :  
## The element is on the left : remove it and return the next element to the parent, as the parent's key will change.  
## The element is on the middle of the page : simply remove it 
# The page has exactly N/2 elements  
## There is a sibling with more than N/2 elements  
### The sibling has the same parent : borrow the element from it (and update the parent accordingly if we get the element from the right sibling 
### The sibling has a different parent : update the parent's sibling up to the page which does not have the moved key on its left 
## Both siblings have N/2 elements : select the sibling which has the same parent page, and merge it with the current page. This is the most complex case, as the resulting parent's page may contain less than N/2 elements, and we have to recursively apply the same algorithm. If we reach the root, which has only one element, then we will have to remove the root page and use the merged pages as the new root. 
 
Let's see how it translates using some pseudo-code 
{code} 
delete( key ) : K 
{code}
 {code:title=V delete(K)|borderStyle=solid}
+  currentRoot <- get the current root page 
 
  if the currenRoot is a leaf 
    then remove the key from it and return 
 
  result <- currentRoot.delete( key ) 
 
  if the result is a mergedPage 
    then 
      if the currentRoot has more than one element 
        then remove the old pivot and update the root page references to the child 
      else 
        the mergedPage is the new root 
    else 
      if the result has a modified pivot 
        then  
          change the pivot in the root page
 {code}
 ...

[CONF] Apache Labs > Algorithms

Algorithms

Page edited by Emmanuel Lécharny

Changes (4)

Full Content

BTree methods

insert() method

delete() method

General algorithm

find() method

browse() method

Reply via email to