[CONF] Apache Labs > Algorithms

confluence Sun, 24 Jun 2012 07:36:22 -0700

     Algorithms
    Page edited by             Emmanuel Lécharny
    
        

                         Changes (9)
                                 
    

                    
    
            ...

            ## UC1.1 : The key is not the first one 
## UC1.2 : The key is the first one 

            ### UC1.2.1 : The key is referenced by the leftmost parent's children  
### UC1.2.2 : The key is referenced by any of its parent's children but the leftmost 

            # UC2 : The page has N/2 elements 
## UC2.1 : The sibling with the same parent has more than N/2 keys 

            ...

            {code} 
 

            h4. UC 1.2.1 and UC 1.2.2 The page has more than N/2 elements, the key is the first one 

             

            This is slightly more complex, as we have to propagate to the caller the changed left key. There are two different cases here, too. The first one (UC 1.2.1) is when the modified page is pointed by the leftmost reference on its parent, and the second one (UC 1.2.2) when it's pointed by any other reference in its parent. Here is a picture that expose both modifications : 

            This is slightly more complex, as we have to propagate to the caller the changed left key. As the first key is present in some of the parents, we have to return it so that the caller can eventually update itself. However, there are some case where the update does not occurs. 

             

            We do update the parent's reference when the reference key is also present in the parent's page. This is describe in the following algorithm. 

             

            {code:title=delete(K)|borderStyle=solid} 
delete(K) : 
  deleteResult <- leaf.delete(K) 
  if deleteResult is a modified page 
    then  
      new page <- copy the page 
      if the modified page does not have a new left element 
        then  
          // continue with UC 1.1... (see above) 
        else 
          if the page contains K 
            then  
               replace the old key by the new left element in the new page 

             

                           if the new page has the new key on the left 
                 then  
                   return the modified page with the new left element 
                 else 
                   return the modified page 
    else 
      // continue with UC 2.2... 
{code} 

             

            The following picture shows some example for this use case : 

             

             
!UC1.2.2.png|border=1! 
 
 

            h2. find() method 
 

            ...

    
            
    
                            Full Content
                    
        Let's describe the algorithms we will implement.

BTree methods
insert() method

When we try to insert a tuple <K, V> into a tree, we start from the root page, which can be either a Leaf or a Node. In any case, we call the insert(K, V) method on the BTree, which will get a new revision that will become the new root if the insertion succeed.

void insert(K, V)

BTree.insert(key, value) : (K, V) -> V
  get new revision
  return insert(revision, key, value)


This internal method will try to insert the tuple in the root page, and depending on the result, will either switch the root page to the newly created page, or to create a new root page containing the pivot if the old root page has been split

void insert(revision, K, V)

BTree.insert(revision, key, value) : (long, K, V) -> V
  currentRoot <- get the current root page
  result <- currentRoot.insert(revision, key, value)
  oldValue <- null
  newRoot <- null

  if result is a modified page
    then 
      newRoot <- result.modifiedPage
      oldValue <- result.oldValue
    else // It's a split page
      newRoot <- new Node(revision)
      newRoot.children[0] <- result.getLeftPage
      newRoot.children[1] <- result.getRightPage
      newRoot.keys[0] <- result.getPivot

  roots.put(<revision, newRoot>)
  currentRoot <- newRoot
  return oldValue



delete() method

Deleting an element from a leaf is pretty simple : we just have to find its position, and if found, to copy the page, and to remove the element from it. The problem is that doing so may result in a leaf with less than N/2 elements, which is not allowed for a leaf, except if it's the root page. So we have to either add an element into the page to make it contains N/2 elements, or to merge it with a page that contains N/2 elements. Then if we merge two pages, we will have to remove an element in the parent page, with the exact same consequences. At the end, the whole tree may be impacted.

Here, we can't really distinguish the algorithm used to delete an element from a Leaf from the one applied on a Node. We will study the global algorithm, and then describe the specific algorithms for a Leaf and for a Node.

General algorithm

There are a few cases we should consider :

	UC0 : The page is a Leaf but is also the root page
	UC1 : The page has more than N/2 elements
	
		UC1.1 : The key is not the first one
		UC1.2 : The key is the first one
	
	
	UC2 : The page has N/2 elements
	
		UC2.1 : The sibling with the same parent has more than N/2 keys
		
			UC2.1.1 : The sibling is on the left
			UC2.1.2 : The sibling is on the right
		
		
		UC2.2 : The sibling with the same parent has N/2 keys
		
			UC2.2.1 : The sibling is on the left
			UC2.2.2 : The sibling is on the right
		
		
	
	



We will now list all those use cases and provide the algorithm for each of them.

UC0 : The page is a Leaf but is also the root page

This is the simplest case. We just remove the key from the page, after having copied it, and this will become the new root.

delete(K)

delete(K) :
  if page is a leaf and page is root
    then 
      copy the page
      delete the key
      change the root to be the new page



UC1.1 : The page has more than N/2 elements, the key is not the first one

This is also a simple use case : we just have to remove the key from the copied page, and copy all the parents up to the root with a new revision. The following picture shows the modification done on the tree.



Here is the associated algorithm :

delete(K)

delete(K) :
  if page has more than N/2 element an the key is not the leftmost one
    then 
      copy the page
      delete the key
      return the modified page with the removed <K, V>



The caller will just copy the current page and point to the modified page :

delete(K)

delete(K) :
  deleteResult <- leaf.delete(K)
  if deleteResult is a modified page
    then 
      new page <- copy the page
      if the modified page does not have a new left element
        then 
          reference the modified page in the new page
          return the modified page
        else
          // continue with UC 1.2...
    else
      // continue with UC 2.2...



UC 1.2 The page has more than N/2 elements, the key is the first one

This is slightly more complex, as we have to propagate to the caller the changed left key. As the first key is present in some of the parents, we have to return it so that the caller can eventually update itself. However, there are some case where the update does not occurs.

We do update the parent's reference when the reference key is also present in the parent's page. This is describe in the following algorithm.

delete(K)

delete(K) :
  deleteResult <- leaf.delete(K)
  if deleteResult is a modified page
    then 
      new page <- copy the page
      if the modified page does not have a new left element
        then 
          // continue with UC 1.1... (see above)
        else
          if the page contains K
            then 
               replace the old key by the new left element in the new page

               if the new page has the new key on the left
                 then 
                   return the modified page with the new left element
                 else
                   return the modified page
    else
      // continue with UC 2.2...



The following picture shows some example for this use case :





find() method

This method is used to find a specific value, given a key. 

We will have to go down the tree up to the leaf that contains the value, if the key is present in the tree. We will work on the latest revision available.

V find(K)

BTree.find(key) : K -> V
  return currentRoot.find( key )



browse() method

cursor<V> browse(K)




    
        
        
            Change Notification Preferences
        
        View Online
        |
        View Changes
                |
        Add Comment
            







---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
 ...
 ## UC1.1 : The key is not the first one 
## UC1.2 : The key is the first one
-### UC1.2.1 : The key is referenced by the leftmost parent's children  
### UC1.2.2 : The key is referenced by any of its parent's children but the leftmost
 # UC2 : The page has N/2 elements 
## UC2.1 : The sibling with the same parent has more than N/2 keys
 ...
 {code}
 h4. UC 1.2.1 and UC 1.2.2 The page has more than N/2 elements, the key is the first one
-This is slightly more complex, as we have to propagate to the caller the changed left key. There are two different cases here, too. The first one (UC 1.2.1) is when the modified page is pointed by the leftmost reference on its parent, and the second one (UC 1.2.2) when it's pointed by any other reference in its parent. Here is a picture that expose both modifications :
+This is slightly more complex, as we have to propagate to the caller the changed left key. As the first key is present in some of the parents, we have to return it so that the caller can eventually update itself. However, there are some case where the update does not occurs.
+We do update the parent's reference when the reference key is also present in the parent's page. This is describe in the following algorithm.
+{code:title=delete(K)|borderStyle=solid} 
delete(K) : 
  deleteResult <- leaf.delete(K) 
  if deleteResult is a modified page 
    then  
      new page <- copy the page 
      if the modified page does not have a new left element 
        then  
          // continue with UC 1.1... (see above) 
        else 
          if the page contains K 
            then  
               replace the old key by the new left element in the new page
+               if the new page has the new key on the left 
                 then  
                   return the modified page with the new left element 
                 else 
                   return the modified page 
    else 
      // continue with UC 2.2... 
{code}
+The following picture shows some example for this use case :
+!UC1.2.2.png|border=1!
 h2. find() method
 ...

[CONF] Apache Labs > Algorithms

Algorithms

Page edited by Emmanuel Lécharny

Changes (9)

Full Content

BTree methods

insert() method

delete() method

General algorithm

UC0 : The page is a Leaf but is also the root page

UC1.1 : The page has more than N/2 elements, the key is not the first one

UC 1.2 The page has more than N/2 elements, the key is the first one

find() method

browse() method

Reply via email to