[Mavibot] Free page management

Emmanuel Lécharny Sun, 28 Apr 2013 09:27:00 -0700

Hi,

I have started to work on the free page management. There are a few
steps to fulfill to get it working and the very first one is to gather
all the copied pages into a BTree. The Btree already exists, it's the
copiedPageBTree.


When we add or delete some element in a BTree, we do that using a new
revision, and many pages will be copied : we want to get the list of
those pages and to store this list associated with the revision.

I have extended the various Result classes to be able to store the
copied pages in the result. The following hierarchy can probably fit our
need :

  (Result) o----- [[AbstractResult]]
      ^                ^ ^  ^ ^
      |                | |  | |
      +-- (DeleteResult) <-------------------------
(BorrowedFromSiblingResult)
      |         o      | |  | |                                  o
      |         |      | |  | |                                  |
      |         +-- [NotPresentResult]                           |
      |         |      | |    |                                  |
      |         +-- [[AbstractDeleteResult]] <--
[AbstractBorrowedFromSiblingResult]
      |                | |    ^                                  ^
      |                | |    |                                  |
      |                | |    |                                  +--
[BorrowedFromLeftResult]
      |                | |    |                                  |
      |                | |    |                                  +--
[BorrowedFromRightResult]
      |                | |    |
      |                | |    +-- [MergedWithSiblingResult]
      |                | |    |
      |                | |    +-- [RemoveResult]
      |                | |
      +-- (InsertResult) |
                o      | |
                |      | |
                +-- [ModifyResult]
                |        |
                +-- [SplitResult]

Now, there are a things to take care :

- The result of each operation can change as we may have many layer.
Whenever we have to create a new Result instance, we have to keep track
of the list of copied pages
- when we merge two pages, the merged pages must both be added into the
copied pages list, even if only one of the two pages is copied

Right now, the following classes and interfaces have been created :

- Result :

/* No qualifier */interface Result<K, V>
{
    /**
     * @return the copiedPage
     */
    /* No qualifier */List<Page<K, V>> getCopiedPage();


    /**
     * Add a new copied page
     * @param copiedPage the added page
     */
    /* No qualifier */void addCopiedPage( Page<K, V> copiedPage );
}

- AbstractResult :

/* No qualifier */abstract class AbstractResult<K, V> implements
Result<K, V>
{
    /** The list of copied page reference */
    private List<Page<K, V>> copiedPage;


    /**
     * The default constructor for AbstractResult.
     *
     */
    /* No qualifier */AbstractResult()
    {
        copiedPage = new ArrayList<Page<K, V>>();
    }


    /**
     * {@inheritDoc}
     */
    public List<Page<K, V>> getCopiedPage()
    {
        return copiedPage;
    }


    /**
     * {@inheritDoc}
     */
    public void addCopiedPage( Page<K, V> page )
    {
        copiedPage.add( page );
    }
}

I still have to check that it's enough, and I'm now using those classes
in the Leaf and Node classes. I will provide some feedback as soon as
it's done (it should not take too long).

The next step is to store the resulting list into the BTree, which
should not be versionned (ie we only keep a track of the latest
revision). This is also a something we have to take care of, as we don't
want to keep a track of the copied page when updating this BTree, this
will simply kill the process (every modification will lead to some other
modification, typical chicken and egg pb). Bottom line, this BTree
copied pages will be immediately moved to the free pages set.

One important point is that the copiedPage BTree key is a composition of
the BTree name and revision : it manages all the existig BTrees.

Then we have to keep a track of the free pages, and manage the cleanup
of the pages in this BTree. The following algorihm can be used :
- when a revision is released, and if there is no older revision kept,
then we can reclaim all the copied pages up to the new older revision
(or the last one). This is done by fetcjing all the copied page list
revision by revision starting from the released revision, up to the new
older revision.
- If the released revision is between two revisions, then it's slightly
more complex. We can only release pages that have been copied since the
released revision has been created. That means we can only fetch all the
copied pages and get rid of those which revision is older than the
released revision (this is quite a costly processing, but it might be
necessary).

All in all, this is an expensive operation.

And this is not the end of the story : once we have selected the page to
free, we have to move them to a space where they can be reused. This is
the most complex part of the job, as we must make this reliable, even if
the system crashes. This is the last part of the work...

More to come.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com 


---------------------------------------------------------------------
To unsubscribe, e-mail: labs-unsubscr...@labs.apache.org
For additional commands, e-mail: labs-h...@labs.apache.org

[Mavibot] Free page management

Reply via email to