Hi, I was looking for a better algorithm to manage the pages that can be freed when some revisions are not used anymore. Atm, the algorithm was to keep a track of each modified pages and to associate them with a revision that copied them. When this revision get obsolote, we can safely remove the copied pages. Well, I thought it was enough, but this is not the case. It works only if we keep a track of the list of revisions that were removed, and only if we remove the revision that is on the left of the list (that eans we can safely remove any page copid by a revision which is not the oldest revision).
There is another solution though. We just have to use a BTree containing the page numbers as a key, associated with the revision that created the page as a value. Every time a revision is obsoleted, we can safely check the BTree, searching for pages copied by this revision, and removing the page from the btree, moving it to the free page BTree. For instance, let's consider the following scenario, where revisins are noted with capital letters, and pages with numbers. added copied A 1 / B 2,3 1->2 (1 becomes 2) C 4,5 2->4 D 6,7 3->7,4->6 E 8,9 5->9,6->8 The BTree will contains the following Key/values, if we never remove anything : 1 -> A 2 -> B 3 -> B 4 -> C 5 -> C 6 -> D 7 -> D 8 -> E 9 -> E Suppose we want to keep B and E revisions only. - When we add A, the BTree looks like : 1 -> A - When B is created, as we copied page 1, and as revision A is not kept, we can safely remove page 1 from the BTree and move it to the free pages BTree. The BTree looks lie : [1 -> A] removed 2 -> B 3 -> B - When C is created, as B is kept, we can't remove the <2, B> element The bTree looks like : 2 -> B 3 -> B 4 -> C 5 -> C - When D is created, as B is kept, we can't remove the <3, B> element, but we can remove the <4, C> element. The BTree looks like : 2 -> B 3 -> B [4 -> C] removed 5 -> C 6 -> D 7 -> D - When E is created, we can remove the 5 and 6 pages, as they were created by revisions that are'nt kept. The BTree looks like : 2 -> B 3 -> B [5 -> C] removed [6 -> D] removed 7 -> D 8 -> E 9 -> E - If we remove the B revision now, we can remove the page it created (2 and 3). In order to do that, we need to keep a track of the pages created by a specific revision. All in all, we have to maintaing 3 additional BTrees : - a first BTree containing the list of pages copied by a revision - a second BTree containing the revision associated with a specific page - a third one containing the list of pages created by a revision. IMO, there is one spurious BTree, and we should be able to deduce it from the two others, but I still have to come with a good algorithm for that. If anyone has a better idea, I'll be really pleased ! -- Regards, Cordialement, Emmanuel Lécharny www.iktek.com --------------------------------------------------------------------- To unsubscribe, e-mail: labs-unsubscr...@labs.apache.org For additional commands, e-mail: labs-h...@labs.apache.org