On Wed, Mar 13, 2013 at 2:55 PM, Emmanuel Lécharny <elecha...@gmail.com> wrote:

> On 3/12/13 6:34 PM, Kiran Ayyagari wrote:
> > On Tue, Mar 12, 2013 at 10:23 PM, Emmanuel Lécharny
> > <elecha...@gmail.com> wrote:
> >
> >> 3) if the two revisions are different, that means we had a crash: we
> >> will have to read all the pages, and discard the pending N+1 revision
> >> pages.
> >
> > I would suggest that we keep the revisions N-1 and N,
> > and during each update N-1 will be replaced with N and N with N+1.
> > As long as the difference between revisions is 1 we can assume that
> > there was no crash.
> > In case of a crash we can start with the existing N-1 revision as the
> > base to recover.
>
> If we assume that revision N was ok, we can always recover from it. The
> pending N+1 revision just means we are not clean with the new revision.
>
> The status will be:
>
> T0: N and N
> T1 (starting with a new revision): N and N+1
> T2 (done with the new revision): N+1 and N+1 -> back to a stable state
>
> I have a little trouble mapping what you propose onto a timeline:
>
> T0: N-1 and N
> T1 (starting with a new revision): N-1, N and N+1 (is this correct?)
> T2 (done with the new revision): N and N+1

The timeline would be:

T0: N-1 and N
T1 (starting with a new revision): N and NULL (replace N-1 with N and
    set the current revision to NULL, because it is ongoing)
T2 (done with the new revision): N and N+1 (update the current revision
    to N+1 _after_ updating the BTree)

> Detecting that we had a failure in this case would imply we have N-1 and
> N+1, but what would keeping N be good for?
>
> Regarding my proposal: we will have to update the BTree header twice:
> once when we start the modification, to add the N+1 revision, and once at
> the end to remove N and replace it with N+1. This is costly.
> Assuming that those two elements are 2 longs, which will be stored on
> the same page, or two pages at worst (if we have many BTrees and a BTree
> header spans across two physical pages), I'm not sure we can't simply
> update the new status only once, at the end of the BTree update.

I propose that we update the header only after updating the BTree. This
way, if for _some_ reason the BTree update fails, we don't end up with a
header pointing to an incorrect or non-existent revision.

> Another thing: the fact that BTree headers might span across two pages
> is really annoying: we can have a crash after having updated the first
> page but before updating the second one, leading to inconsistencies. An
> option would be that each BTree header is stored in one single page, so
> that we always store that information in one single page.

+1 for one header per page (we can even make the header page a little
larger than the other pages if needed; I'm not sure how this would impact
the existing code, it is just an idea).

> >> Reclaiming the pending pages is a matter of reading *all* the pages,
> >> and for each page that is not linked to another one, we can safely
> >> move it to the list of free pages. As this is an expensive task, which
> >> requires a lot of memory if we have lots of pages, we may also create
> >> a new file containing only the latest valid revision.
> >
> > we should keep this as a background task
>
> Sure, we can do that in order to avoid blocking the server for minutes
> at startup. That's a smart idea!
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com

--
Kiran Ayyagari
http://keydap.com
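
PS: to make the "N and N" / "N and N+1" / "N+1 and N+1" timeline quoted
above easier to follow, here is a minimal Java sketch of a header holding
two revision numbers. It is only an illustration under my own
assumptions: the class RevisionHeader and its methods begin(), commit(),
needsRecovery() and recover() are hypothetical names, nothing is written
to disk, and it is not taken from the existing code.

```java
// Hypothetical in-memory model of a BTree header with two revision slots.
// Names are illustrative only; no real persistence is done here.
public class RevisionHeader {
    private long stable;   // last revision known to be fully written
    private long current;  // revision being written, or equal to stable

    public RevisionHeader(long revision) {
        // T0: both slots hold the same revision (N and N)
        this.stable = revision;
        this.current = revision;
    }

    /** T1: record that revision N+1 is being written (first header write). */
    public void begin() {
        current = stable + 1;
    }

    /** T2: the BTree update is done, so N+1 becomes the stable revision. */
    public void commit() {
        stable = current;
    }

    /** At startup, differing revisions mean a crash between T1 and T2. */
    public boolean needsRecovery() {
        return stable != current;
    }

    /** Recovery: discard the pending revision and fall back to the stable one. */
    public void recover() {
        current = stable;
    }

    public long stable() {
        return stable;
    }

    public long current() {
        return current;
    }

    public static void main(String[] args) {
        RevisionHeader header = new RevisionHeader(5); // T0: N and N
        header.begin();                                // T1: N and N+1
        System.out.println("pending after begin: " + header.needsRecovery());
        header.commit();                               // T2: N+1 and N+1
        System.out.println("pending after commit: " + header.needsRecovery());
    }
}
```

In the "update the header only once" variant, begin() would not be
persisted at all: only the write done in commit() would reach the disk, so
a crash would leave the header at revision N and the orphaned N+1 pages
would still have to be found by scanning.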