On Tue, Mar 12, 2013 at 10:23 PM, Emmanuel Lécharny <elecha...@gmail.com>wrote:
> > > Hi guys, > > in order to be able to recover from a crash in the middle of a BTree > update, I was hinking about an algorithm that could help. Let me explain > what I have in mind. > > When we store a BTree on disk, we write the BTree header which contains > informations like the offset of the root page on isk, and the current > revision. If we update the BTree, we will create new pages and a new > root page with a nex revision. We will hen have to update the btree > header so that it points to the new root page. For instance : > > T0 : initial state > BTree rootPage -> offset O1, revision N > > T1 : page addition > > BTree rootPage -> offset O1, revision N > new rootPage -> offset O2, revision N+1 > > T2 : btree header update > BTree rootPage -> offset O2, revision N+1 > > and we are set. > > Now, we may have a crash at any time. The worst case is when we have a > crash somewhere between [T1, T2[ (I exclude T2 because at T2 the new > header is written on disk). For instance, we may have a crash after > having added some value, but before having updated the header. Or we > could also have a crash while updating the header. > > How will we avoid such a problem ? And how can we detect that we have > had a crash, in order to restore the file in a proper state, with no > data pending on the air ? > > I suggest an algorithm that could work : > > 1) we need 2 revisions in the header : the current (and valid revision) > and the ongoing revision. > +1 > 2) if the two revisions are equals when we load the file, then we know > that the BTree is clean. We can boot without any problem 3) if the two revisions are different, that means we had a crash: we > will have to read all the pages, and discard the pending N+1 revision > pages. > > I would suggest that we keep the revisions N-1 and N and during each update the N-1 will be replaced with N and N with N+1 as long as the difference between revisions is 1 we can assume that there was no crash. In case of a crash we can start with the existing N-1 revision as the base to recover > In any case, before starting to write new pages, we increment the header > new revision number. Then we write all the pages, and when it's done, we > simply change the old revision in the header to the new revision. > > Reclaiming the pending pages is a matter of reading *all* the pages, and > for each page that is not linked to another one, we can safely move them > to the ist of free pages. As this is a expensive task, which requires a > lot of memory if we have lots of pages, we may also create a new file > containing only the latest valid revision. > > we should keep this as a background task > Thoughts ? > > -- > Regards, > Cordialement, > Emmanuel Lécharny > www.iktek.com > > -- > Kiran Ayyagari > <http://www.iktek.com>http://keydap.com