A second thought about the Map kept in the txn context : actually, we don't need two maps (on for the PageIO, and one for the Page), we just need to keep a reference to the objects, whether they are instance of Node or Leaf, or BTreeHeader, or BtreeInfo, or even RecordManagerHeader. It's enough to have a map on <PageID, Object> where the PageID is an incremental number referencing the modified object. We will convert this pageId to an offset later on. All in all, this is a WAL...
Le 29/10/15 11:13, Emmanuel Lécharny a écrit : > Hi guys, > > here are a bit of my last night's insomnia thoughts (with the shower, > and the smallest room of my appartment, this is the only place and time > I have for such thoughts atm ;-) : > > - txn are now going to be explicit, ie you have to begin a txn and > commit it before and after any update, like in : > > // Inject some data > recordManager.beginTransaction(); > btree.insert( 1L, "1" ); > btree.insert( 4L, "4" ); > btree.insert( 2L, "2" ); > btree.insert( 3L, "3" ); > btree.insert( 5L, "5" ); > recordManager.commit(); > > - later on, we may include the txn begin/commit into the update > operation, on a btree basis. > - the direct consequence is that we don't write any thing on the disk > before we commit. This will have a huge impact on performances as every > single update operation involve many internal updates : BOB Btree, CPB > Btree, RMH. Typically, in the previous example, we will save 5 updates > of teh BOB and CPB, plus numerous updates on the RMH, not counting the 4 > updates of the btree root page and btree header. > - of course, this is not coming for free : if one forget to commit or > rollback the txn, then the pages will remain in memory... > - in order to manipulate those pages in memory, we will keep a track of > the update pages and btree headers in memory. We use Map for that : > Map<Long, PageIO> and Map<Long, Page> > - now, we have two kind of map : one that keep a track on PageIO and the > other one that keep a track on Page. Why is that ? Because the RMH and > the BHs are not holding any value, they are specific data structures. > Every other Page hold Keys/Values, where value sare data or pointer to > a child Page. > - we need a way to reference those Page and PageIO. For PageIO, this is > easy : we use the page offset. For plain Page, this is a different > story, because we may create new pages (actually, this *is* what we do > for every update, except that as we work in memory during a txn, we can > *modify* an existing page instead of copy it, until we need to write it > down on disk. Note that the very first time we access to a Page, either > from teh disk or from the cache, we have to copy it, no matter what). > - Those new pages don't have an offset, because we haven't written them > to disk yet. The reason is that writing such a page to the disk means we > have serialized the page - an expensive operation we want to avoid until > we have to do it, ie when we write the page to the disk -, and we have > fetched as many PageIO as necessary to write the page to the disk > (fetching a new PageIO will first look in the Free Page list or grab new > PageIO at the end of the file, expending the file). > - So we have to find a way to identify those pages by other means. One > idea is to use an incremental ID associated with the txn. Every Page > will have a Long ID. > - now, the pb is that we use offset in pages to reference child pages. > We must be able to distinguish an offset from an ID. We will use a > negative long for PageID. > - when we commit the txn, we will have to update all the references to > IDs to replace them with offsets. At this point, the easiest way is to > read all the Nodes we keep in the Map, and check all of their values to > see if we are refering to an ID instead of an Offset, and change that. > - that drives the order in which we write the pages : we have to start > from the leaves, and up to the root page > - another approach would be to keep a track of the modified values for > each Node (ie, value X has been change is now reffer to page Y, etc...). > This will be memory consuming, but CPU efficient. All in all, it's > another Map to handle : Map<PageID, Set<Modified Value position>>. This > approach is probably what we want to use... > - modifying a page instead of copying it requires we add new methods in > the Btree class. Not a huge change. > - a roolback is just a matter of deleting the txn context: straightforward ! > > At this point, I think we have a roadmap for txn support. I said nothing > about managing free pages here, it has already been discussed in another > thread, but I'll come later with a more explicit mail. > > I'm curently working in a branch, in which I have removed the InMemory > Btrees and the support for multiple values. The branch has been created, > but I haven't commit yet anything in it : I have too many broken things. > I'll commit asap (may be this week-end if I have time to work on my free > time - familly comes first !) > > Thanks ! > > > BOB : Btree-Of-Btrees, the Btree that hold the revisions in use > CPB : Copied-Pages-Btree, the btree that holds teh list of pages that > have been copied during an update > RMH : RecordManager Header, the data structure that maintains the > pointers to the BOB and CPB, both the current and previous, plus some > extra informations, like the FreeList page > BH : Btree Header, the structure that stors meta-information about a > B-tree (serializers, comparators, nb elements, pointer to the root page, > etc) >
