Le 30/10/15 04:51, Kiran Ayyagari a écrit : > On Thu, Oct 29, 2015 at 6:13 PM, Emmanuel Lécharny <[email protected]> > wrote: > >> Hi guys, >> >> here are a bit of my last night's insomnia thoughts (with the shower, >> and the smallest room of my appartment, this is the only place and time >> I have for such thoughts atm ;-) : >> >> - txn are now going to be explicit, ie you have to begin a txn and >> commit it before and after any update, like in : >> >> // Inject some data >> recordManager.beginTransaction(); >> btree.insert( 1L, "1" ); >> btree.insert( 4L, "4" ); >> btree.insert( 2L, "2" ); >> btree.insert( 3L, "3" ); >> btree.insert( 5L, "5" ); >> recordManager.commit(); >> > if a user forgets to call commit() and keeps inserting we will run out of > memory at some point > so wondering what is the better way to prevent that from happening without > maintaining a > log (a WAL or something similar)
I don't really see how to do that, sadly... In some ways, we are really in a similar situatio as when you were managing memory in C : if you forget to call free(), you have a leak... > >> - later on, we may include the txn begin/commit into the update >> operation, on a btree basis. >> - the direct consequence is that we don't write any thing on the disk >> before we commit. This will have a huge impact on performances as every >> single update operation involve many internal updates : BOB Btree, CPB >> Btree, RMH. Typically, in the previous example, we will save 5 updates >> of teh BOB and CPB, plus numerous updates on the RMH, not counting the 4 >> updates of the btree root page and btree header. >> - of course, this is not coming for free : if one forget to commit or >> rollback the txn, then the pages will remain in memory... >> - in order to manipulate those pages in memory, we will keep a track of >> the update pages and btree headers in memory. We use Map for that : >> Map<Long, PageIO> and Map<Long, Page> >> - now, we have two kind of map : one that keep a track on PageIO and the >> other one that keep a track on Page. Why is that ? Because the RMH and >> the BHs are not holding any value, they are specific data structures. >> Every other Page hold Keys/Values, where value sare data or pointer to >> a child Page. >> - we need a way to reference those Page and PageIO. For PageIO, this is >> easy : we use the page offset. For plain Page, this is a different >> story, because we may create new pages (actually, this *is* what we do >> for every update, except that as we work in memory during a txn, we can >> *modify* an existing page instead of copy it, until we need to write it >> down on disk. Note that the very first time we access to a Page, either >> from teh disk or from the cache, we have to copy it, no matter what). >> - Those new pages don't have an offset, because we haven't written them >> to disk yet. The reason is that writing such a page to the disk means we >> have serialized the page - an expensive operation we want to avoid until >> we have to do it, ie when we write the page to the disk -, and we have >> fetched as many PageIO as necessary to write the page to the disk >> (fetching a new PageIO will first look in the Free Page list or grab new >> PageIO at the end of the file, expending the file). >> - So we have to find a way to identify those pages by other means. One >> idea is to use an incremental ID associated with the txn. Every Page >> will have a Long ID. >> - now, the pb is that we use offset in pages to reference child pages. >> We must be able to distinguish an offset from an ID. We will use a >> negative long for PageID. >> - when we commit the txn, we will have to update all the references to >> IDs to replace them with offsets. At this point, the easiest way is to >> read all the Nodes we keep in the Map, and check all of their values to >> see if we are refering to an ID instead of an Offset, and change that. >> - that drives the order in which we write the pages : we have to start >> from the leaves, and up to the root page >> - another approach would be to keep a track of the modified values for >> each Node (ie, value X has been change is now reffer to page Y, etc...). >> This will be memory consuming, but CPU efficient. All in all, it's >> another Map to handle : Map<PageID, Set<Modified Value position>>. This >> approach is probably what we want to use... >> - modifying a page instead of copying it requires we add new methods in >> the Btree class. Not a huge change. >> - a roolback is just a matter of deleting the txn context: straightforward >> ! >> >> At this point, I think we have a roadmap for txn support. I said nothing >> about managing free pages here, it has already been discussed in another >> thread, but I'll come later with a more explicit mail. >> >> I'm curently working in a branch, in which I have removed the InMemory >> Btrees and the support for multiple values. The branch has been created, >> but I haven't commit yet anything in it : I have too many broken things. >> I'll commit asap (may be this week-end if I have time to work on my free >> time - familly comes first !) >> > please commit it even if it is broken, I might be able to understand the > above ideas > if I read it again after looking at code > > thanks Emmanuel > >> Thanks ! >> >> BOB : Btree-Of-Btrees, the Btree that hold the revisions in use >> CPB : Copied-Pages-Btree, the btree that holds teh list of pages that >> have been copied during an update >> RMH : RecordManager Header, the data structure that maintains the >> pointers to the BOB and CPB, both the current and previous, plus some >> extra informations, like the FreeList page >> BH : Btree Header, the structure that stors meta-information about a >> B-tree (serializers, comparators, nb elements, pointer to the root page, >> etc) >> >> >
