Hi, after a week of work, I was able to add the cache to Mavibot. It was not easy, because the cache should only be used to store 'real' pages, not pages that are being created during a transaction.
When we insert or update a <key/value>, we update some nodes and leaves. During a transaction, we create new pages, which should not appear in the cache, as they don't really exist until we commit them. The problem is that we still have to reference those new pages from the Nodes that point to them, and we can't use an offset for that, as those pages have not yet been flushed to disk.

The idea was to use the page ID, which is unique, as the in-memory reference. When we commit the txn, we replace those IDs with the offsets of the referenced pages. This is possible because the referenced pages have already been flushed to disk by then. The only drawback is that we have to check every child of each Node we flush, to replace the ID with the page offset. We can live with that.

Last thing: in order not to confuse an offset with an ID, we store IDs as negative values in a Node's children. Offsets are always > 0, so a negative value can only be an ID. That limits us to Long.MAX_VALUE possible pages and IDs, but that's OK: it's 9 223 372 036 854 775 807 pages (about 9.2 quintillion). Even if we created 1 billion pages per second, it would take the system 292 years to exhaust the ID counter.

I still have one remaining issue: the ID counter *must* be stored within the recordManagerHeader. That will be done tomorrow. Once that is done, txn support will be complete.

That means we will have a working Mavibot, except that the system won't reclaim unused pages. This is the next step: get rid of unused pages, i.e. revive the Reclaimer Kiran has written. I'll work on that next week.

--
Emmanuel Lecharny
Symas.com
directory.apache.org
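
For illustration, here is a minimal Java sketch of the negative-ID scheme described in this message. It is not the actual Mavibot code: the class name, the method names, the use of a plain long[] for a Node's children and the Map from page ID to flushed offset are all assumptions made for the example. It also assumes page IDs start at 1, so that a negated ID is always strictly negative.

```java
import java.util.Map;

// Hypothetical helper, not part of Mavibot: a child reference is a long.
//   ref > 0 : offset of a page already on disk
//   ref < 0 : negated ID of a page created in the current transaction
final class ChildRefSketch {

    private ChildRefSketch() {
    }

    // Encode the ID of a not-yet-flushed page as a child reference.
    static long fromPageId(long pageId) {
        if (pageId <= 0) {
            // Assumption: IDs start at 1, so -pageId can never look like an offset.
            throw new IllegalArgumentException("page ID must be > 0: " + pageId);
        }
        return -pageId;
    }

    // True when the reference holds a page ID rather than a disk offset.
    static boolean isPageId(long ref) {
        return ref < 0;
    }

    // At commit time: walk every child of a Node about to be flushed and
    // replace each ID reference with the offset of the page it designates.
    // flushedOffsets maps page ID -> offset for pages already written to disk.
    static void resolveChildren(long[] children, Map<Long, Long> flushedOffsets) {
        for (int i = 0; i < children.length; i++) {
            if (isPageId(children[i])) {
                long pageId = -children[i];
                Long offset = flushedOffsets.get(pageId);
                if (offset == null) {
                    throw new IllegalStateException("page " + pageId + " has not been flushed");
                }
                children[i] = offset;
            }
        }
    }
}
```

The full scan in resolveChildren is the per-Node cost mentioned above: every child is inspected, but only the negative entries are rewritten.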
