On Fri, Jan 20, 2012 at 4:48 PM, Emmanuel Lecharny <[email protected]> wrote: > Just FYI, I will continue to play around the BTree I've pushed in my branch. > It's funny, and it helps me to understand the concepts without having to go > deep into books and existing algorithm. Call it an exercise, this is exactly > what it is. > > As far as I can tell, it also seems that JDBM has a few sub-optimal methods > : to insert N nodes into a BTree, JDBM do 10% more comparison than I do in > my implementation. We can probably shave those extra 10% from JDBM code > later (I tried, but it had some bad side effect when done brutally). > > Otherwise, I do think that if your modified JDBM already implements the MVCC > Btree, then we should use it in trunk. > > I'm also wondering if there is any added GC in the modified JDBM code ? Just > because if we don't have one, the JDBM files will grow very quickly.
there is GC in the added code. > > > > On 1/20/12 10:07 PM, Selcuk AYA wrote: >> >> On Fri, Jan 20, 2012 at 9:47 AM, Emmanuel Lecharny<[email protected]> >> wrote: >>> >>> Hi guys, >>> >>> I was a bit silent last week and this week, let me update you about what >>> I >>> was working on. >>> >>> First of all, I have had to deal with some familly issues, which ate half >>> of >>> my time. >>> >>> Regarding the Txn branch I was working on until last week, I stopped >>> because >>> I was not able to fix the code without a serious help from Selcuk. As he >>> is >>> busy, I preferred to wait for him to be available again, instead of >>> bullying >>> into the code and break it seriously. I believe that there are some >>> improvement since the moment I started to work on the branch, but it's >>> not >>> working fully yet. >>> >>> So I switched to something we wanted to do a long time ago : designing a >>> new >>> version of JDBM. JDBM is a BTree implementation, with locks to protect >>> concurrent access. The idea was to implement a MVCC solution on top of a >>> BTree : >>> - each search can be done concurrently with any other operation, because >>> it >>> asks for a specific existing revision from the btree >>> - each modification is done on a new revision >>> - two modifications can't be done at the same time (so modifications are >>> queued and executed one after the other) >> >> >> Exactly what you are saying here is already implemented in JDBM thanks >> to the latest changes. How is what you are implementing is different? >> >>> The consequence is that searches will be very fast. It comes to a price >>> though : we keep a track on every revision, until it's not used anymore. >>> This is done by copying every modified pages when applying some >>> modification. >>> >>> As of today, the addition operation and the find operation is working >>> just >>> fine. I conducted some benchmark on additions, and it seems that the >>> system >>> is pretty decent. >>> >>> A *lot* remains to be done : >>> - deletion must be implemented >>> - browsing the tree is not yet implemented >>> - it's all in memory atm >>> - we must add some semaphore for concurrent modifications >>> - a GC must be added to discard unused pages >>> >>> But most of all, as it's a in-memory btree atm, I must add the disk >>> layer. >>> It will be based on Memory Mapped files. >>> >>> Once those preliminary works will be done, the idea is to use this >>> implementation to replace JDBM. That would make the server consistent, >>> and >>> we may then use it without the in-memory txn layer. >>> >>> Not to say that this txn layer is useless; using a MVCC btree based >>> backend >>> is *not* enough : we have no way to guarantee the atomicity of move >>> operation across partitions. >> >> Txn layer enables us to group multiple atomic atomic changes and >> consistent search into a single atomic unit. For example, when you >> change an entry, change to the entry and indices happen as an atomic >> unit. Do you have code to provide this at your new implementation? >> >>> This work has been done in my sandbox, where you can follow the work in >>> progress : >>> http://svn.apache.org/viewvc/directory/sandbox/elecharny/shared-mvbt >>> >>> At the same time, thanks to Pierre-Arnaud, a first milstone of Studio 2.0 >>> has been released, and it exposed some nasty bugs in the LDAP API. Which >>> is >>> actually a good thing : we can fix them ! >>> >>> So keep tuned, a lot of new things are coming soon ! >>> >>> -- >>> Regards, >>> Cordialement, >>> Emmanuel Lécharny >>> www.iktek.com >>> > > > -- > Regards, > Cordialement, > Emmanuel Lécharny > www.iktek.com >
