Hi guys, a quick heads-up about this ongoing effort.

First of all, I added the Transaction extended operation to the LDAP API, but this is somewhat orthogonal. We don't really need it in the server at the moment, but we will most certainly leverage it later for some interesting features (see below).

At the moment, the idea is to add cross-B-tree transactions to partitions (at least the JDBM and Mavibot partitions). This is critical because it will fix the corruption problem we have.

The idea is to start a transaction in the OperationManager, either read or write depending on the operation. We can have many read operations going on, but only one write operation (for the JDBM partition, we will have some more constraints). Transactions have to be started by partitions, as the upper layer (ie the operation manager) has no way to know how the lower level (ie the partitions) deals with transactions. This is possible because we can determine which partition we are addressing using the operation's DN. The first thing to do is to move this part from the partitions to the operation manager.

Then we have to propagate the txn down to the nexus. The easiest way is to pass the txn in the OperationContext instances.

Once we have gone through the interceptors down to the Nexus partition, we have to apply the operation to the specific partition, and this is done by the AbstractBtreePartition, mostly. Suffice it to say that each B-tree update needs to know about the txn, so we have to modify the basic partition operations to take an extra parameter: the txn.

And this is where it starts to be hard... Because that implies we also have to extend the following interfaces:

- Table
- Index
- Cursor

so that they also take this txn as a parameter. It's a rather gigantic change in the interfaces... Note that we don't want to change the LDAP API cursor interface in the process.

Then, we have to change the way the JDBM partition behaves. Currently, we create a RecordManager for each B-tree (ie, each Table, which may have two B-trees: the forward and reverse indexes).
This is not good, because we can't apply a global operation across many RecordManagers using a txn: the JDBM TransactionManager is applied to a single RecordManager. So the next big change is to have JDBM work with one single file. That impacts the initialisation of the JDBM indexes and partition.

For Mavibot, it's simpler, because we already use one single file anyway.

ATM, I have made the required changes in the Partition/Index/Cursor to pass this extra txn parameter, and I'm dealing with the single file definition. The LDIF partition is working just fine with all those changes, and many of the JDBM tests are also running green.

The last thing to do is for JDBM to make sure we don't have a collision between reads and writes (ie we can't read while we write). That will slow down the server when we do a write, but JDBM is not able to deal with concurrent reads and writes anyway. Mavibot does not have such a limitation, so I expect that Mavibot will become the de facto backend soon after those changes.

Last but not least, being able to leverage the Transaction extended operation will allow fast loading of the server, especially during the initial injection of data: we can do that in memory, and flush the result globally.

I expect all those changes to take a couple of weeks (working evenings and weekends). I'll keep you posted anyway.

--
Emmanuel Lecharny
Symas.com
directory.apache.org
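
To make the txn propagation described above a bit more concrete, here is a minimal Java sketch. It is not the actual ApacheDS code: every name in it (Txn, SketchTable, SketchPartition, SketchOperationContext, TxnSketch) is hypothetical, the tables are plain in-memory maps standing in for B-trees, and the locking is simply a ReentrantReadWriteLock (many readers or one writer, and no reads while a write is in progress). It only illustrates the shape of the change: the txn is started on the partition, carried by the operation context, and handed to every table operation as an extra parameter, so that an update touching two tables is applied as a whole or not at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical transaction handle (an assumption, not the ApacheDS API). */
final class Txn implements AutoCloseable {
    private final Lock lock;
    private final boolean write;
    private final List<Runnable> pending = new ArrayList<>();
    private boolean closed;

    Txn(Lock lock, boolean write) {
        this.lock = lock;
        this.write = write;
    }

    /** Buffers a change; nothing touches the tables until commit(). */
    void enqueue(Runnable change) {
        if (!write) {
            throw new IllegalStateException("read-only txn");
        }
        pending.add(change);
    }

    /** Applies all buffered changes, across every table, in one go. */
    void commit() {
        pending.forEach(Runnable::run);
        pending.clear();
    }

    /** Drops uncommitted changes and releases the lock. */
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            pending.clear();
            lock.unlock();
        }
    }
}

/** Hypothetical table: each operation takes the txn as an extra parameter. */
final class SketchTable {
    private final Map<String, String> data = new TreeMap<>();

    void put(Txn txn, String key, String value) {
        txn.enqueue(() -> data.put(key, value));
    }

    /** Reads committed data only (pending changes of txn are not visible). */
    String get(Txn txn, String key) {
        return data.get(key);
    }
}

/** Hypothetical partition: it starts the txn, since it owns the storage. */
final class SketchPartition {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final SketchTable master = new SketchTable();
    final SketchTable rdnIndex = new SketchTable();

    Txn beginRead() {
        lock.readLock().lock();
        return new Txn(lock.readLock(), false);
    }

    Txn beginWrite() {
        lock.writeLock().lock();
        return new Txn(lock.writeLock(), true);
    }
}

/** Hypothetical operation context carrying the txn down the chain. */
final class SketchOperationContext {
    final String dn;
    final Txn txn;

    SketchOperationContext(String dn, Txn txn) {
        this.dn = dn;
        this.txn = txn;
    }
}

class TxnSketch {
    static String demo() {
        SketchPartition partition = new SketchPartition();
        try (Txn txn = partition.beginWrite()) {
            SketchOperationContext ctx = new SketchOperationContext("ou=users", txn);
            partition.master.put(ctx.txn, ctx.dn, "entry");
            partition.rdnIndex.put(ctx.txn, "users", ctx.dn);
            txn.commit();
        }
        try (Txn txn = partition.beginRead()) {
            return partition.master.get(txn, "ou=users")
                + "/" + partition.rdnIndex.get(txn, "users");
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "entry/ou=users"
    }
}
```

In this sketch, a write txn that is closed without commit() leaves both tables untouched, which is the cross-table behaviour we are after. A real backend would of course have to make the commit itself durable and atomic on disk, which this toy version does not attempt.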
