Re: Partly done : [ApacheDS] Transaction support

Lucas Theisen Mon, 26 Feb 2018 05:43:11 -0800

This is excellent news.  Great work on this, I can't wait to be able to use
an instance without worrying about corruption.  Thank you for your work on
this!


On Feb 26, 2018 5:27 AM, "Emmanuel Lecharny" <[email protected]> wrote:

> Hi guys,
>
> I'm done with the changes in the server : all the tests are now passing
> green.
>
> That means we have a JDBM based server with operation being atomic. That
> should solve the database corruption we have.
>
> In the process, I also changed the way the API handle requests/responses,
> modifying the operation Future, removing queues that were used expect for
> operations that could take more than one response (Bind, Search and
> ExtendedOperation). We now use a simple wait/notify solution that works
> well. The Cursor implementation has also been slightly modified to remove
> the monitor in it, as it was not sued. Last not least, the operation
> Future.isDone() method is now returning 'true' if we have got an operation
> response. This is important for asunc implementation as it's a not blocking
> way to know if we can go on and fetch the operation (if it's not abandoned,
> of course).
>
> Otherwise, I also have removed a useless feature in the JDBM Parttition :
> we were storing the number of elements in a B-tree beside the B-tree, which
> is totally spurious as teh B-tree already knows about this number : tehre
> is a btree.size() method. This save a few writes.
>
> Overall, as I said, the server is 20% faster, and this is due to saved
> writes and to changes in the way the  LDAP API handles values.
>
> I still have some cleanup to do, but I think we are on a verge of a
> release of teh API and the server, which will also leads to a release of
> Studio :-)
>
> On 2018/02/21 22:01:38, Emmanuel Lecharny <[email protected]> wrote:
> > Hi,
> >
> > a quick update on the on-going work (last week-end and last night).
> >
> > Most of the core-integ tests are passing now (except 3 errors and 4
> failures). FTR, there are 335 instances of OperationContext to check in the
> whole code, many of them using a readTransaction (149).
> >
> > I'm facing 2 issues atm :
> > - first, the Search operation must not close the transaction until the
> associated cursor is closed. This is not currently properly done. The best
> solution I foresee is to store the transaction in the Cursor, and to close
> it when the cursor is closed. That would imply we may face some issue if
> the cursor remains open for a long time.
> > - second, as we don't have a file per index anymore, the JdbmPartition
> initialization is now problematic : we are checking the presence of such
> files on disk to see if we have to re-create the missing indexes. IMO, this
> is a wrong thing to do : we should check the configuration and create the
> index accordingly if the index aren't present in the database. That also
> mean we should check in the database, something we currently don't do. This
> is something I have to fix, it's blocking some of the tests (and likely the
> server won't restart, too).
> >
> > Otherwise, I have checked all the operations, before the
> OperationManager and after it, the transaction seems to be properly
> initiated and committed or aborted.
> >
> > I'm not far from something that works...
> >
> > More later.
> >
> > On 2018/02/17 17:26:05, Emmanuel LÃ©charny <[email protected]> wrote:
> > > Hi,
> > >
> > > this is an update on the on-going work.
> > >
> > > I have started to add support of transaction in JDBM first, using the
> > > changes I have applied last week.
> > >
> > > Actually, JDBM supports transaction natively (to some extend), and teh
> > > big mistake we have done was to limit its use on a per-index/table
> > > level. Which means each index and the master table are updated within
> > > its own transaction, which is clearly not good when we have to make
> each
> > > LDAP operation atomic.
> > >
> > > So the first big change is that I use one single transaction manager
> > > across all the index and the master table : we commit an operation
> > > globally. I do expect that it will make it safer (ie, no concurrent
> > > updates of indexes, leading to some database corruption).
> > >
> > > A typical usage looks like (in OperationManager) :
> > >
> > > public void add( AddOperationContext addContext ) throws LdapException
> > > {
> > >     ...
> > >     // Find the working partition
> > >     Partition partition =
> > > directoryService.getPartitionNexus().getPartition( dn );
> > >     addContext.setPartition( partition );
> > >     ...
> > >
> > >     lockWrite();
> > >
> > >     // Start a Write transaction right away
> > >     PartitionTxn transaction = null;
> > >
> > >     try
> > >     {
> > >         transaction = partition.beginWriteTransaction();
> > >         addContext.setTransaction( transaction );
> > >
> > >         head.add( addContext );
> > >         transaction.commit();
> > >     }
> > >     catch ( LdapException | IOException e )
> > >     {
> > >         try
> > >             {
> > >             if ( transaction != null )
> > >             {
> > >                 transaction.abort();
> > >             }
> > >
> > >             throw new LdapOtherException( e.getMessage(), e );
> > >         }
> > >         catch ( IOException ioe )
> > >         {
> > >             throw new LdapOtherException( ioe.getMessage(), ioe );
> > >         }
> > >     }
> > >     finally
> > >     {
> > >         unlockWrite();
> > >     }
> > >
> > > It sounds a bit convoluted, but what we do is :
> > >
> > > - get the Partition we are working on at this level instead of waiting
> > > to be in the Parition itself. That makes sense because an update
> > > operation (even a move) can't be done across partitions. Note that we
> > > don't fetch the Partition in the Nexus, as we already have it
> > >
> > > - then we ask the selected partition to start a write transaction. What
> > > is interesting here is that not all the partition will create this
> > > transaction. Typically, the Schema partition won't - it makes no sense
> > > -, nor the config partition : they have in-memory indexes, and the
> > > master table is a LDIF file, it's either properly updated or not.
> > >
> > > - the OperationContext stores the transaction and the partition : the
> > > transaction may be reused by interceptors (typically, we may have to
> > > lookup some other entries, or updates entries while processing the
> > > on-going peration)
> > >
> > > - then we call the first interceptor, down to the nexus.
> > >
> > > - If all goes well, the transaction is committed, otherwise, it's
> aborted
> > >
> > >
> > > For JDBM, a commit looks like that :
> > >
> > >     public void commit() throws IOException
> > >     {
> > >         recordManager.commit();
> > >
> > >         // And flush the journal
> > >         BaseRecordManager baseRecordManager = null;
> > >
> > >         if ( recordManager instanceof CacheRecordManager )
> > >         {
> > >             baseRecordManager = ( ( BaseRecordManager ) ( (
> > > CacheRecordManager ) recordManager ).getRecordManager() );
> > >         }
> > >         else
> > >         {
> > >             baseRecordManager = ( ( BaseRecordManager ) recordManager
> );
> > >         }
> > >
> > >
> > >         if ( syncOnWrite )
> > >         {
> > >             baseRecordManager.getTransactionManager().
> synchronizeLog();
> > >         }
> > >     }
> > >
> > > - First the RecordManager is committed, which flush the updates into
> the
> > > Log file (first disk write). Up to this point, all the updates are done
> > > in memory.
> > >
> > > - then we ask the RecordManager to update the databse from the log.
> > >
> > > And that's it.
> > >
> > > Aborting the operaiton is way simpler :
> > >
> > >     public void abort() throws IOException
> > >     {
> > >         recordManager.rollback();
> > >     }
> > >
> > >
> > > I have removed all the sync() methods called all over the code, they
> are
> > > totally useless now.  Bottom line, we end up with having one single
> file
> > > storing the data plus a log file containing the on going updates. If
> the
> > > server crashes, I expect the server to catch up from the journal (still
> > > have to check that aspect).
> > >
> > > Extra benefit : it's *WAY* faster. The perf test I have shows that
> > > adding entries in the server using this code is 2 times faster than
> with
> > > the previous approach (ok, don't get too excited, I'm talking about 180
> > > add/s vs 98 add/s)
> > >
> > >
> > > There are a lot to do still. The Read transactions also need to be
> dealt
> > > with. A typicall read looks lik that :
> > >
> > >     ...
> > >     LookupOperationContext lookupContext = new LookupOperationContext(
> > >   adminSession, modifyContext.getDn(), "+", "*" );
> > >     lookupContext.setPartition( modifyContext.getPartition() );
> > >     lookupContext.setTransaction( modifyContext.getTransaction() );
> > >
> > >     entry = directoryService.getPartitionNexus().lookup(
> lookupContext );
> > >     ...
> > >
> > > here we reuse the modify operation's transaction.
> > >
> > > Or :
> > >
> > >     LookupOperationContext lookupContext = new LookupOperationContext(
> > > adminSession, lookupContext.setPartition( partition );
> > >
> > >     try ( PartitionTxn partitionTxn = partition.beginReadTransaction()
> )
> > >     {
> > >         lookupContext.setTransaction( partitionTxn );
> > >         adminGroup = nexus.lookup( lookupContext );
> > >     }
> > >     catch ( IOException ioe )
> > >     {
> > >         throw new LdapOtherException( ioe.getMessage(), ioe );
> > >     }
> > >
> > > Here, we create a new Read transaction in a try-with-resource call,
> this
> > > transaction will be automatically closed. Note we don't have to commit
> > > or abort it.
> > >
> > >
> > > So now, I'm fixing the tests, changing all the places in the code where
> > > I need to start a read transaction (around 80 places for the lookup
> > > operation, for instance, then there is search, hasEntry, etc...),
> > > properly encapsulating the update operations. That will take time...
> > >
> > > Once it will work for JDBM, I will have to get it to work for Mavibot.
> > >
> > > Still, this is encouraging.
> > >
> > > have a nice week-end.
> > >
> > > --
> > > Emmanuel Lecharny
> > >
> > > Symas.com
> > > directory.apache.org
> > >
> > >
> >
>

Re: Partly done : [ApacheDS] Transaction support

Reply via email to