Re: [xwiki-devs] Transaction API (was Re: [proposal] Introduce a new persistence engine.)

Caleb James DeLisle Thu, 20 Jan 2011 07:57:45 -0800


On 01/20/2011 04:24 AM, Vincent Massol wrote:
> 
> On Jan 20, 2011, at 10:12 AM, Denis Gervalle wrote:
> 
>> On Thu, Jan 20, 2011 at 09:23, Vincent Massol <[email protected]> wrote:
>>
>>> Hi Caleb,
>>>
>>> On Jan 19, 2011, at 7:54 PM, Caleb James DeLisle wrote:
>>>
>>>>
>>>>
>>>> On 01/19/2011 12:13 PM, Vincent Massol wrote:
>>>>> Hi Caleb,
>>>>>
>>>>> I see you're excited, that's good! :)
>>>>>
>>>>> Some general comments:
>>>>> * This looks more like a design for a transaction module than for a
>>> persistence engine. I don't see anything related to persistence in your
>>> proposal below. Your proposal could work on stuff others than storage,
>>> right?
>>>> Yes, this proposal only covers the transaction sub-module of the
>>> persistence engine. The so far
>>>> un-proposed modules include xwiki-store-serialization,
>>> xwiki-store-filesystem and a legacy
>>>> attachment storage module: xwiki-store-filesystem-attachments.
>>>
>>> Ok I was misled by the title of your mail "Introduce a new persistence
>>> engine". I guess it could/should have been "Introduce a generic transaction
>>> API independent of the underlying storage implementation", right?


"Introduce a generic transaction API independent of the underlying storage 
implementation"
so boring compared to "The XWiki Persistence Engine" ;)
Indeed you are correct but this proposal is for the foundation of the engine.

>>>
>>> So if we focus purely on the transaction part here are some questions:
>>>
>>> * Why don't we use existing standards such as JTA/JTS? See
>>> http://en.wikipedia.org/wiki/Java_Transaction_API
The original answer was that I defined my requirements and wrote rather than 
looking for something.
Now looking at the JTA information, I still can't tell if it meets the 
requirements which
TransactionRunnable was designed to meet.

1. Can multiple jobs be chained together inside of a transaction?
  Can each job be made atomic? Can they be tested without mocking?
2. Are try/catch completely handled?
  I notice a lot of transact.begin() and transact.commit() in the code.
3. Is is possible to run a job outside of the proper type of transaction?
  IE: Can a "save document in JCR" job accidentally be run in a Hibernate 
transaction?
4. Can JTA run on any storage platform? Hibernate, JCR, raw JDBC, Filesystem, 
Memcached, etc.

>>> * If we were to define our own API, would we be able to implement it using
>>> JTA, ie is it a higher level transaction API than JTA?
TransactionRunnable is designed to run on top of absolutely anything.

>>> * Imagine we decide to use JCR as the storage implementation, how would
>>> this transaction API integrate with it knowing that JCR integrates with JTA?
>>> (http://www.day.com/specs/jcr/2.0/21_Transactions.html)
Yes, anything. Also I notice in the example code given at that page, they are 
using the same (anti)
pattern. begin, do stuff, commit. That code can't be incorporated into a larger 
transaction or else
it would be rolled back when begin() is called. This is the kind of code which 
begs to be made into
a TransactionRunnable.

>>>
>>>> Anything which requires transactions could use the TransactionRunnable
>>> although I'm at a loss to
>>>> think of anything other than storage which would require transactions.
>>>>
>>>>> * I was expecting to see some Store/Storage/Persistence interfaces with
>>> proposed APIs and explanation on how they could be implemented both with
>>> Hibernate and JCR for example. And the relationship with the proposed new
>>> Model defined.
>>>> I don't like to propose an interface until I have tried to implement it.
>>> Also I do not like to
>>>> propose an implementation until I have tried to use it. At this point
>>> it's far enough off that I
>>>> would rather wait than propose APIs blind.
>>>>
>>>> My experience with attachment store has shown that what we want is a set
>>> of functions which provide
>>>> TransactionRunnables to do various things:
>>>> aka:
>>>> TransactionRunnable<T> getDocumentSaveTransactionRunnable(XWikiDocument
>>> toSave);
>>>> In a hibernate implementation it would return
>>> TransactionRunnable<HibernateTransaction> and in a JCR
>>>> it would return TransactionRunnable<JCRTransaction>.
>>>>
>>>> We cannot have APIs like this until TransactionRunnable is agreed upon
>>> these will return instances
>>>> of it.
>>>>
>>>>> * I was also expecting a strategy defined to migrate users from the
>>> current implementation of the storage to the new one
>>>> IMO we should change the persistence engine and implement the same
>>> schema, once the persistence
>>>> engine is rebuilt, then we can consider modifying the schema. The schema
>>> is a specification, it may
>>>> not be perfect but it is something to comply with. It is important to me
>>> that a we prove that a new
>>>> persistence engine is able comply with existing specifications before we
>>> start designing new ones
>>>> around it.
>>>
>>> I don't quite agree with this since it assumes this schema is universal
>>> which it definitely is not. It's only a schema that works with RDBMS. It
>>> wouldn't work with an ODBMS or a file system implementation.
>>>
>>
>> The proposal of Caleb is not RDBMS related at all. He first use it for the
>> filesystem persistence of attachement.
> 
> I was responding to Caleb's point about keeping the schema...

>From my perspective we need to rewrite the engine so it's not a mess, get the 
>new engine working
well and with all the kinks worked out, then when each job is in a 
TransactionRunnable, we can
easily swap them out for new ones one at a time. To redesign the schema now 
means we need to
interact with the old XWikiHibernateStore code in order to do migration, that's 
not something I'd
like to do.

> 
>>> Also if there's one thing we shouldn't care about it's the schema. It's
>>> supposed to be opaque for the user and the user has to use the storage API
>>> to access it (and never go directly to the DB). In other words we should be
>>> clear that the schema is not part of the API since that would prevent any
>>> modification of it. I don't think we need this barrier.

"be clear that the schema is not part of the API" yes which is why I want to 
change one at a time.
If we design a new API and a new schema at the same time, we risk making an API 
with limitations and
not noticing it because we make a schema which hides those limitations.

>>>
>>> Conclusion:
>>>
>>> We're doing something really difficult here, which is defining a
>>> transaction API without defining the storage implementation we want to use
>>> and thus without defining how this transaction API would integrate with it.
>>> Right now the most common (if not the only one!) known and standard
>>> transaction API in the java world is JTA and most if not all known storage
>>> implementation support it. Thus if we really want a transaction API separate
>>> of the storage implementation I'd be in favor of looking at JTA and see
>>> whether it would fit our needs.
>>
>>
>>> WDYT?
>>>
>>
>> Proposal of Caleb is over anything, including JTA, it is absolutely not a
>> concurrent of JTA. What he proposed is to helps managing transactional
>> processing properly without having to write boring try/finally code and
>> ensuring we will commit or rollback and we do not mix incompatible
>> "sub"-transactions.
> 
> BTW there are other ways of doing this. In JEE servers it's done with 
> annotations (no "boring" try/finally as you say :)). It's the 
> container-managed part of JTA.

Boring but more importantly unsafe.

> 
> Normally transactions should be demarcated at the level of the services so 
> that several calls to the DB can be in the same Tx for example. At least 
> that's what I used to do in my J2EE days... 
> 
>> I would like to see it extended to support transmission
>> of datas between these "sub"-transaction. In regards to a DBMS, there will
>> be a single transactions and what is proposed by Caleb helps allowing it to
>> be customarily subdivide in smaller steps without having to take care of the
>> overall process of committing/rollbacking and collecting errors. You
>> concentrate on the work that should be done, and the rest is taken care by
>> your TransactionRunnables.
>> This is very interesting in my opinion, but this is not yet mature enough to
>> manage all situations.
> 
> Question: Since this looks generic, isn't there frameworks out there that do 
> this? If not, why?]

There is...   Now  ;)

Caleb

> 
> Thanks
> -Vincent
> 
>> Denis
>>
>>
>>>
>>> Thanks
>>> -Vincent
>>>
>>>>>
>>>>> I noticed some discussions between Denis and you on IRC about all this.
>>> Does you latest findings change the proposal below?
>>>> Everything proposed still holds true but I did add 2 new features.
>>>>
>>>> 1. There is a way for a TransactionRunnable<DBTransaction> to be passed
>>> an instance of DBTransaction
>>>> using a new method called getContext().
>>>>
>>>> 2. There is a new class which serves what I believe is an edge use case.
>>>> Suppose you want to define a TransactionRunnable (we will call it
>>> YourTransactionRunnable) which
>>>> must run inside of a DBTransaction but it must _also_ run after an
>>> instance of MyTransactionRunnable.
>>>> You can make MyTransactionRunnable a
>>> "ProvidingTransactionRunnable<DBTransaction, MyInterface>" and
>>>> then MyTransactionRunnable must run inside of a DBTransaction and we
>>> define YourTransactionRunnable
>>>> as a TransactionRunnable<MyInterface>. This also allows
>>> MyTransactionRunnable to share information
>>>> since YourTransactionRunnable.getContext() will provide an implementation
>>> of MyInterface. Of course
>>>> this feature must be used with care as it provides the tools to write
>>> horrible constructs but IMO it
>>>> is the type of feature which when you need it, there is no other way
>>> around.
>>>>
>>>> Caleb
>>>>
>>>>>
>>>>> Thanks
>>>>> -Vincent
>>>>>
>>>>> On Jan 10, 2011, at 2:15 PM, Caleb James DeLisle wrote:
>>>>>
>>>>>> Hi,
>>>>>> I have been working hard on filesystem attachments and I found that
>>> synchronizing manual filesystem
>>>>>> transactions with automatic database transactions was a difficult job
>>> and I needed a new tool to do
>>>>>> it. So I wrote what I am proposing to be the next XWiki Persistence
>>> Engine.
>>>>>>
>>>>>> I'll start off with the fun part of the proposal, I have been calling
>>> it xwiki-store so far but I am
>>>>>> so excited about the capabilities of this engine that I don't think it
>>> does it justice to name it
>>>>>> "store" after the place on the corner with milk and eggs. I am
>>> proposing it be named "XWiki
>>>>>> Persistence Engine", the directory will be renamed xwiki-persistence,
>>> the artifact name
>>>>>> xwiki-core-persistence, and the package name org.xwiki.persistence.
>>> Persistence is an attribute of
>>>>>> castles, mountains and redwood trees which I think is fitting for a
>>> conservatively designed storage
>>>>>> engine.
>>>>>>
>>>>>> Now a little explanation of what I'm so excited about:
>>>>>> The common and error prone way of saving things in the database is to
>>> open a transaction, enter a
>>>>>> try clause, do something then commit. If we catch an exception, then we
>>> rollback.
>>>>>> something like this:
>>>>>>
>>>>>> begin transaction;
>>>>>> try {
>>>>>> do something;
>>>>>> do something else;
>>>>>> commit;
>>>>>> } catch (Any exception which may occur) {
>>>>>> rollback;
>>>>>> }
>>>>>>
>>>>>> There are 3 things which can go wrong. 1 we forget to begin the
>>> transaction, 2 we forget to commit
>>>>>> and 3 we do not rollback properly. What makes things worse is often the
>>> database will "assume we
>>>>>> meant to..." and things will work ok most of the time which makes
>>> things much worse because bugs
>>>>>> will hide very well.
>>>>>>
>>>>>> My answer to this problem is a class called TransactionRunnable. It
>>> provides a set of 5 empty
>>>>>> methods to override: onPreRun(), onRun(), onCommit(), onRollback(), and
>>> onComplete(). the exact
>>>>>> circumstances under which each are called is documented in the javadoc
>>> comments here:
>>>>>>
>>> http://svn.xwiki.org/svnroot/xwiki/contrib/sandbox/xwiki-store/xwiki-store-transaction/src/main/java/org/xwiki/store/TransactionRunnable.java
>>>>>> I wrote TransactionRunnable twice, I wrote it, used it for attachments,
>>> then after having real
>>>>>> experience as a user, I wrote it again.
>>>>>>
>>>>>> To repeat our original example with TransactionRunnable you might say
>>> this:
>>>>>>
>>>>>> public class DoSomethingTransactionRunnable extends TransactionRunnable
>>>>>> {
>>>>>> public void onRun()
>>>>>> {
>>>>>>   do something;
>>>>>>   do something else;
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> Now we can use another TransactionRunnable which opens and closes the
>>> transaction for us.
>>>>>>
>>>>>> StartableTransactionRunnable transaction = new
>>> HibernateTransactionRunnable();
>>>>>> new DoSomethingTransactionRunnable().runIn(transaction);
>>>>>> transaction.start();
>>>>>>
>>>>>> the runIn() function allows us to run one TransactionRunnable inside of
>>> another. Supposing we wanted
>>>>>> to reuse "do something else" in other places, we can make it a separate
>>> TransactionRunnable and use
>>>>>> the runIn() function to hook it to our DoSomethingTransactionRunnable
>>> ie:
>>>>>>
>>>>>> public class DoSomethingTransactionRunnable extends TransactionRunnable
>>>>>> {
>>>>>> public DoSomethingTransactionRunnable()
>>>>>> {
>>>>>>   new DoSomethingElseTransactionRunnable().runIn(this);
>>>>>> }
>>>>>> ..
>>>>>>
>>>>>> The only limitations on running TransactionRunnables inside of one
>>> another are they cannot run more
>>>>>> than once and they cannot call themselves (this would be an infinite
>>> loop).
>>>>>>
>>>>>> This pattern makes each job which is done on storage easily isolated
>>> and, as I have so far
>>>>>> experienced, trivial to test. However, it still leaves the possibility
>>> that we might forget that
>>>>>> DoSomethingTransactionRunnable must be run inside of a hibernate
>>> transaction. I have devised a
>>>>>> solution for this too. Using generics, I offered a means for the author
>>> of a TransactionRunnable to
>>>>>> communicate to the compiler what other TransactionRunnable their
>>> runnable must be run in and without
>>>>>> explicit casting or defining of an intermediary runnable, this
>>> requirement cannot be violated or
>>>>>> else it wouldn't compile!
>>>>>>
>>>>>> Finally we have the issue of starting the runnable. Who's to say I
>>> won't be tired one day and just
>>>>>> write new DoSomethingTransactionRunnable().start() without opening a
>>> transaction first? If
>>>>>> DoSomethingTransactionRunnable cannot be safely run outside of a
>>> transaction all it needs to do is
>>>>>> not extend StartableTransactionRunnable and it won't have any start
>>> function.
>>>>>>
>>>>>> I have taken a multitude of very easy mistakes and given the author of
>>> a TransactionRunnable the
>>>>>> tools to make it very hard for the user to make them. Also, since a
>>> TransactionRunnable has no
>>>>>> reason to be very long (it can just branch off into another runnable)
>>> this will make testing and
>>>>>> code review easy in the place where it is most important. This part of
>>> the code is entirely generic
>>>>>> and has no dependence on hibernate or anything else.
>>>>>>
>>>>>> I propose we move:
>>>>>> contrib/sandbox/xwiki-store/xwiki-store-transaction/
>>>>>> to:
>>>>>> platform/core/xwiki-persistence/xwiki-persistence-transaction
>>>>>>
>>>>>> And I will propose moving each additional piece in the coming days.
>>>>>>
>>>>>> WDYT?
>>>>>>
>>>>>> Caleb
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
> 

_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs

Re: [xwiki-devs] Transaction API (was Re: [proposal] Introduce a new persistence engine.)

Reply via email to