Re: [xwiki-devs] [proposal] Introduce a new persistence engine.

Denis Gervalle Thu, 20 Jan 2011 12:30:16 -0800

On Thu, Jan 20, 2011 at 18:50, Caleb James DeLisle <[email protected]
> wrote:


>
>
> On 01/20/2011 12:33 PM, Denis Gervalle wrote:
> > On Thu, Jan 20, 2011 at 17:15, Caleb James DeLisle <
> [email protected]
> >> wrote:
> >
> >>
> >>
> >> On 01/20/2011 02:42 AM, Denis Gervalle wrote:
> >>> On Wed, Jan 19, 2011 at 19:54, Caleb James DeLisle <
> >> [email protected]
> >>>> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 01/19/2011 12:13 PM, Vincent Massol wrote:
> >>>>> Hi Caleb,
> >>>>>
> >>>>> I see you're excited, that's good! :)
> >>>>>
> >>>>> Some general comments:
> >>>>> * This looks more like a design for a transaction module than for a
> >>>> persistence engine. I don't see anything related to persistence in
> your
> >>>> proposal below. Your proposal could work on stuff others than storage,
> >>>> right?
> >>>> Yes, this proposal only covers the transaction sub-module of the
> >>>> persistence engine. The so far
> >>>> un-proposed modules include xwiki-store-serialization,
> >>>> xwiki-store-filesystem and a legacy
> >>>> attachment storage module: xwiki-store-filesystem-attachments.
> >>>>
> >>>> Anything which requires transactions could use the TransactionRunnable
> >>>> although I'm at a loss to
> >>>> think of anything other than storage which would require transactions.
> >>>>
> >>>>> * I was expecting to see some Store/Storage/Persistence interfaces
> with
> >>>> proposed APIs and explanation on how they could be implemented both
> with
> >>>> Hibernate and JCR for example. And the relationship with the proposed
> >> new
> >>>> Model defined.
> >>>> I don't like to propose an interface until I have tried to implement
> it.
> >>>> Also I do not like to
> >>>> propose an implementation until I have tried to use it. At this point
> >> it's
> >>>> far enough off that I
> >>>> would rather wait than propose APIs blind.
> >>>>
> >>>> My experience with attachment store has shown that what we want is a
> set
> >> of
> >>>> functions which provide
> >>>> TransactionRunnables to do various things:
> >>>> aka:
> >>>> TransactionRunnable<T>
> getDocumentSaveTransactionRunnable(XWikiDocument
> >>>> toSave);
> >>>> In a hibernate implementation it would return
> >>>> TransactionRunnable<HibernateTransaction> and in a JCR
> >>>> it would return TransactionRunnable<JCRTransaction>.
> >>>>
> >>>> We cannot have APIs like this until TransactionRunnable is agreed upon
> >>>> these will return instances
> >>>> of it.
> >>>>
> >>>>> * I was also expecting a strategy defined to migrate users from the
> >>>> current implementation of the storage to the new one
> >>>> IMO we should change the persistence engine and implement the same
> >> schema,
> >>>> once the persistence
> >>>> engine is rebuilt, then we can consider modifying the schema. The
> schema
> >> is
> >>>> a specification, it may
> >>>> not be perfect but it is something to comply with. It is important to
> me
> >>>> that a we prove that a new
> >>>> persistence engine is able comply with existing specifications before
> we
> >>>> start designing new ones
> >>>> around it.
> >>>>
> >>>>>
> >>>>> I noticed some discussions between Denis and you on IRC about all
> this.
> >>>> Does you latest findings change the proposal below?
> >>>> Everything proposed still holds true but I did add 2 new features.
> >>>>
> >>>> 1. There is a way for a TransactionRunnable<DBTransaction> to be
> passed
> >> an
> >>>> instance of DBTransaction
> >>>> using a new method called getContext().
> >>>>
> >>>> 2. There is a new class which serves what I believe is an edge use
> case.
> >>>> Suppose you want to define a TransactionRunnable (we will call it
> >>>> YourTransactionRunnable) which
> >>>> must run inside of a DBTransaction but it must _also_ run after an
> >> instance
> >>>> of MyTransactionRunnable.
> >>>> You can make MyTransactionRunnable a
> >>>> "ProvidingTransactionRunnable<DBTransaction, MyInterface>" and
> >>>> then MyTransactionRunnable must run inside of a DBTransaction and we
> >> define
> >>>> YourTransactionRunnable
> >>>> as a TransactionRunnable<MyInterface>. This also allows
> >>>> MyTransactionRunnable to share information
> >>>> since YourTransactionRunnable.getContext() will provide an
> >> implementation
> >>>> of MyInterface. Of course
> >>>> this feature must be used with care as it provides the tools to write
> >>>> horrible constructs but IMO it
> >>>> is the type of feature which when you need it, there is no other way
> >>>> around.
> >>>>
> >>>
> >>> I just need to add that we have not been able to apply TR and PTR in
> >>> particular to a simple store with a JDBC connected database. From our
> >>> experience, the PTR cause more issue than it solve and we have to find
> a
> >>> better way to convey datas between transactions of a given transaction
> >>> chain. Data dependencies between transaction could not be easily solved
> >> at
> >>> compile time since there is many combinatory situation in real life.
> >>>
> >>> My suggestion would be to provide access to previous transaction using
> a
> >>> hash of Interfaces exposed by previous transaction. Checking
> availability
> >> of
> >>> needed interfaces could be done by transactions à preRun time which
> could
> >>> avoid the need of uselessly running the whole chain, if there is a
> >>> dependency problem.
> >>
> >> What you can do is make DBTransaction (the context) extend Map and then
> put
> >> things in that map. I
> >> really don't like that path though since is means that TR3 uses things
> >> provided by TR1 and doesn't
> >> declare that need. It breaks atomicity by sharing data between
> >> transactions, TR3 can't just be
> >> plugged in somewhere else where TR1 is not used. It also breaks breaks
> >> compile time safety, plugging
> >> TR3 in somewhere else (or reordering them) will compile but fail to run.
> >> That pattern doesn't really afford you any of the benefits of TR other
> than
> >> the try, catch safety
> >> so, to me, it makes sense in that case for the code for TR1, TR2, and
> TR3
> >> to all be inside of the
> >> same TransactionRunnable.
> >>
> >>>
> >>> I also doubt that the way of multiple transactions are executed in a
> same
> >>> single one is correct or useful. Would really prefer to see all
> >> transactions
> >>> run at a given level before going down the chain.
> >>
> >> I don't see this as making sense. My goal is to make sure when something
> >> goes wrong it fails as
> >> early as possible so there is the least amount to rollback.
> >>
> >> SaveDocumentRunnable
> >>  |
> >>  +--SaveObjectRunnable
> >>  |   |
> >>  |   +--SavePropertyRunnable EXCEPTION
> >>  |   +--SavePropertyRunnable <-- Doesn't run.
> >>  |
> >>  +--SaveObjectRunnable <-- doesn't run
> >>  |
> >>  etc. Doesn't run.
> >>
> >> I'm interested to know what is the rationale for wanting them to run one
> >> level then the next?
> >> IE: save document, then all objects then all properties.
> >>
> >>> This would provide the way
> >>> to bundle transaction (and even transaction chain) together by running
> >> them
> >>> in a single transaction. This would also helps not mixing dependencies
> >> since
> >>> currently there is an implicit availability of earlier sibling chain of
> >>> transactions that does not fit the idea of a context in evolution
> checked
> >> at
> >>> compile time and could provide unchecked implicit dependencies.
> >>
> >> I still don't quite follow, can you give a real world example?
> >>
> >
> > Well, let me base my real world example on your sample above.
> > Imagine that documents are linked to their objects using their primary
> key,
> > PK which is generated by the underlying database when you records the
> > document for the first time, and that we needs creating a new documents
> with
> > Objects in a single transaction. Your SaveObjectRunnable will need that
> PK
> > as a FK, how does it get it? Or said in another way, it needs a context
> that
> > provide the PK of the document for which it is saving the objects.
>
> If the object knows what document it belongs to couldn't you use:
> object.getDocument().getKey() ?
> I think allowing the document to know it's primary key is less ugly than
> passing everything around
> in a map.
>

For sure, this is a contrive example which I try to use to exposed what I
means. It is not perfect, just too simple in fact.


> Alternatively you could use:
> UPDATE object SET object.foreignKey=(SELECT doc.id where doc.name =
> <object.getDoc().getName()>) or
> the like. It's harder on the DB but it will make each TR modular and
> atomic.
>

But there is no assurance it will works since you have never say that the
document should be created to save its objects.


>
> > Well, lets insert a CreateDocumentRunnable before it, that may have some
> > lower TR to complete, like saving document and retrieving the generated
> PK.
> > How does this CreateDocumentRunnable is maybe a sibling
> > of SaveObjectRunnable provide the PK to it ?
>
> I don't understand why the SaveObjectRunnable would be a sibling and of the
> SaveDocumentRunnable. If
> the Object is part of the document then shouldn't it be a child?
>

No, SaveObjectRunnable will not be sibling of the SaveDocumentRunnable, it
will be a child of it, for sure, but it could be a sibling of let say
CreateDocumentRunnable which have itself some children for doing its own
job.


> > How do you check
> > that SaveObjectRunnable requirement to have that PK is fulfilled ? Have
> you
> > an idea of the structure needed for this example ?
>
> IMO siblings should never depend on one another, if the TR needs something
> from the last TR then it
> really needs to be a child of it. this also has the benefit that you can
> pass around a TR for saving
> a document and it will have all TRs for saving each object and each
> property inside of it.
>

To say it an other way, saving a document, could be divided in several
pieces, and there is no assurance that you could always convert that in
parent child relationships, especially when a third transaction depends on
the result of 2 previous one that are not dependent from each other but are
bundled together in a third one. That would be a perfect world if you could
really expect that all dependencies are one to many, and never many to one.

   A
 /    \
B    C
 \    /
   D

A is composed of B and C, and D depends on A.

Thank you, I really appreciate your review.
>

Thanks ;), I really hope it could make it better !

Denis


>
> Caleb
>
> >
> > Denis
> >
> >
> >>>
> >>> To conclude, this is a very interesting proposal, that needs more
> >>> refinements before being used wildly in all situation requiring
> >>> transactional processing.
> >>
> >>
> >>> Nice idea and good job Caleb !
> >>
> >> Thanks :)
> >>
> >> Caleb
> >>
> >>>
> >>> Denis
> >>>
> >>>
> >>>
> >>>>
> >>>> Caleb
> >>>>
> >>>>>
> >>>>> Thanks
> >>>>> -Vincent
> >>>>>
> >>>>> On Jan 10, 2011, at 2:15 PM, Caleb James DeLisle wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> I have been working hard on filesystem attachments and I found that
> >>>> synchronizing manual filesystem
> >>>>>> transactions with automatic database transactions was a difficult
> job
> >>>> and I needed a new tool to do
> >>>>>> it. So I wrote what I am proposing to be the next XWiki Persistence
> >>>> Engine.
> >>>>>>
> >>>>>> I'll start off with the fun part of the proposal, I have been
> calling
> >> it
> >>>> xwiki-store so far but I am
> >>>>>> so excited about the capabilities of this engine that I don't think
> it
> >>>> does it justice to name it
> >>>>>> "store" after the place on the corner with milk and eggs. I am
> >> proposing
> >>>> it be named "XWiki
> >>>>>> Persistence Engine", the directory will be renamed
> xwiki-persistence,
> >>>> the artifact name
> >>>>>> xwiki-core-persistence, and the package name org.xwiki.persistence.
> >>>> Persistence is an attribute of
> >>>>>> castles, mountains and redwood trees which I think is fitting for a
> >>>> conservatively designed storage
> >>>>>> engine.
> >>>>>>
> >>>>>> Now a little explanation of what I'm so excited about:
> >>>>>> The common and error prone way of saving things in the database is
> to
> >>>> open a transaction, enter a
> >>>>>> try clause, do something then commit. If we catch an exception, then
> >> we
> >>>> rollback.
> >>>>>> something like this:
> >>>>>>
> >>>>>> begin transaction;
> >>>>>> try {
> >>>>>>   do something;
> >>>>>>   do something else;
> >>>>>>   commit;
> >>>>>> } catch (Any exception which may occur) {
> >>>>>>   rollback;
> >>>>>> }
> >>>>>>
> >>>>>> There are 3 things which can go wrong. 1 we forget to begin the
> >>>> transaction, 2 we forget to commit
> >>>>>> and 3 we do not rollback properly. What makes things worse is often
> >> the
> >>>> database will "assume we
> >>>>>> meant to..." and things will work ok most of the time which makes
> >> things
> >>>> much worse because bugs
> >>>>>> will hide very well.
> >>>>>>
> >>>>>> My answer to this problem is a class called TransactionRunnable. It
> >>>> provides a set of 5 empty
> >>>>>> methods to override: onPreRun(), onRun(), onCommit(), onRollback(),
> >> and
> >>>> onComplete(). the exact
> >>>>>> circumstances under which each are called is documented in the
> javadoc
> >>>> comments here:
> >>>>>>
> >>>>
> >>
> http://svn.xwiki.org/svnroot/xwiki/contrib/sandbox/xwiki-store/xwiki-store-transaction/src/main/java/org/xwiki/store/TransactionRunnable.java
> >>>>>> I wrote TransactionRunnable twice, I wrote it, used it for
> >> attachments,
> >>>> then after having real
> >>>>>> experience as a user, I wrote it again.
> >>>>>>
> >>>>>> To repeat our original example with TransactionRunnable you might
> say
> >>>> this:
> >>>>>>
> >>>>>> public class DoSomethingTransactionRunnable extends
> >> TransactionRunnable
> >>>>>> {
> >>>>>>   public void onRun()
> >>>>>>   {
> >>>>>>     do something;
> >>>>>>     do something else;
> >>>>>>   }
> >>>>>> }
> >>>>>>
> >>>>>> Now we can use another TransactionRunnable which opens and closes
> the
> >>>> transaction for us.
> >>>>>>
> >>>>>> StartableTransactionRunnable transaction = new
> >>>> HibernateTransactionRunnable();
> >>>>>> new DoSomethingTransactionRunnable().runIn(transaction);
> >>>>>> transaction.start();
> >>>>>>
> >>>>>> the runIn() function allows us to run one TransactionRunnable inside
> >> of
> >>>> another. Supposing we wanted
> >>>>>> to reuse "do something else" in other places, we can make it a
> >> separate
> >>>> TransactionRunnable and use
> >>>>>> the runIn() function to hook it to our
> DoSomethingTransactionRunnable
> >>>> ie:
> >>>>>>
> >>>>>> public class DoSomethingTransactionRunnable extends
> >> TransactionRunnable
> >>>>>> {
> >>>>>>   public DoSomethingTransactionRunnable()
> >>>>>>   {
> >>>>>>     new DoSomethingElseTransactionRunnable().runIn(this);
> >>>>>>   }
> >>>>>> ..
> >>>>>>
> >>>>>> The only limitations on running TransactionRunnables inside of one
> >>>> another are they cannot run more
> >>>>>> than once and they cannot call themselves (this would be an infinite
> >>>> loop).
> >>>>>>
> >>>>>> This pattern makes each job which is done on storage easily isolated
> >>>> and, as I have so far
> >>>>>> experienced, trivial to test. However, it still leaves the
> possibility
> >>>> that we might forget that
> >>>>>> DoSomethingTransactionRunnable must be run inside of a hibernate
> >>>> transaction. I have devised a
> >>>>>> solution for this too. Using generics, I offered a means for the
> >> author
> >>>> of a TransactionRunnable to
> >>>>>> communicate to the compiler what other TransactionRunnable their
> >>>> runnable must be run in and without
> >>>>>> explicit casting or defining of an intermediary runnable, this
> >>>> requirement cannot be violated or
> >>>>>> else it wouldn't compile!
> >>>>>>
> >>>>>> Finally we have the issue of starting the runnable. Who's to say I
> >> won't
> >>>> be tired one day and just
> >>>>>> write new DoSomethingTransactionRunnable().start() without opening a
> >>>> transaction first? If
> >>>>>> DoSomethingTransactionRunnable cannot be safely run outside of a
> >>>> transaction all it needs to do is
> >>>>>> not extend StartableTransactionRunnable and it won't have any start
> >>>> function.
> >>>>>>
> >>>>>> I have taken a multitude of very easy mistakes and given the author
> of
> >> a
> >>>> TransactionRunnable the
> >>>>>> tools to make it very hard for the user to make them. Also, since a
> >>>> TransactionRunnable has no
> >>>>>> reason to be very long (it can just branch off into another
> runnable)
> >>>> this will make testing and
> >>>>>> code review easy in the place where it is most important. This part
> of
> >>>> the code is entirely generic
> >>>>>> and has no dependence on hibernate or anything else.
> >>>>>>
> >>>>>> I propose we move:
> >>>>>> contrib/sandbox/xwiki-store/xwiki-store-transaction/
> >>>>>> to:
> >>>>>> platform/core/xwiki-persistence/xwiki-persistence-transaction
> >>>>>>
> >>>>>> And I will propose moving each additional piece in the coming days.
> >>>>>>
> >>>>>> WDYT?
> >>>>>>
> >>>>>> Caleb
> >>>>>
> >>>>> _______________________________________________
> >>>>> devs mailing list
> >>>>> [email protected]
> >>>>> http://lists.xwiki.org/mailman/listinfo/devs
> >>>>>
> >>>>
> >>>> _______________________________________________
> >>>> devs mailing list
> >>>> [email protected]
> >>>> http://lists.xwiki.org/mailman/listinfo/devs
> >>>>
> >>>
> >>>
> >>>
> >>
> >> _______________________________________________
> >> devs mailing list
> >> [email protected]
> >> http://lists.xwiki.org/mailman/listinfo/devs
> >>
> >
> >
> >
>
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs
>



-- 
Denis Gervalle
SOFTEC sa - CEO
eGuilde sarl - CTO
_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs

Re: [xwiki-devs] [proposal] Introduce a new persistence engine.

Reply via email to