Re: Java Client/Adbera (was: Re: CMIS Client API)

Bogdan Stefanescu Fri, 05 Jun 2009 02:11:28 -0700


Hi all,

I will add more about the reasons why the existing AtomPub client is not using Abdera.

As Florent said, the existing client was written in a hurry for a client and it was not yet aligned with the latest Chemistry API.
The objective was to build a very responsive rich client (based on the Eclipse RCP framework) to be used by people writing news to view/edit/publish stories.
Here are a few requirements on the application (to give an idea of the application constraints, e.g. performance / memory / responsiveness constraints):

- People should be able to fetch raw news text (by using the atom client), create / edit news and publish it as fast as possible (several seconds to create and publish a news item)
- The application should display many views on remote feeds (loaded using the atom client)
- Feed views must be refreshed at a 2-second interval.

Initially, I started using Abdera, but for several reasons (that I explain below) I decided that it was not appropriate for the type of application I wanted to build.
So, let's see how we can use Abdera to build a Chemistry object model implementation for an atom client. We have 2 choices:

1. Either you wrap Abdera objects in your own Chemistry objects

2. Or you use Abdera to parse the feed and build Chemistry objects that are totally detached from the Abdera objects.
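
To make the two options concrete, here is a minimal sketch in plain Java. It is not Chemistry or Abdera code: `ContentObject`, `WrappedObject` and `DetachedObject` are hypothetical names, and the JDK's `org.w3c.dom.Element` stands in for an Abdera element so that the example runs without extra JARs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

final class TwoApproaches {
    // Hypothetical stand-in for a Chemistry object; not the real Chemistry API.
    interface ContentObject {
        String getName();
    }

    // Option 1: the object keeps the DOM element and reads from it on every call.
    static final class WrappedObject implements ContentObject {
        private final Element element; // keeps the whole parsed tree reachable

        WrappedObject(Element element) {
            this.element = element;
        }

        @Override
        public String getName() {
            return element.getElementsByTagName("title").item(0).getTextContent();
        }
    }

    // Option 2: values are copied out once; the DOM can be discarded afterwards.
    static final class DetachedObject implements ContentObject {
        private final String name;

        private DetachedObject(String name) {
            this.name = name;
        }

        static DetachedObject from(Element element) {
            return new DetachedObject(
                element.getElementsByTagName("title").item(0).getTextContent());
        }

        @Override
        public String getName() {
            return name;
        }
    }

    // Parses a small XML snippet with the JDK DOM parser (stand-in for Abdera).
    static Element parseEntry(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element entry = parseEntry("<entry><title>Breaking news</title></entry>");
        System.out.println(new WrappedObject(entry).getName());
        System.out.println(DetachedObject.from(entry).getName());
    }
}
```

Both objects answer `getName()` identically; the difference is only in what they keep alive in memory and what they depend on at read time.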


Let's discuss both of these approaches. I will start with the worst one.

1. Wrap Abdera objects in Chemistry objects
This was my first approach. Here are the pros/cons:
Pros:

- Simplifies the implementation of the Chemistry model a bit, atom validation included. No need to write StAX (or SAX) code to read your objects from the remote feed.

In fact the simplification added by Abdera is relative. You still need to write code to parse your CMIS objects from the Abdera DOM. Feed and entry parsing are not complicated anyway (atom is a nice and simple format).
So the only thing Abdera really provides is atom validation and an atom-aware XML DOM. The rest has to be implemented anyway (like the CMIS object parsing from the Abdera DOM).
If you don't want atom validation, then using another XML DOM library will be the same from the Chemistry code perspective, where you need to parse the CMIS object.
Here is a link to benchmarks on several XML DOM parsers including AXIOM (the one used by Abdera):

http://www.xml.com/lpt/a/1703


Cons:

- Adds many extra dependencies to your application (if I remember well, 3 or 4 Abdera JARs + 2 AXIOM JARs)
- Your CMIS objects will be larger (they embed Abdera objects, which contain additional data not used by the CMIS code).

- Debugging is difficult.

An annoying side effect is that debugging becomes difficult. When you are inspecting CMIS objects (that wrap Abdera objects) you need to inspect Abdera objects that are based on the AXIOM model, which is a lazy DOM model (it reads XML data into the DOM object only when required). To understand what your object contains you need to understand the AXIOM model.

- A technical issue I had with the AXIOM way of doing things.
I will describe it here:

As mentioned above, AXIOM loads data from XML into the DOM only on client demand. For example, if the client doesn't need to access the 30th entry in the feed, the data of that entry will not be read from the XML input stream.
This is a very interesting AXIOM feature that I like, but it has a side effect in my application's case.
Because AXIOM reads the input stream only on demand, it requires the input stream to stay open until you have read all the data you want from the stream.
This means that if you close the stream before the UI is completely updated you will get an exception like this one:

Exception in thread "main" org.apache.abdera.parser.ParseException: java.lang.RuntimeException: [was class java.io.IOException] Stream closed
        at org.apache.abdera.parser.stax.FOMBuilder.next(FOMBuilder.java:260)
        at org.apache.axiom.om.impl.llom.OMElementImpl.getNextOMSibling(OMElementImpl.java:265)
        at org.apache.axiom.om.impl.traverse.OMChildrenQNameIterator.next(OMChildrenQNameIterator.java:93)
        at org.apache.abdera.parser.stax.util.FOMElementIteratorWrapper.next(FOMElementIteratorWrapper.java:41)
        at org.apache.abdera.parser.stax.util.FOMList.buffer(FOMList.java:74)
        at org.apache.abdera.parser.stax.util.FOMList.size(FOMList.java:88)
        at org.nuxeo.chemistry.client.app.test.TestAbderaConn.parseWithAbdera(TestAbderaConn.java:61)
        at org.nuxeo.chemistry.client.app.test.TestAbderaConn.main(TestAbderaConn.java:39)
Caused by: java.lang.RuntimeException: [was class java.io.IOException] Stream closed
        at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
        at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:706)
        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3655)
        at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
        at org.apache.axiom.om.impl.builder.StAXBuilder.createOMText(StAXBuilder.java:245)
        at org.apache.axiom.om.impl.builder.StAXBuilder.createOMText(StAXBuilder.java:216)
        at org.apache.abdera.parser.stax.FOMBuilder.applyTextFilter(FOMBuilder.java:158)
        at org.apache.abdera.parser.stax.FOMBuilder.next(FOMBuilder.java:206)
        ... 7 more

And in order to have a responsive application you need to perform feed refreshes asynchronously from the UI thread. You cannot control when the stream is closed relative to when the UI is completely loaded.

So, this cool feature of AXIOM makes the AXIOM DOM unusable (or hardly usable) in live objects displayed in rich client applications.
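
One way around this, sketched here with the JDK's StAX API rather than Abdera or AXIOM, is to copy everything the UI needs into plain objects on a background thread and close the stream before handing the result over. `FeedSnapshot`, `readTitles` and `refreshAsync` are hypothetical names, and only entry titles are extracted to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads a feed eagerly so nothing depends on the stream later.
final class FeedSnapshot {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Reads every atom:entry title, then closes the stream before returning.
    static List<String> readTitles(InputStream in) throws Exception {
        List<String> titles = new ArrayList<>();
        try (InputStream stream = in) {
            XMLStreamReader r = XMLInputFactory.newFactory().createXMLStreamReader(stream);
            boolean inEntry = false;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && ATOM_NS.equals(r.getNamespaceURI())) {
                    if ("entry".equals(r.getLocalName())) {
                        inEntry = true;
                    } else if (inEntry && "title".equals(r.getLocalName())) {
                        titles.add(r.getElementText()); // assumes a text-only title
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && ATOM_NS.equals(r.getNamespaceURI())
                        && "entry".equals(r.getLocalName())) {
                    inEntry = false;
                }
            }
            r.close();
        }
        return Collections.unmodifiableList(titles);
    }

    // Runs the whole parse off the calling thread; the future completes with plain data.
    static CompletableFuture<List<String>> refreshAsync(Callable<InputStream> source) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return readTitles(source.call());
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        });
    }

    public static void main(String[] args) {
        byte[] feed = ("<feed xmlns='http://www.w3.org/2005/Atom'>"
            + "<entry><title>First story</title></entry>"
            + "<entry><title>Second story</title></entry></feed>")
            .getBytes(StandardCharsets.UTF_8);
        List<String> titles = refreshAsync(() -> new ByteArrayInputStream(feed)).join();
        System.out.println(titles);
    }
}
```

The UI thread only ever sees the finished list, so it no longer matters when the stream is closed.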


Let's look at the second option for integrating Abdera.

2. Use Abdera to parse the feed and build Chemistry objects that are totally detached from the Abdera objects.


Pros:
- Provides ATOM validation and ATOM oriented DOM objects.

- Easy way to parse feeds and CMIS objects using the high-level Abdera DOM.
- Efficient way of parsing the feed input stream thanks to the "load on demand" feature of AXIOM (sections of the feed you are not interested in will not be loaded in memory).


Cons:

- Extra dependencies required by the application, as mentioned above (~6 extra JARs)

This is an acceptable approach. The only issue for me is the extra dependencies. To parse an atom feed you need 6 JARs!?
It's true that for a Chemistry client, parsing the feed is not interesting at all. The client code must concentrate on implementing the Chemistry model in an efficient way, so that the client can be used both in simple administration applications and in highly responsive rich client applications.

My problem is that correctly parsing an atom feed with a focus on CMIS objects can be done very efficiently using StAX, by writing only a few classes (no more than 10).

So after thinking more about this I also considered using only AXIOM (without Abdera). With only 2 extra JARs I am able to load my feeds efficiently. But I lose the atom model.

Anyway, I finally realized that by writing only a few helper classes over StAX I am able to do higher-level parsing in the style of AXIOM, so I adopted this extreme approach. :)
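
As an illustration of what such a helper could look like (an assumed sketch, not the code from the current client), the class below pulls one entry at a time from a StAX reader and returns it as a plain object. `StaxEntryReader` and `Entry` are hypothetical names; only `id` and `title` are read, where a real client would also map the CMIS extension elements.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper over StAX; uses only classes shipped with the JDK.
final class StaxEntryReader {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Plain value object, fully detached from the parser.
    static final class Entry {
        final String id;
        final String title;

        Entry(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    private final XMLStreamReader reader;

    StaxEntryReader(InputStream in) throws XMLStreamException {
        this.reader = XMLInputFactory.newFactory().createXMLStreamReader(in);
    }

    // Pulls the next atom:entry from the stream, or returns null at end of feed.
    Entry nextEntry() throws XMLStreamException {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT && isAtom("entry")) {
                return readEntry();
            }
        }
        return null;
    }

    // Reads direct children of the current entry; deeper elements are skipped.
    private Entry readEntry() throws XMLStreamException {
        String id = null;
        String title = null;
        int depth = 1; // 1 means "directly inside atom:entry"
        while (depth > 0 && reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if (depth == 1 && isAtom("id")) {
                    id = reader.getElementText(); // leaves the reader on the end tag
                } else if (depth == 1 && isAtom("title")) {
                    title = reader.getElementText(); // assumes a text-only title
                } else {
                    depth++;
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        return new Entry(id, title);
    }

    private boolean isAtom(String localName) {
        return ATOM_NS.equals(reader.getNamespaceURI())
            && localName.equals(reader.getLocalName());
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed xmlns='http://www.w3.org/2005/Atom'>"
            + "<entry><id>urn:1</id><title>First</title></entry>"
            + "<entry><id>urn:2</id><title>Second</title></entry></feed>";
        StaxEntryReader r = new StaxEntryReader(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        for (Entry e = r.nextEntry(); e != null; e = r.nextEntry()) {
            System.out.println(e.id + " " + e.title);
        }
    }
}
```

Like AXIOM, this reads only as far into the stream as the caller asks, but each returned `Entry` is complete and holds no reference to the stream.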

Maybe my list of pros and cons is not complete, but it may help you in adopting a solution.
Personally, if we absolutely want Abdera I will vote for solution 2, since 1 is not acceptable for my use cases.
If Abdera is not required then we can either use AXIOM or directly use the StAX API as in the current client.



Regards,
Bogdan


On 4 Jun 09, at 16:25, Florent Guillaume wrote:

> On 4 Jun 2009, at 15:34, Gabriele Columbro wrote:
> > G'day Chemicals,
> > as I finally found some time to spend on the mighty Chemistry, I was able to go through the ongoing mail threads and look a little bit better at the status of the Chemistry codebase (with an eye on which parts of Alfresco may be suitable for contribution).
> > I would like to start working a bit on the client / TCK / build automation part of the project, but, before discussing the details with you guys and getting into action, I saw a couple of open mail threads (the forwarded one and [4]) on a topic that can impact a lot the way I can contribute to this project:
> > I'm talking about the implementation of the AtomPub Java Client.
> > As I understand it, Florent is working on the AtomPub Java Client and IIUC it isn't going to be based on Abdera. Though I could not yet find any code in SVN (@Florent: nor in the Nuxeo HG 'default' [1] revision, am I pointing at the right one or is 'integrate-atom-pub' [2] the one to look at?),
> Yes the code we have is in branch integrate-atompub-client in http://hg.nuxeo.org/sandbox/chemistry/ -- the old repo used before switching to Apache svn.
> But as it happens I'm studying this code right now to adapt it to the newest Chemistry API refactorings, and I'll commit code in svn before tonight, although it may be nonfunctional and not very unit tested at all :( This code was written in a hurry by Bogdan for a customer (although we have the IP on it) and is not up to the standards I expected of it, so don't hesitate to criticize it and discuss refactorings.
> > so I was finally wondering:
> > 1__ What's the state of the art of the AtomPub Java client impl? What's the devs' opinion on the usage of Abdera? Has that already been discussed and I missed it? :)
> No real discussion in these lists.
> After having worked with Abdera for the server part, I've come to the conclusion that it's a big library, rewrapping a lot of Axiom. Also it's still very young, and not well designed for extensibility if you stray from the simple "one feed with entries in it" model. Bogdan, for the client part, decided not to use Abdera because one of his goals was to allow it to be a small embedded library, so StAX was all that was really needed. Abdera apparently creates lots and lots of objects and uses lots of memory, when a simple StAX-based parser gave him huge performance boosts.
> > There are a couple of reasons why I ask you guys for suggestions/clarifications on this topic:
> > - Abdera is the standard Apache Atom implementation and we can rely on a good cooperation between Apache projects
> Agreed, however note that working with SNAPSHOTs of other projects is a headache in terms of release. So if we start modifying Abdera then we'll have to think about how to release.
> > - In terms of maintenance overhead, I see good improvements if Abdera is used both in the server (IIUC) and client part
> Do you see any factoring between client and server beyond the Abdera extensions, beyond the few ElementWrapper subclasses?
> Note also that I have already started using Abdera's ExtensibleElementWrapper in chemistry-atompub, however I don't register them as an Abdera extension (I instantiate directly) because Abdera extensions are global and I don't want to step on the toes of any other code that would like to work with Chemistry but already uses its own Abdera extension (like Alfresco). chemistry-atompub only has the methods useful for the server though, not yet the client.
> > - In terms of dependencies explosion, I don't see a big deal in the Abdera (client) chain of (runtime) dependencies, especially if you consider that the (Java) client is most likely going to be used for Java based Content Repositories (or custom applications) integration and these are typically library-flooded applications anyways.
> I can't disagree with the fact that projects usually already use lots of libraries, so what's one more. Note however that Abdera is huge, abdera-core + abdera-i18n + abdera-parser are already at 900 KB (mostly due to Unicode data in abdera-i18n by the way).
> > - Choosing Abdera may enable me to contribute the already functional Abdera extension of Alfresco, so as to give quite a jump start on the TCK/Client side
> That's a good point.
> BTW we also have an Abdera extension in yet another (older) CMIS sandbox (http://hg.nuxeo.org/sandbox/nuxeo-cmis/file/tip/src/main/java/org/apache/abdera/ext/cmis/) which could be used as well. If you contribute yours, I'll look at merging useful things we may have into it (although Abdera extensions are in fact rather simple).
> > - The usage of Abdera seems to be an enabler for contributions already built on top of it (see Sourcesense CMIS portlet [3])
> > 2__ Do you think the Abdera extension could be a valid contribution? And in such a case, would it belong to Chemistry or Abdera itself?
> I would leave it in Chemistry until we consider it mature enough to be moved to Abdera -- barring any dependency problems. This way we'll get much more rapid turnaround in its update. It could move to Abdera once CMIS 1.0 is released, for instance.
> > As I'm not sure what the status of Florent's implementation is, and particularly as I don't want to waste any effort already done, and this is actually my first interaction with the list,
> > so please forgive me if I'm missing some blatantly obvious point ;)
> No problem, these are all worthwhile points.
> My next steps are to study Bogdan's client code, and if I (or the list) feel it's inadequate then I'll scrap it to go back to a simple Abdera-based implementation. I'll commit something tonight in svn so that others can look at it.
> Florent
>
> --
> Florent Guillaume, Head of R&D, Nuxeo
> Open Source, Java EE based, Enterprise Content Management (ECM)
> http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
