CMIS Implementation Experiences

Florian Müller Tue, 15 Dec 2009 07:38:59 -0800

Hi all,

I would like to foster the technical discussion between the Chemistry team and 
the people behind the OpenCMIS proposal. If you think this is inappropriate on 
this list, please let me know.


In order to explain the rationale behind the OpenCMIS design I would like to 
talk about some of the experiences we have had with CMIS client and server 
implementations.

We also started with Abdera on the server side. It turned out to be more pain 
than joy. With a pure JAXB design we ran into compatibility issues. A good 
tradeoff between efficiency, correctness and maintainability seems to be StAX 
with JAXB. OpenCMIS handles all AtomPub related tags with StAX and all CMIS 
related data with JAXB. The JAXB objects are not exposed to the application. 
They are just interim objects.  
The same StAX/JAXB design should work on the server side as well. The effort to 
implement AtomPub is manageable. I've done this in my CMIS FileShare project.
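
To illustrate the StAX/JAXB split, here is a minimal Java sketch. It is not 
OpenCMIS code: the class name, the handler interface and the namespace 
constants are assumptions made for the example. StAX walks the Atom tags, and 
the CMIS subtree is handed to a callback. In a real implementation that 
callback is where JAXB would take over; the sketch uses a placeholder that 
only counts elements so that it runs without a JAXB dependency.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch; none of these names come from OpenCMIS.
class AtomEntrySketch {
    // Namespace URIs assumed for illustration.
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";
    static final String CMISRA_NS = "http://docs.oasis-open.org/ns/cmis/restatom/200908/";

    /** Stand-in for the JAXB step. The reader is positioned on the CMIS start tag. */
    interface CmisPayloadHandler {
        Object handle(XMLStreamReader reader) throws XMLStreamException;
    }

    String title;        // AtomPub data, read with StAX
    Object cmisPayload;  // CMIS data, produced by the handler (an interim object)

    static AtomEntrySketch parse(String xml, CmisPayloadHandler handler) {
        AtomEntrySketch entry = new AtomEntrySketch();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    String ns = reader.getNamespaceURI();
                    String name = reader.getLocalName();
                    if (ATOM_NS.equals(ns) && "title".equals(name)) {
                        entry.title = reader.getElementText();
                    } else if (CMISRA_NS.equals(ns) && "object".equals(name)) {
                        // With JAXB on the classpath this could be something like
                        // unmarshaller.unmarshal(reader, CmisObjectType.class),
                        // where CmisObjectType is an assumed generated class.
                        entry.cmisPayload = handler.handle(reader);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse Atom entry", e);
        }
        return entry;
    }

    /** Placeholder handler: consumes the CMIS subtree and counts its child elements. */
    static Object countElements(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        int count = 0;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
                count++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        return count;
    }
}
```

The application would only ever see the AtomEntrySketch fields, never the 
object that the handler built along the way.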

Another detail we learned is that implementing both bindings in parallel saves 
you a lot of refactoring later. Both CMIS bindings are really different. If you 
align your classes and flows to just one binding you might have to refactor a 
lot later to make the other binding work smoothly. This insight is reflected 
in OpenCMIS in two areas. First of all, there is a strict decoupling of the 
binding implementation (Provider layer) and the nicer Java API (Client layer). 
If somebody showed up with a third CMIS binding, we would only have to touch the 
Provider layer. The second area is within the Provider layer. We tried to reuse 
as much code and concepts as possible between both binding implementations. For 
example, both binding implementations share the generated JAXB classes, the 
caching infrastructure and several utilities. 
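
The decoupling of the two layers can be pictured roughly as follows. This is 
a simplified Java sketch with invented names (ObjectProvider, the two stub 
classes and DocumentClient are all hypothetical), not the actual OpenCMIS 
interfaces. The stubs return canned data where a real provider would talk to 
the repository.

```java
import java.util.HashMap;
import java.util.Map;

// Provider layer (hypothetical): one interface, one implementation per binding.
interface ObjectProvider {
    Map<String, String> getProperties(String repositoryId, String objectId);
}

// Stub for the AtomPub binding. A real one would fetch and parse an Atom entry.
class AtomPubProviderStub implements ObjectProvider {
    @Override
    public Map<String, String> getProperties(String repositoryId, String objectId) {
        Map<String, String> props = new HashMap<>();
        props.put("cmis:objectId", objectId);
        props.put("cmis:name", objectId + ".txt");
        return props;
    }
}

// Stub for the Web Services binding. A real one would issue a SOAP call.
class WebServicesProviderStub implements ObjectProvider {
    @Override
    public Map<String, String> getProperties(String repositoryId, String objectId) {
        Map<String, String> props = new HashMap<>();
        props.put("cmis:objectId", objectId);
        props.put("cmis:name", objectId + ".txt");
        return props;
    }
}

// Client layer (hypothetical): the nicer API. It only knows the interface,
// so a third binding would not require any change here.
class DocumentClient {
    private final ObjectProvider provider;
    private final String repositoryId;

    DocumentClient(ObjectProvider provider, String repositoryId) {
        this.provider = provider;
        this.repositoryId = repositoryId;
    }

    String getName(String objectId) {
        return provider.getProperties(repositoryId, objectId).get("cmis:name");
    }
}
```

Swapping new AtomPubProviderStub() for new WebServicesProviderStub() in the 
DocumentClient constructor is the only change an application would make to 
switch bindings in this sketch.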

We introduced type (and repository info) caching based on our experiences with 
applications using a CMIS library. Applications need type information all over 
the place and it is expensive to fetch them over and over again. From a library 
perspective one can argue that caching should be done a level above the 
library. From a practical standpoint, it is nicer if it is done once and 
done right. So we decided to put it into OpenCMIS. If an application doesn't 
want it, it can switch it off. The caching works implicitly. Whenever a type 
definition passes through the library, the data is cached or refreshed.
CMIS provides no mechanism to detect type changes. So there is a slight chance 
that the type cache holds outdated data. In an enterprise scenario (and that's 
what OpenCMIS is aiming at) type changes shouldn't happen often. They are 
usually interconnected with an update or re-deployment of the application. A 
paranoid application developer can switch off the cache (and accept the 
performance penalty) or clear the cache regularly (every hour or every five 
minutes or every 30 seconds...) or create a new session once in a while. Since 
sessions are bound to logins, sessions and their caches are replaced 
regularly anyway. 
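
The caching behaviour described above could look roughly like the following 
Java sketch. All names are hypothetical and a type definition is reduced to a 
plain String to keep the example short; the real data structures are richer. 
The fetcher function stands in for the round trip to the repository.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of an implicit type cache; not the OpenCMIS implementation.
class TypeCacheSketch {
    private final Map<String, String> types = new ConcurrentHashMap<>();
    private final Function<String, String> fetcher; // stand-in for a repository call
    private final boolean enabled;                  // applications can switch the cache off
    private final AtomicInteger fetches = new AtomicInteger();

    TypeCacheSketch(Function<String, String> fetcher, boolean enabled) {
        this.fetcher = fetcher;
        this.enabled = enabled;
    }

    /** Returns the cached definition if there is one, otherwise fetches and caches it. */
    String getTypeDefinition(String typeId) {
        if (enabled) {
            String cached = types.get(typeId);
            if (cached != null) {
                return cached;
            }
        }
        String fetched = fetcher.apply(typeId);
        fetches.incrementAndGet();
        put(typeId, fetched);
        return fetched;
    }

    /** Called whenever a type definition passes through the library: cache or refresh. */
    void put(String typeId, String definition) {
        if (enabled) {
            types.put(typeId, definition);
        }
    }

    /** Lets a cautious application drop possibly outdated entries, e.g. on a timer. */
    void clear() {
        types.clear();
    }

    /** Number of repository round trips so far; only here to make the effect visible. */
    int fetchCount() {
        return fetches.get();
    }
}
```

With the cache enabled, two lookups of the same type cost one fetch. After 
clear(), or with the cache disabled, every lookup goes back to the repository.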

Another aspect that we think is important is extensions. CMIS defines a lot of 
extension points and repositories will make use of them sooner or later. 
Applications should be able to access and set extension data. Sure, it is 
against the idea of a standard but it will happen and the library should be 
prepared for that. The difficult part here is to make the binding invisible to 
the application since some extension points are very binding specific. Using 
JAXB in both bindings covers a lot but not everything. OpenCMIS has the 
infrastructure in place but is not yet perfect in this regard.


I hope that's the beginning of a fruitful conversation,

Florian
