Re: Is JDBC persistence manager supported by jackrabbit?

Edgar Poce Wed, 31 Aug 2005 23:58:09 -0700

Hi Vadim

Vadim Gritsenko wrote:

Edgar,


Was trying to find more information following your references, but...

[1] http://thread.gmane.org/gmane.comp.apache.jackrabbit.devel/1435



Points to JIRA which states [1]:

   Comment by Edgar Poce [12/Jul/05 06:00 AM]
   This kind of approach is discouraged by design

Can you please clarify your point?

There are a couple of conversations in the archive about this. My point is that the PM contract is not suitable for mapping the item states into a relational database with a table design that breaks the ItemState into its constituent parts. The PM is intended to be kept simple, which means storing the item state as a whole without interpreting the data. See the jdbc pm under contrib.
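
To make the "store it as a whole" idea concrete, here is a minimal sketch in Java. The names (OpaqueStateStore, store, load, exists) are made up for illustration and are not the real PersistenceManager interface, and an in-memory map stands in for a single id/blob table. The only point is that the store never looks inside the bytes it is given.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Jackrabbit's API: one opaque blob per item id.
final class OpaqueStateStore {
    // Stands in for a two-column table: (item id, serialized state).
    private final Map<String, byte[]> rows = new HashMap<>();

    // Saves the already-serialized state without parsing it.
    void store(String itemId, byte[] serializedState) {
        rows.put(itemId, serializedState.clone());
    }

    // Returns a copy of the stored bytes, or null if the id is unknown.
    byte[] load(String itemId) {
        byte[] data = rows.get(itemId);
        return data == null ? null : data.clone();
    }

    boolean exists(String itemId) {
        return rows.containsKey(itemId);
    }
}
```

A schema that splits the state into properties, child entries and references would need the store to understand that structure, which is exactly what this style avoids.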

The main problem with storing the item states in a complex schema is the Collection handling. Since changes to Collection fields are not logged into add/update/remove-aware objects, all the elements in the Collection must be stored on each write call. This hurts performance when handling collections with lots of elements, even with the simple PMs included in the core.

See the second chart in http://issues.apache.org/jira/browse/JCR-188. On my PIV box with Object PM + cqfs, any write operation (e.g. setting a property) takes up to half a second when the given node reaches 3k children. If I ran the same test with the impl proposed in JCR-91, the half-second mark would be reached much sooner than with 3k children; just a hundred children would make the repo unbearably slow.
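
A rough way to see why the collection size matters is to count the entries written, not to measure time. The sketch below is a simplified model with made-up names, not a benchmark of any PM. It assumes that, without a change log, every write stores the full child list again, and it compares that with a hypothetical store that is told which entry was added.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified cost model (assumption): one "row" written per stored child entry.
final class ChildListWriteCost {

    // No change log: each added child forces the whole list to be stored again.
    static long rowsWrittenWithFullRewrite(int finalChildCount) {
        List<String> children = new ArrayList<>();
        long total = 0;
        for (int i = 0; i < finalChildCount; i++) {
            children.add("child" + i);
            total += children.size(); // every current entry is written
        }
        return total;
    }

    // Hypothetical change-aware store: only the new entry is written.
    static long rowsWrittenWithChangeLog(int finalChildCount) {
        return finalChildCount;
    }
}
```

In this model, building a node with 3000 children one at a time writes 4,501,500 entries with full rewrites against 3000 with a change log. The more rows each entry costs in the schema, the sooner the full rewrite becomes noticeable.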


When I decided to write the jdbc pm proposed in JCR-91 I wanted:

1 - a mature, transactional and scalable persistence storage
2 - use rdbms administrative tools, like scheduled backups, etc.
3 - rdbms referential integrity
4 - avoid redundancy. PMs store the NodeReferences twice.
5 - a storage that allows the data to be modified easily, just in case.

But in order to achieve the above goals the PM would have to interpret the data :(. Maybe we can bring this up again after the first release ...


> Or, may be point to the document /
> discussion regarding the design?
>

Even though it's not directly related, you might want to take a look at Dominique's post about jackrabbit internals. See http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/1223

[2] http://wiki.apache.org/jackrabbit/PersistenceManagerFAQ
Points to Wiki page which does not clarify your POV either.

It's not my point of view. I just collected the devs' opinions on this issue from the mailing list. If it's not clear, please trace the conversations in the archive and clarify it.


> It states though:


   The PM interface was never intended as being a general SPI that
   you could implement in order to integrate external datasources
   with proprietary formats (e.g. a customers database).

This raises the question, what is the recommended SPI to code against?

I think that the jcr-ext project under contrib might be a good starting point. Or, even though the PM is not intended to be an SPI, you can manage to plug in your legacy data if you do it carefully.

PS Wiki page has incorrect statement:

    XML PersistenceManager
      * Write operations are synchronized
AFAICS, XML PM (unnecessarily) synchronizes all calls, including load() and exist() calls.

Why incorrect? Maybe incomplete...

> Does it mean FileSystem interface considered to be single threaded?

I don't think so

> Does not make much sense, though...

I agree. I think that the concurrency issue was handled first at the SHISM level, then it was moved to the PM, and then back to the SHISM (see http://issues.apache.org/jira/browse/JCR-164). Those synchronized modifiers seem to be there because the PM contract is not very clear yet, at least for me :(.
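
For comparison with putting synchronized on every method, here is a sketch of a store where reads share a lock and only writes are exclusive. It is purely illustrative: the class name and methods are made up, it relies on java.util.concurrent (Java 5 or later), and it says nothing about how the XML PM or the SHISM are or should be implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: concurrent reads, exclusive writes.
final class ReadWriteLockedStore {
    private final Map<String, byte[]> states = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many threads may hold the read lock at the same time.
    byte[] load(String itemId) {
        lock.readLock().lock();
        try {
            byte[] data = states.get(itemId);
            return data == null ? null : data.clone();
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean exists(String itemId) {
        lock.readLock().lock();
        try {
            return states.containsKey(itemId);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock excludes both readers and other writers.
    void store(String itemId, byte[] data) {
        lock.writeLock().lock();
        try {
            states.put(itemId, data.clone());
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Whether locking belongs in the PM at all is the open question here; if the layer above already serializes access, the PM could drop its own locking entirely.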


br,
edgar

Thanks,
Vadim

[1] http://issues.apache.org/jira/browse/JCR-91#action_12315534
