Jackrabbit & cluster with Oracle backend

Fabrice Aupert Fri, 16 May 2014 11:01:47 -0700

Hi,

We're building a 'document manager' for an existing J2EE webapp (Java 5 /
WebSphere 6.1) deployed in a cluster. This manager has to be fully integrated
into the webapp. Due to production constraints, storing data on a shared
filesystem is not an option: all data and metadata must be stored in an
Oracle 10g database.


We have a working prototype based on Jackrabbit 2.2.13. On each node of the
cluster, the webapp embeds the Jackrabbit JAR and owns a dedicated repository
directory on the local filesystem. This directory contains a repository.xml
file which is pretty much the same on all nodes except for <Cluster id="">
(see attached file). Once the webapp has started, the local 'repository'
directory contains only a few files, mostly index files. Example:

./repository
./repository/repository.xml
./repository/workspaces
./repository/workspaces/security
./repository/workspaces/security/workspace.xml
./repository/workspaces/security/index
./repository/workspaces/security/index/indexes_2
./repository/workspaces/security/index/_0
./repository/workspaces/security/index/_0/cache.inSegmentParents
./repository/workspaces/security/index/_0/segments_1
./repository/workspaces/security/index/_0/segments.gen
./repository/workspaces/security/index/_0/segments_2
./repository/workspaces/security/index/_0/_0.cfs
./repository/workspaces/myrepo
./repository/workspaces/myrepo/workspace.xml
./repository/workspaces/myrepo/index
./repository/workspaces/myrepo/index/indexes_2
./repository/workspaces/myrepo/index/_0
./repository/workspaces/myrepo/index/_0/_2.cfs
./repository/workspaces/myrepo/index/_0/segments_4
./repository/workspaces/myrepo/index/_0/cache.inSegmentParents
./repository/workspaces/myrepo/index/_0/segments_1
./repository/workspaces/myrepo/index/_0/segments.gen
./repository/revision.log

From what we've seen, Jackrabbit starts a thread on each node that
periodically refreshes the indexes, which keeps the nodes of the cluster in
sync.
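
A minimal sketch of what such a per-node <Cluster> element could look like,
assuming a database journal stored in the same Oracle instance. The node id,
the syncDelay value, and every connection setting below are placeholders,
not values taken from the attached file. syncDelay (in milliseconds) is, as
far as we understand, the interval at which the thread mentioned above polls
the journal, and the revision parameter would correspond to the revision.log
file in the listing.

```xml
<!-- Sketch only: id, syncDelay and all connection settings are placeholders. -->
<Cluster id="node1" syncDelay="5000">
  <Journal class="org.apache.jackrabbit.core.journal.OracleDatabaseJournal">
    <!-- local file recording the last journal revision this node has applied -->
    <param name="revision" value="${rep.home}/revision.log"/>
    <param name="driver" value="oracle.jdbc.driver.OracleDriver"/>
    <param name="url" value="jdbc:oracle:thin:@dbhost:1521:ORCL"/>
    <param name="user" value="jackrabbit"/>
    <param name="password" value="changeit"/>
    <param name="schemaObjectPrefix" value="JOURNAL_"/>
  </Journal>
</Cluster>
```

Only the id attribute would differ from one node to the next; the journal
settings have to point every node at the same tables.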

This architectural layout seems to work but, as we lack any real-world
experience with Jackrabbit in this context, we would like to check with the
community that we're not bending Jackrabbit's capabilities in the wrong
direction. Could it lead to silent data corruption or inconsistencies?

The second point is about giving operations decent tooling to manage the
Jackrabbit repo:

- Admin console: since our embedded Jackrabbit does not expose an RMI or
WebDAV interface, we were thinking of relying on jackrabbit-standalone (in
either CLI or server mode). By copying a repository.xml, changing its cluster
id, and starting a new session with the standalone version from this file, we
could manage our nodes and run searches (using jackrabbitexplorer on top of
it, for example). Could this be a viable solution?

- Repo inconsistencies: does the OraclePersistenceManager really support
<param name="consistencyFix" value="true" />? It does not seem so. Are there
other tools we could use to investigate and fix problems in the repository
data?
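
For context, a sketch of the configuration the second question refers to,
assuming the pooled bundle persistence manager for Oracle. The class choice
and all connection settings are assumptions for illustration. The two
consistency parameters are the ones documented for the database bundle
persistence managers; whether the Oracle variant actually honours
consistencyFix is exactly what we are unsure about.

```xml
<!-- Sketch only: class choice and connection settings are assumptions. -->
<PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.OraclePersistenceManager">
  <param name="url" value="jdbc:oracle:thin:@dbhost:1521:ORCL"/>
  <param name="user" value="jackrabbit"/>
  <param name="password" value="changeit"/>
  <param name="schemaObjectPrefix" value="${wsp.name}_"/>
  <!-- run a consistency check over the stored bundles at startup -->
  <param name="consistencyCheck" value="true"/>
  <!-- attempt to repair whatever the check reports -->
  <param name="consistencyFix" value="true"/>
</PersistenceManager>
```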

Any input on this matter would be extremely valuable to us.

Thanks.

Fabrice Aupert
