Re: Scalability

David Nuescheler Fri, 13 Apr 2007 01:27:01 -0700

Hi,

The good news first ;) :
Jackrabbit is designed to cluster a number of nodes backed by
a single RDBMS.
Please find more information on how to configure this here:
http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200611.mbox/[EMAIL PROTECTED]


I would also like to comment on your observations:

(1) How the data is stored in the database largely depends
on the persistence manager used. The
BundlePersistenceManagers (which are the ones that I
would recommend for a bigger DB-backed installation) store
the representation of a node and its properties in a compressed
binary format in the database.
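
To make the "bundle" idea concrete, here is a minimal Java sketch of packing
a node and all of its properties into one compressed byte array, the kind of
value that could sit in a single BLOB column. It is only an illustration:
the class BundleSketch, its methods, and the byte layout are hypothetical and
are not Jackrabbit's actual bundle format or API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration only; NOT Jackrabbit's real bundle format.
class BundleSketch {

    // Pack a node id and all of its properties into one deflated byte array.
    static byte[] pack(String nodeId, Map<String, String> properties) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out =
                 new DataOutputStream(new DeflaterOutputStream(bytes))) {
            out.writeUTF(nodeId);
            out.writeInt(properties.size());
            for (Map.Entry<String, String> e : properties.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the stream above finishes the deflate output.
        return bytes.toByteArray();
    }

    // Read the properties back; the node id is read first and skipped.
    static Map<String, String> unpack(byte[] bundle) {
        try (DataInputStream in = new DataInputStream(
                 new InflaterInputStream(new ByteArrayInputStream(bundle)))) {
            in.readUTF(); // node id, not needed by this sketch
            int count = in.readInt();
            Map<String, String> properties = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                String value = in.readUTF();
                properties.put(name, value);
            }
            return properties;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the sketch is that one row holds the whole node, so loading a
node with its properties is a single read rather than one read per property.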

(2) To satisfy the requirements of a content repository as specified
by JCR, I think it is not possible to use just the database index
anyway, in particular for features like inheritance, fulltext search, or
searching unstructured information in a fine-grained fashion.
This is why Jackrabbit (just like any other repository implementation
that I am aware of) keeps an additional index.
This additional index is synchronized through clustering and
does not need to be backed up, since it can be rebuilt from
the information in the database in a recovery scenario.
So a Jackrabbit instance can be cloned or restored entirely
by just restoring the database and supplying the repository.xml.
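
As a rough illustration of why the index needs no backup, the Java sketch
below rebuilds a term-to-node index purely from the stored records. It is a
toy inverted index under stated assumptions: the class IndexRebuildSketch and
the map standing in for the database are hypothetical, and this is not the
indexing code Jackrabbit itself uses.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; a toy inverted index, not Jackrabbit's.
class IndexRebuildSketch {

    // Derive a term -> node ids index from nothing but the stored text.
    // 'storedText' stands in for the database: node id -> text content.
    static Map<String, Set<String>> rebuild(Map<String, String> storedText) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> record : storedText.entrySet()) {
            String[] terms =
                record.getValue().toLowerCase(Locale.ROOT).split("\\W+");
            for (String term : terms) {
                if (!term.isEmpty()) {
                    index.computeIfAbsent(term, t -> new TreeSet<>())
                         .add(record.getKey());
                }
            }
        }
        return index;
    }

    // Look up the ids of all nodes whose text contains the given term.
    static Set<String> search(Map<String, Set<String>> index, String term) {
        return index.getOrDefault(
            term.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
    }
}
```

Because rebuild() depends only on the stored records, throwing the index
away and calling it again yields the same index, which is the property that
makes a database-only restore sufficient.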

regards,
david

On 4/13/07, FolDeRol <[EMAIL PROTECTED]> wrote:

Dear team,

Could anybody clarify the situation with Jackrabbit's scalability for me?

We are considering Jackrabbit as a back-end for a large application with
a high level of data flow in a clustered environment. When I started the
evaluation of Jackrabbit, having read that it could employ an RDBMS as a
persistence layer, I thought that we could set up a number of cluster nodes
using deployment Model 2, which would use the same logical instance
(probably clustered) of the database and thus be scalable. I could not find
any details on this, and decided to study the database schema and trace JDBC
calls so as to estimate the performance.

I was surprised by what I found. The data is stored in the
RDBMS as serialized Java objects, and query operations are not handled by
the RDBMS at all but rather directly by the Jackrabbit engine on indices
stored on the file system. Now I'm seriously alarmed that Jackrabbit might
be an inappropriate solution for our goal.

Could someone please confirm or deny my assumptions?

Regards
