Hi,

The good news first ;) : Jackrabbit is designed to cluster a number of nodes backed by a single RDBMS. Please find more information on how to configure this here:
http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200611.mbox/[EMAIL PROTECTED]

I would also like to comment on your observations:

(1) How the data is stored in the database largely depends on the persistence manager used. The BundlePersistenceManagers (which are the ones that I would recommend for a larger database-backed installation) store the representation of a Node and its properties in a compressed binary format in the database.

(2) To satisfy the requirements of a content repository as specified by JCR, I think it is not possible to use just the database index anyway, in particular for features like inheritance, full-text search, or searching unstructured information in a fine-grained fashion. This is why Jackrabbit (just like any other repository implementation that I am aware of) keeps an additional index. This additional index is synchronized through clustering and does not need to be backed up, since it can be rebuilt from the information in the database in a recovery scenario. So a Jackrabbit instance can be cloned or restored entirely by just restoring the database and supplying the repository.xml.

regards,
david

On 4/13/07, FolDeRol <[EMAIL PROTECTED]> wrote:
Dear team,

Could anybody clarify the situation with Jackrabbit's scalability for me? We are considering Jackrabbit as a back-end for a large application with a high level of data flow in a clustered environment.

When I started evaluating Jackrabbit, having read that it could employ an RDBMS as a persistence layer, I thought that we could set up a number of cluster nodes using Model 2 of deployment, which would use the same logical instance (probably clustered) of the database and thus be scalable. I could not find any details on this, and decided to study the database schema and trace the JDBC calls so as to estimate the performance.

I was surprised by what I found. The data is stored in the RDBMS as serialized Java objects, and query operations are not handled by the RDBMS at all but rather directly by the Jackrabbit engine on indices stored on the file system.

Now I'm seriously alarmed that Jackrabbit might be an inappropriate solution for our goal. Could someone please confirm or deny my assumptions?

Regards
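
P.S. To illustrate point (2) of the reply above, here is a toy sketch of why a search index that is derived purely from the stored content does not need its own backup. It is only an illustration under simplifying assumptions: a plain Map stands in for the database, a word-to-node-id Map stands in for the index, and the class and method names (IndexRebuildSketch, buildIndex, search) are made up for this example and are not part of Jackrabbit.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: none of these names come from Jackrabbit.
public class IndexRebuildSketch {

    // Builds the derived index (lower-cased word -> ids of nodes containing it)
    // from the authoritative store (node id -> text content).
    static Map<String, Set<String>> buildIndex(Map<String, String> store) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> entry : store.entrySet()) {
            for (String word : entry.getValue().toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split can yield a leading empty token
                }
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return index;
    }

    // Looks up the ids of all nodes whose content contains the given word.
    static Set<String> search(Map<String, Set<String>> index, String word) {
        return index.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        // Stand-in for the database: the only data that must be backed up.
        Map<String, String> store = new LinkedHashMap<>();
        store.put("/a", "Jackrabbit content repository");
        store.put("/b", "Clustered repository nodes");

        Map<String, Set<String>> index = buildIndex(store);
        System.out.println(search(index, "repository")); // [/a, /b]

        // Simulate losing the index: rebuilding from the store gives the same result.
        Map<String, Set<String>> rebuilt = buildIndex(store);
        System.out.println(rebuilt.equals(index)); // true
    }
}
```

The point of the sketch is only that the index is a function of the stored content, so restoring the content is enough to get the index back. A real repository index is of course far more involved than a word map.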
