Hello, I've been considering Jackrabbit as a potential replacement for the traditional RDBMS content repository we've been using. Currently, the metadata is essentially "static" from document to document. However, we want to start bringing in arbitrary types of documents. Each document will specify its own metadata and may map "core" metadata back to a set of common fields. It really seems like a natural fit for JCR.
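
To make that concrete, here is roughly the node layout I'm picturing. This is just an untested sketch; the names ("docs", "coreTitle", "projectCode", etc.) are made up, and I'd likely formalize the common fields with a proper node type or mixin later:

    import javax.jcr.Node;
    import javax.jcr.Session;

    public class DocumentSketch {
        // Each document becomes its own node; nt:unstructured allows
        // arbitrary per-document properties, while a handful of "core"
        // fields use common property names across all document types.
        public static Node addDocument(Session session, String name) throws Exception {
            Node root = session.getRootNode();
            Node docs = root.hasNode("docs")
                    ? root.getNode("docs")
                    : root.addNode("docs", "nt:unstructured");

            Node doc = docs.addNode(name, "nt:unstructured");

            // "core" metadata mapped to shared field names
            doc.setProperty("coreTitle", "Example title");
            doc.setProperty("coreAuthor", "Example author");

            // arbitrary metadata specific to this document type
            doc.setProperty("projectCode", "X-42");

            session.save();
            return doc;
        }
    }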
I don't really need search (search services will be provided by a separate, already existing index that we keep synchronized), but I do need content scalability. We have about 500 GB of binary data and 1 GB of associated text metadata right now (about 200k records). Ideally, the repository would store the binary data itself as primary node content rather than merely referencing it. However, this already large data set will probably grow to 2-3 TB in the next year, and potentially well beyond that, with millions of records.
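
For the binary side, what I have in mind is the standard nt:file / nt:resource layout with the bytes stored in jcr:data, along these lines (untested sketch, assuming the JCR 2.0 Binary API; method and class names here are my own placeholders):

    import java.io.InputStream;
    import java.util.Calendar;
    import javax.jcr.Binary;
    import javax.jcr.Node;
    import javax.jcr.Session;

    public class BinaryStorageSketch {
        // Store the binary content itself in the repository rather than a
        // reference to it: an nt:file node with an nt:resource child whose
        // jcr:data property streams the bytes into the repository's storage.
        public static Node storeFile(Session session, Node parent, String fileName,
                                     String mimeType, InputStream data) throws Exception {
            Node file = parent.addNode(fileName, "nt:file");
            Node content = file.addNode("jcr:content", "nt:resource");

            Binary binary = session.getValueFactory().createBinary(data);
            content.setProperty("jcr:data", binary);
            content.setProperty("jcr:mimeType", mimeType);
            content.setProperty("jcr:lastModified", Calendar.getInstance());

            session.save();
            return file;
        }
    }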
From browsing the archives, it seems like this would be well above and beyond the typical repository size. Has anybody used Jackrabbit with this volume of data? It is pretty difficult to set up a test at this scale, so I'm relying on the experience of users with similar data volumes. Would clustering, workspace partitioning, etc. handle the volume we'd be expected to produce?

Thanks for the help,
Cris
