Re: Jackrabbit Scalability / Performance

2007-04-27 Thread Viraf Bankwalla
Thanks, this is great news.  Is there any additional information that you could 
share about your implementation?  What was the deployment environment, what 
model did you use for persistence, and how did you handle backups?

Did you consider Alfresco or other JCR solutions?  What did you see as the 
pros and cons?

Thanks.

- viraf



David Nuescheler [EMAIL PROTECTED] wrote:

hi viraf,

thanks for your mail.

Has anyone built an application similar to that described above?
 What version of Jackrabbit was used, and what were the issues that you ran 
 into?
 How much meta-data did a node carry, what was the average depth of a leaf
 node, and how many nodes did you have in the implementation before
 performance became an issue?
we built a digital asset management application that sounds very
similar to what you are describing. the meta information varies from
filetype to filetype but ranges on average between 10 and 50 properties
per nt:resource instance. in addition to typical meta information
we also store a number of thumbnail images in the content repository
for every asset.

I am considering building a cluster of servers providing repository
 services. Can the repository be clustered? (A load balancer in front of the
 repository will distribute requests to a pool of repository servers.)
yes, jackrabbit can be clustered. i would recommend though to run the
repository with model 1 or model 2 [1] and just use the load balancer
on top of your application. this avoids the overhead of remoting
altogether and still provides you with clustering.

[1] http://jackrabbit.apache.org/doc/deploy.html

How does the repository scale? Can it handle 50 million artifacts?
 (If the artifacts are placed on the file system, does Alfresco manage
 the directory structure or are all files placed in a single directory?)
assuming that you mean jackrabbit... ;)
we ran tests beyond 50m files and yes jackrabbit manages the filesystem
if the filesystem is chosen as the persistence layer for blobs.
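
To illustrate why a managed directory structure matters at this scale, here is a minimal sketch of one common way to spread blobs over nested directories by hashing an identifier. It is not Jackrabbit's actual on-disk layout; the class name ShardedBlobPath, the two-level fan-out, and the choice of SHA-1 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; NOT Jackrabbit's real layout.
final class ShardedBlobPath {

    private ShardedBlobPath() {}

    /** Maps a blob identifier to root/<hex[0..2]>/<hex[2..4]>/<full hex digest>. */
    static Path resolve(Path root, String blobId) {
        String hex = sha1Hex(blobId);
        return root.resolve(hex.substring(0, 2))
                   .resolve(hex.substring(2, 4))
                   .resolve(hex);
    }

    /** Returns the SHA-1 digest of the input as 40 lowercase hex characters. */
    static String sha1Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        // The identifier below is a made-up example.
        Path p = resolve(Paths.get("blobs"), "invoice-000123.tif");
        System.out.println(p);
    }
}
```

With two levels of 256 directories each (65,536 leaf directories), 50 million evenly distributed files would come to roughly 760 files per directory.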

Is there support for auditing access to documents?
this could easily be achieved with a decoration layer.
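
A decoration layer of this kind can be sketched roughly as follows. This is only an illustration of the decorator pattern at the application level, not Jackrabbit or JCR API: DocumentStore, InMemoryDocumentStore, and AuditingDocumentStore are hypothetical names, and a real implementation would wrap the repository access code and write to a durable audit log rather than an in-memory list.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical application-level interface; not part of JCR or Jackrabbit.
interface DocumentStore {
    byte[] read(String user, String path);
}

// Stand-in for a repository-backed implementation, used here for illustration.
final class InMemoryDocumentStore implements DocumentStore {
    private final Map<String, byte[]> docs = new HashMap<>();

    void put(String path, byte[] content) {
        docs.put(path, content);
    }

    @Override
    public byte[] read(String user, String path) {
        return docs.get(path); // null when the path is unknown
    }
}

// Decorator: delegates every read and records who asked for which path.
final class AuditingDocumentStore implements DocumentStore {
    private final DocumentStore delegate;
    private final List<String> auditLog = new ArrayList<>();

    AuditingDocumentStore(DocumentStore delegate) {
        this.delegate = delegate;
    }

    @Override
    public byte[] read(String user, String path) {
        byte[] content = delegate.read(user, path);
        auditLog.add(Instant.now() + " user=" + user + " path=" + path
                + " found=" + (content != null));
        return content;
    }

    // Returns a copy so callers cannot alter the recorded entries.
    List<String> auditLog() {
        return new ArrayList<>(auditLog);
    }
}
```

Because the decorator implements the same interface as the store it wraps, callers do not need to change when auditing is switched on or off.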

Is there support for defining archival / retention policies?
jackrabbit certainly offers the hooks to build records management
but does not come with out-of-the-box archival or retention facilities.

Is there support for backups?
for the most convenient backup i would recommend to persist the entire
content repository in an rdbms and use the rdbms features for backup.

regards,
david


   

Jackrabbit Scalability / Performance

2007-04-26 Thread Viraf Bankwalla
Hi, 
  
I am working on an application in which documents arriving at a mail-room are 
scanned and placed in a content repository. I need basic functionality to add 
artifacts to the repository and to locate and retrieve them. JSR-170 provides 
these basic services, however the interface looks chatty.  To address 
chattiness and performance issues associated with large documents, I am 
planning on exposing coarse-grained business services (which use the local JCR 
interface).  Given that these are scanned images, I do not need the documents 
to be indexed.  I do, however, need to be able to search on the metadata 
associated with a document.  I was wondering:


   Has anyone built an application similar to that described above?  What 
version of Jackrabbit was used, and what were the issues that you ran into?  
How much meta-data did a node carry, what was the average depth of a leaf node, 
and how many nodes did you have in the implementation before performance became 
an issue?
   I am considering building a cluster of servers providing repository 
services. Can the repository be clustered? (A load balancer in front of the 
repository will distribute requests to a pool of repository servers.)
   How does the repository scale? Can it handle 50 million artifacts? (If the 
artifacts are placed on the file system, does Alfresco manage the directory 
structure or are all files placed in a single directory?)
   Is there support for auditing access to documents?
   Is there support for defining archival / retention policies?
   Is there support for backups?

  
 Thanks.

- viraf

   