Dear Jackrabbit devs,

we are considering Jackrabbit for a larger CMS project (about 3 million documents, up to 150 concurrent editing users, lots of queries, transactions), built as a Cocoon-based application. As I understand it, this would certainly require a scalable repository (the final choice has yet to be made).
Now, a news item [1] on TheServerSide about benchmarks provided by Alfresco to demonstrate the superiority of their JCR implementation raises some concerns. Since the benchmarks are (going to be) open source, would someone be interested in running them against Jackrabbit?

A post in the thread claims that Jackrabbit isn't suited for large-scale scenarios and has problems with the transactional handling of some 100,000 nodes (Kev Smith, [2]):

"From what we've seen, Alfresco is comparable to JackRabbit for small case scenarios - but Alfresco is much more scalable [...]"

Do you agree with this statement? If so, are these problems related to the persistence manager abstraction? Is this a known issue, and will it be addressed?

Another paragraph from the same post:

"We tried to load up JackRabbit with millions of nodes but always ran into blocker issues after about 2 million or so objects. Also when loading up JackRabbit, the load needed to be carefully performed in small chunks e.g. trying to load in 100,000 nodes at a time would cause PermGenSpace errors (even with a HUGE permgenspace!) and potentially place the repo into a non-recoverable state."

I'm not sure whether this will really be an issue for our usage scenario (except perhaps when restoring backups), but I'm very interested in your opinions.

Thanks a lot in advance!

[1] http://www.theserverside.com/news/thread.tss?thread_id=43282
[2] http://www.theserverside.com/news/thread.tss?thread_id=43282#223061

-- Andreas
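P.S. To make the "small chunks" point concrete, below is a minimal sketch of how we would batch such a bulk import on our side, using only the standard JCR API (javax.jcr). The chunk size, node names, flat layout, and the importDocuments method are our own assumptions for illustration, not anything taken from the Alfresco benchmark:

    import javax.jcr.Node;
    import javax.jcr.RepositoryException;
    import javax.jcr.Session;

    public class ChunkedImport {

        // Hypothetical batch size; the right value is exactly
        // what we would like to learn from you.
        private static final int BATCH_SIZE = 1000;

        public static void importDocuments(Session session, int total)
                throws RepositoryException {
            Node root = session.getRootNode()
                    .addNode("documents", "nt:unstructured");
            session.save();

            for (int i = 0; i < total; i++) {
                Node doc = root.addNode("doc-" + i, "nt:unstructured");
                doc.setProperty("title", "Document " + i);

                // Persist in small batches instead of one huge save,
                // so the session never holds 100,000+ transient items
                // in memory at once.
                if ((i + 1) % BATCH_SIZE == 0) {
                    session.save();
                }
            }
            session.save(); // flush the final partial batch
        }
    }

If batching like this is indeed the recommended way to bulk-load Jackrabbit, a pointer to the relevant documentation or to suggested batch sizes would be much appreciated.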