Walter,

I hate spam on email lists, so I apologize if this message is not what you are looking for. That said, my company, Xythos Software Inc, has a large investment in being a back end for many e-learning projects (including the BlackBoard Content Management System). We have a number of institutions whose deployments are larger than the numbers you suggest here, and we support scalability, clustering, point-in-time recovery, replication, etc.
If this is something you would like to discuss in more detail, please let me know. Otherwise I am sorry for the unwarranted spam.

Kevin Wiggen
CTO
Xythos Software Inc

-----Original Message-----
From: Walter Raboch [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 06, 2005 1:15 PM
To: [email protected]
Subject: Scalability/Clustering

Hi all,

we are currently planning to use JackRabbit in an e-learning project with a few hundred concurrent users. Therefore I am a little concerned about scalability.

Some figures we forecast for the first expansion stage:

1.000.000 Nodes
10.000.000 Properties (around 10 properties/node)
3.000 Named Users (about 10% concurrent)

We are thinking of an n-tier architecture with a web and application layer, a repository layer and a database layer, with 2 or more nodes for each layer. Both Java and .NET applications access the content in the repository, so we are planning to implement a .NET client for JSR170 too.

What would be the best deployment model for such a situation in your opinion?

Are there any efforts to make Jackrabbit clustered for a load-sharing scenario (no session failover at the repository layer)?

After reading a lot of code, I think the following changes should do it:

- extending ObservationManager to send Events to and receive Events from other nodes
- implementing/extending an ORM layer (Hibernate with shared caching for performance). The persistence implementation should be aware of the node types and allow a type-specific mapping to tables, so that we can map node types with many instances to their own tables while maintaining flexibility for new "simple" node types.
- extending LockManager to sync locks with other nodes
- Lucene should be independent on each node but be aware of new nodes and changes -> Events from ObservationManager
- Config: the cluster should have a central place for config management
- some intelligence in the JCR-RMI client to find a content repository node in the cluster depending on node state (load, shutdown, ...)

What else should be synchronized between the nodes? Did I overlook something?

I would be happy about any suggestions, even if you discourage us from using Jackrabbit.

Of course we would release some of these developments to the community, if someone is interested.

thx in advance,
cheers
Walter
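
To illustrate the last item in Walter's list (a client that picks a repository node from the cluster depending on node state), here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration: the `ClusterNodeSelector`, `RepositoryNode` and `State` types, the 0.0 to 1.0 load scale, and the placeholder addresses do not exist in Jackrabbit or JCR-RMI. How a node would actually report its state and load to the client is not covered here.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: none of these types are part of Jackrabbit or JCR-RMI.
public class ClusterNodeSelector {

    /** Lifecycle state a repository node might report (assumed values). */
    enum State { RUNNING, SHUTTING_DOWN, DOWN }

    /** Snapshot of one repository node as seen by the client. */
    static final class RepositoryNode {
        final String address; // placeholder address, not a real endpoint
        final State state;
        final double load;    // assumed scale: 0.0 = idle, 1.0 = saturated

        RepositoryNode(String address, State state, double load) {
            this.address = address;
            this.state = state;
            this.load = load;
        }
    }

    /** Returns the least-loaded RUNNING node, or empty if no node is available. */
    static Optional<RepositoryNode> select(List<RepositoryNode> nodes) {
        return nodes.stream()
                .filter(n -> n.state == State.RUNNING)          // skip nodes that are down or draining
                .min(Comparator.comparingDouble(n -> n.load));  // prefer the lowest reported load
    }

    public static void main(String[] args) {
        List<RepositoryNode> cluster = Arrays.asList(
                new RepositoryNode("//repo1.example/jackrabbit", State.RUNNING, 0.80),
                new RepositoryNode("//repo2.example/jackrabbit", State.RUNNING, 0.35),
                new RepositoryNode("//repo3.example/jackrabbit", State.SHUTTING_DOWN, 0.05));

        Optional<RepositoryNode> chosen = select(cluster);
        System.out.println(chosen.map(n -> "connect to " + n.address)
                                 .orElse("no repository node available"));
    }
}
```

The selection is deliberately stateless: the client re-evaluates the snapshot on each connect, which matches the load-sharing scenario without session failover described in the message. A real client would also need a way to refresh the snapshot and to retry when the chosen node has gone away in the meantime.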
