Hi Miro et al,

Thanks for the detailed insight. To pick up on some key points:
* versioning - our current model makes use of nt:hierarchyNode, mix:referenceable, mix:lockable and mix:versionable. From your comments it sounds like using mix:versionable will significantly reduce the reliability of JackRabbit. Would you therefore recommend NOT using mix:versionable?

* persistence - we'd prefer to use the SimpleDbPersistenceManager with MySQL. Is this a popular/reliable combination?

* Fix "by hand" - given that some persistence managers use binary serialization, how do you go about correcting the integrity of the database? The prospect scares me, but it's not uncommon with applications operating on top of schemas with complex referential integrity.

* you mentioned Day CRX. We also installed this and were initially impressed with the polished package; however, since then we've found some significant problems with the Content Explorer etc., which sow the seed of doubt that there are potentially bigger issues under the covers. It's important for us to have a commercial alternative, so I'd welcome any comments/experiences on using Day versus JackRabbit - for example, is mix:versionable viable with Day?

Overall, your comments haven't 'put me off'. All persistence tiers have their problems as they mature - this doesn't negate the value-add JackRabbit provides over and above building a custom OR/RDBMS solution. I'm happy to share our results with this list as we perform various tests.

Regards,
Shaun.

-----Original Message-----
From: Miro Walker [mailto:[EMAIL PROTECTED]
Sent: 15 November 2006 08:47
To: [email protected]
Subject: Re: JackRabbit maturity?- robustness, performance and scalability

Hi Shaun,

Our experience with production systems has largely been with Day's commercially licensed version of Jackrabbit, CRX, which contains some proprietary extensions. However, it's sufficiently similar that many of the points you raise have similar answers across both systems.

Our experiences to date have indicated that there isn't a straight answer to the questions you ask - much depends upon what you are trying to do with the system. For example:

> * performance with lots of nodes - any comments on the best
> persistence manager/config to use over and above the FAQ comments.

Key factors here are:

* your data model - Jackrabbit does not handle large flat node hierarchies well, so it is sometimes necessary to artificially deepen the hierarchy to address this.
* the persistence manager - the way in which JR stores data in the underlying database has a big effect on performance (e.g. remote vs. local db, persistence manager mapping to database tables).
* use of versioning / transactions - use of these features carries a performance overhead (in some cases significant).

Reliability

> * reliability of the persistence - how likely is corruption of the
> persisted objects?

Again this depends... Use of versionable nodes seems to be a problem at the moment. We've seen significant issues with data loss and corruption in live environments because of the current transaction handling when storing versionable nodes. This is because JR does not support true distributed transactions, but instead maintains separate connections to the workspace and the version storage. If one of these fails and rolls back, you can end up with a corrupt repository that then needs to be fixed "by hand", with possible loss of data.

There are other issues, such as the current lack of failover support, search indexes not being transactional (afaik still?), the need to restart Jackrabbit in the event of transient loss of connectivity to the database, etc., but these are comparatively minor.

> * scalability - has JackRabbit been proven to handle lots of
> concurrent access? Can it yet be clustered? Any equivalent to the
> replication provided by Day?

There's some work Dominique's doing now on clustering - see JCR-623 (http://issues.apache.org/jira/browse/JCR-623). In terms of concurrent simple read access, JR is pretty damned fast, so handling lots (how much are you thinking of here?) of concurrent access is unlikely to be a problem even without clustering support. For write access or versioning, etc.

>
> Any insight from developers with live systems based on JackRabbit
> would be gratefully received and provide reassurance that JackRabbit
> is a suitable choice.
>

Hope that's useful and hasn't put you off too much :-).

Miro
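
A footnote on the "artificially deepen the hierarchy" point above: one common way to do this is to derive a couple of intermediate "bucket" nodes from a hash of the item name, so that no single parent ends up with every item as a direct child. The sketch below is only an illustration, written in Python for brevity rather than against the JCR API; the function name `bucketed_path`, the choice of MD5, and the two-level, two-character layout are all assumptions, not something Jackrabbit prescribes.

```python
import hashlib


def bucketed_path(parent: str, name: str, levels: int = 2, width: int = 2) -> str:
    """Return a path for `name` nested under hash-derived intermediate nodes.

    Hypothetical helper: with width=2 each intermediate node has at most
    256 children, so a flat list of items is spread over a deeper tree.
    """
    # MD5 is used here only to spread names evenly, not for security.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Slice the hex digest into `levels` segments of `width` characters each.
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join([parent.rstrip("/"), *buckets, name])


if __name__ == "__main__":
    # Example item names are made up for illustration.
    for item in ("article-1", "article-2", "article-3"):
        print(bucketed_path("/content/articles", item))
```

The same name always maps to the same path, so an item can be looked up again without a search. The trade-off is that the intermediate nodes carry no meaning of their own; a date-based or otherwise meaningful split may suit some content better.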
