Hi,

On Mon, Dec 8, 2008 at 11:24 PM, Torsten Curdt <[EMAIL PROTECTED]> wrote:
>>> Maybe even a webdav servlet that transparently versions changes?
>>
>> It doesn't do versioning transparently, but it does support the WebDAV
>> versioning features.
>
> Hm ...so how would that work if you use the standard OSX/Windows
> client and you just mount the repository.
> Would it version the files or not?
No, you'll need a versioning-aware client that explicitly invokes the
CHECKIN/CHECKOUT methods.

> I remember there is such autoversioning option to
> mod_dav (SVNAutoversioning)

We don't support that out of the box, but if you need that
functionality it should be reasonably straightforward to implement by
subclassing the WebDAV servlet and adding the extra versioning calls
around normal write operations.

>> There are also a few good open source
>> browsers around, I've personally used and liked the JCR Explorer
>> available at http://www.jcr-explorer.org/.
>
> That one looks indeed quite good. It's ASL 2.0 - why not include that
> if there are problems with the CRX one?
> IMO it would be a big step forward to have something like that out of
> the box.

Yeah, I guess we should do that.

> (Still congrats on the standalone jar ... that is pretty sweet!)

Thanks. :-)

>> As you noticed, the recommended approach for now would be to use a
>> Jackrabbit cluster with each cluster node running locally on each
>> front end server (and in the same JVM process as your application).
>
> OK ... what about the persistence part? I know CRX has the mighty Tar
> PM :) ...but what about scaling at this end? Has this ever been a
> problem? If you have a cluster of 5-10 machines and just a single
> database for persistence I would imagine this could potentially
> become a bottleneck. Anyone ever used a whole database cluster for
> persistence?

You'll typically want a clustered database as the backend storage for
the best fault tolerance and scalability. We've used such setups quite
often and they work great.

> Any suggestions there? I might "have to" use an Oracle.

Most of the customer projects I've done already have a "company
standard" database backend, so typically you use the database that's
already there. The way Jackrabbit stores content below the
persistence manager layer is quite simple (we don't even need JOINs!),
so any modern database will probably do just fine.
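For what it's worth, pointing a workspace at an Oracle backend with
bundle persistence looks roughly like this in the workspace
configuration (a sketch only; the connection URL, user, password and
prefix below are placeholders, so double-check the class and parameter
names against your Jackrabbit version):

```xml
<!-- Workspace persistence via Oracle bundle persistence (sketch).
     All connection values are placeholders. -->
<PersistenceManager
    class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
  <param name="url" value="jdbc:oracle:thin:@dbhost:1521:orcl"/>
  <param name="user" value="jackrabbit"/>
  <param name="password" value="secret"/>
  <param name="schemaObjectPrefix" value="${wsp.name}_"/>
</PersistenceManager>
```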
Take whatever you are most comfortable with. Note that for
repositories with lots of large binaries I would suggest using the
data store feature backed by a shared disk (NAS or SAN), as that will
decouple all the costly binary accesses from the database.

> My first thought was: shouldn't the JCR server just have a REST API?
> ...and then thought of Sling. And CouchDB. Or probably much more
> FeatherDB
> (http://fourspaces.com/blog/2008/4/11/FeatherDB_Java_JSON_Document_database)
>
> How this fits the picture is probably more something for the dev list.

Yeah, it's still an area of development. If you're interested, you may
want to check out the spi2dav effort in the Jackrabbit sandbox where
we're building a remoting mechanism for the full JCR API based on the
WebDAV protocol. See http://jackrabbit.apache.org/JCR_Webdav_Protocol.doc
for an earlier draft of the protocol details.

> I was actually surprised about the choice of RMI anyway.
> (Forgive my words - but it's a bitch of a protocol)

The rationale for going with RMI originally was to get something
reasonably complete done quickly and easily. That approach actually
worked much better than I had originally hoped, and we were able to
cover almost the entire JCR API (which is not too small) with
relatively little effort. In that sense I'm pretty happy with our use
of RMI, but of course that simplicity comes with limitations.

>>> Does the index get synchronized through the jackrabbit cluster
>>> mechanism?
>>
>> Yes. The cluster nodes listen for changes recorded in the cluster
>> journal, and update the indexes based on the observed updates.
>
> Incrementally? Are there any guarantees for the observation? I just
> imagine a node to go down, miss an update and be out of sync when it
> comes back up. Something you really don't want to have in a cluster.

The journal keeps the update records until all cluster nodes have seen
them (see JCR-1087), so you'll never miss updates.
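To make the journal part concrete, each cluster node declares itself
and the shared journal in its repository.xml along these lines (again
just a sketch; the node id, sync delay and connection settings are
placeholders, and the exact parameters depend on your version):

```xml
<!-- Per-node cluster configuration with a shared database journal (sketch).
     The id must be unique per cluster node; connection values are placeholders. -->
<Cluster id="node1" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision.log"/>
    <param name="url" value="jdbc:oracle:thin:@dbhost:1521:orcl"/>
    <param name="user" value="jackrabbit"/>
    <param name="password" value="secret"/>
  </Journal>
</Cluster>
```

All nodes write their changes to the same journal table, and each node
tracks its own local revision, which is how a node that was down can
catch up on the updates it missed.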
>> The version histories of all versionable nodes are available in the
>> /jcr:system/jcr:versionStorage subtree. You can search for all past
>> versions in that subtree, or for the checked out versions in normal
>> workspace storage outside /jcr:system.
>
> So the index includes and references all versions?

Yes.

> "bundle persistence features"? WDYM?

See JCR-755, introduced in Jackrabbit 1.3. "Bundle persistence" is
currently the recommended and default persistence mechanism in
Jackrabbit. It essentially stores each node as a "bundle" that
contains all the properties and child node references associated with
that node. Previously we used separate records for all nodes *and*
properties, but that turned out to cause way too many calls to the
backend database or file system. The bundle approach seems to be the
right level of granularity for JCR (though you may want to look up the
NGP discussions on dev@ about potential alternatives) and it has
worked pretty well so far.

BR,

Jukka Zitting
