Hi,

On Mon, Dec 8, 2008 at 6:42 PM, Torsten Curdt <[EMAIL PROTECTED]> wrote:
> * Webdav and Versioning
>
> Jackrabbit comes with (IIUC even multiple) webdav implementations. The
> 1.5 standalone jar even starts up one. It exposes all the node
> information. Is there already somewhere a webdav servlet that gives
> more of a "user" view of the data? Where it does NOT show all the node
> information but just the data nodes (files/directories). Similar to
> the "browse" in the standalone jar.

The servlet you're looking for is
org.apache.jackrabbit.j2ee.SimpleWebdavServlet.

> Maybe even a webdav servlet that transparently versions changes?

It doesn't do versioning transparently, but it does support the WebDAV
versioning features.

> * Repository Browser
>
> While for webdav it would be nice to show less, it would be nice to
> show more on the 'browse' of the standalone jar. In fact switching the
> amount of information of both (webdav/browse) would be great. I know
> other 3rd parties have sophisticated browsers for JCR. But is there
> one that comes with jackrabbit that I've missed? What do people use?

No. We planned to have a content browser included already in 1.5.0
(see JCR-1455), but in the end that unfortunately didn't happen. At
Day we have the commercial CRX Content Explorer that we're planning to
contribute to Jackrabbit, but that effort is a bit stalled due to
technical and legal issues. There are also a few good open source
browsers around; I've personally used and liked the JCR Explorer
available at http://www.jcr-explorer.org/.

> * Scaling Out and SOA
>
> I am wondering what the suggested architecture would look like for
> jackrabbit in a bigger installation. The classic setup would be a
> couple of front end machines rendering the content that comes out of a
> bigger database or a database cluster. Question is how to translate
> this into a jackrabbit setup.

As you noticed, the recommended approach for now would be to use a
Jackrabbit cluster with each cluster node running locally on each
front end server (and in the same JVM process as your application).
This is mostly due to current performance limitations of the JCR-RMI
layer.

There are no architectural reasons why the performance of remote JCR
access couldn't be similar (or even notably better due to the
cache-friendly design of JCR) to that of many relational databases,
but so far not much work has been done to optimize remote access
performance, as the common deployment model has been to have the
repository running locally within the application or the application
server. Remote API access has mostly been used for administrative
purposes where performance is not that critical.

In fact one of our reasons for introducing the new standalone server
jar is to raise awareness of this performance issue and to perhaps
get some contributions to improve it. :-)

> Especially as RMI is hinted to be slow and also syncing the replay
> logs across the cluster is a bit of an overhead I would be grateful
> for some more details and advice here.

See above. The main reason for the current slow performance is that
the JCR-RMI layer was originally designed to map most JCR API calls
one-to-one to equivalent remote method calls with no caching or
batching features. This approach worked great in that we were able to
support almost the entire range of JCR functionality quite easily, but
it does come with quite severe performance limitations: for example,
each individual Node.getProperty() call causes a network roundtrip
instead of being executed against a locally cached copy of the node.

> * Searching in a Cluster
>
> Assuming I have a jackrabbit cluster - how is the index generation
> handled? Will every jackrabbit instance have its own index and also
> be the one that keeps the local index up-to-date?

Yes, each cluster node keeps its own indexes.

> Does the index get synchronized through the jackrabbit cluster
> mechanism?

Yes. The cluster nodes listen for changes recorded in the cluster
journal, and update the indexes based on the observed updates.

> * Searching and Versioning
>
> When I search and I have versioned resources. Will it search all
> versions?
> ...or only the latest one? How is this handled?

The version histories of all versionable nodes are available in the
/jcr:system/jcr:versionStorage subtree. You can search for all past
versions in that subtree, or for the checked out versions in normal
workspace storage outside /jcr:system.

> I heard about InfoQ using jackrabbit. Could not find exact details
> about their infrastructure though.

Have you seen
http://www.infoq.com/presentations/design-and-architecture-of-infoq ?
I guess that's the best introduction there is to how they're set up.

> Someone else using it in bigger installations?

At Day we use Jackrabbit as the core of all our current products. We
do have some performance and scalability features that go beyond
what's there in Jackrabbit, but most of the customer cases you can
find on our web site are based on the clustering and bundle
persistence features that have also been in Jackrabbit for quite a
while already.

BR,

Jukka Zitting
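
To make the roundtrip point above concrete, here is a minimal sketch
of why a one-to-one mapping of API calls to remote calls gets
expensive. It does not use the JCR or JCR-RMI APIs at all: the
RemoteRepository class, its fetchNode() method, and the property names
are hypothetical stand-ins, and a "roundtrip" is just a counter. It
only compares reading N properties one call at a time against fetching
the node once and reading from a local copy.

```java
import java.util.HashMap;
import java.util.Map;

public class RoundtripSketch {

    /** Hypothetical stand-in for a remote repository; counts simulated network roundtrips. */
    static class RemoteRepository {
        private final Map<String, String> properties = new HashMap<>();
        int roundtrips = 0;

        RemoteRepository() {
            // Illustrative property values only.
            properties.put("jcr:title", "Hello");
            properties.put("jcr:description", "Example node");
            properties.put("jcr:language", "en");
        }

        /** One simulated remote call per property (the one-to-one mapping). */
        String getProperty(String name) {
            roundtrips++;
            return properties.get(name);
        }

        /** One simulated remote call that ships the whole node (assumed batching approach). */
        Map<String, String> fetchNode() {
            roundtrips++;
            return new HashMap<>(properties);
        }
    }

    /** Reads each property through its own remote call; returns the roundtrip count. */
    static int readOneToOne(RemoteRepository repo, String[] names) {
        for (String name : names) {
            repo.getProperty(name);
        }
        return repo.roundtrips;
    }

    /** Fetches the node once, then reads every property from the local copy. */
    static int readCached(RemoteRepository repo, String[] names) {
        Map<String, String> localCopy = repo.fetchNode();
        for (String name : names) {
            localCopy.get(name); // served locally, no roundtrip
        }
        return repo.roundtrips;
    }

    public static void main(String[] args) {
        String[] names = {"jcr:title", "jcr:description", "jcr:language"};
        System.out.println("one-to-one roundtrips: "
                + readOneToOne(new RemoteRepository(), names));
        System.out.println("cached roundtrips: "
                + readCached(new RemoteRepository(), names));
    }
}
```

With three properties the first variant counts three roundtrips and
the second counts one. The gap grows linearly with the number of
properties read, which is why caching or batching would matter for a
remote access layer.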
