Re: my first hops

Jukka Zitting Mon, 08 Dec 2008 12:38:35 -0800

Hi,

On Mon, Dec 8, 2008 at 6:42 PM, Torsten Curdt <[EMAIL PROTECTED]> wrote:
> * Webdav and Versioning
>
> Jackrabbit comes with (IIUC even multiple) webdav implementations. The
> 1.5 standalone jar even starts up one. It exposes all the node
> information. Is there already somewhere a webdav servlet that gives
> more of a "user" view of the data? Where it does NOT show all the node
> information but just the data nodes (files/directories). Similar to
> the "browse" in the standalone jar.


The servlet you're looking for is
org.apache.jackrabbit.j2ee.SimpleWebdavServlet.

> Maybe even a webdav servlet that transparently versions changes?

It doesn't do versioning transparently, but it does support the WebDAV
versioning features.

> * Repository Browser
>
> While for webdav it would be nice to show less, it would be nice to
> show more on the 'browse' of the standalone jar. In fact switching the
> amount of information of both (webdav/browse) would be great. I know
> other 3rd parties have sophisticated browsers for JCR. But is there
> one that comes with jackrabbit that I've missed? What do people use?

No. We planned to have a content browser included already in 1.5.0
(see JCR-1455), but in the end that unfortunately didn't happen. At
Day we have the commercial CRX Content Explorer that we're planning to
contribute to Jackrabbit, but that effort is a bit stalled due to
technical and legal issues. There are also a few good open source
browsers around; I've personally used and liked the JCR Explorer
available at http://www.jcr-explorer.org/.

> * Scaling Out and SOA
>
> I am wondering what the suggested architecture would look like for
> jackrabbit in a bigger installation. The classic setup would be a
> couple of front end machines rendering the content that comes out of a
> bigger database or a database cluster. Question is how to translate
> this into a jackrabbit setup.

As you noticed, the recommended approach for now would be to use a
Jackrabbit cluster with each cluster node running locally on each
front end server (and in the same JVM process as your application).

This is mostly due to current performance limitations of the JCR-RMI
layer. There are no architectural reasons why the performance of
remote JCR access couldn't be similar (or even notably better due to
the cache-friendly design of JCR) to that of many relational
databases, but so far not much work has been done to optimize remote
access performance as the common deployment model has been to have the
repository running locally within the application or the application
server. Remote API access has mostly been used for administrative
purposes where performance is not that critical.

In fact one of our reasons for introducing the new standalone server
jar is to raise awareness of this performance issue and perhaps to
get some contributions to improve it. :-)

> Especially as RMI is hinted to be slow and also syncing the replay
> logs across the cluster is a bit of an overhead I would be grateful
> for some more details and advice here.

See above. The main reason for the current slow performance is that
the JCR-RMI layer was originally designed to map most JCR API calls
one-to-one to equivalent remote method calls with no caching or
batching features. This approach worked great in that we were able to
support almost the entire range of JCR functionality quite easily, but
it does come with quite severe performance limitations. For example,
each individual Node.getProperty() call causes a network roundtrip
instead of being executed against a locally cached copy of the node.
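
To make the roundtrip cost concrete, here is a minimal self-contained
sketch. It does not use the JCR or JCR-RMI classes at all: RemoteNode,
getAllProperties() and the counter are hypothetical stand-ins that only
simulate a network call by incrementing a number. It compares reading
properties one call at a time with fetching them in one batch and then
reading from a local copy.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only; none of these types exist in JCR or Jackrabbit. */
public class RoundTripSketch {

    /** Hypothetical remote node that counts simulated network calls. */
    static class RemoteNode {
        private final Map<String, String> properties = new HashMap<String, String>();
        int roundTrips = 0;

        RemoteNode(Map<String, String> properties) {
            this.properties.putAll(properties);
        }

        /** One simulated roundtrip per property, like a one-to-one mapping. */
        String getProperty(String name) {
            roundTrips++;
            return properties.get(name);
        }

        /** One simulated roundtrip that returns a copy of all properties. */
        Map<String, String> getAllProperties() {
            roundTrips++;
            return new HashMap<String, String>(properties);
        }
    }

    /** Reads every requested property with its own remote call. */
    static int readOneByOne(Map<String, String> data, String... names) {
        RemoteNode node = new RemoteNode(data);
        for (String name : names) {
            node.getProperty(name);
        }
        return node.roundTrips;
    }

    /** Fetches the node once, then reads from the local copy. */
    static int readBatched(Map<String, String> data, String... names) {
        RemoteNode node = new RemoteNode(data);
        Map<String, String> local = node.getAllProperties();
        for (String name : names) {
            local.get(name);
        }
        return node.roundTrips;
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<String, String>();
        data.put("jcr:title", "Hello");
        data.put("jcr:description", "World");
        data.put("author", "someone");

        String[] names = {"jcr:title", "jcr:description", "author"};
        System.out.println("one-by-one: " + readOneByOne(data, names));
        System.out.println("batched:    " + readBatched(data, names));
    }
}
```

The one-by-one count grows with the number of properties read, while
the batched count stays at one per node. A real remoting layer would
also have to deal with cache invalidation, which this sketch ignores.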

> * Searching in a Cluster
>
> Assuming I have a jackrabbit cluster - how is the index generation
> handled? Will every jackrabbit instance have its own index and also
> be the one that keeps the local index up-to-date?

Yes, each cluster node keeps its own indexes.

> Does the index get synchronized through the jackrabbit cluster
> mechanism?

Yes. The cluster nodes listen for changes recorded in the cluster
journal, and update the indexes based on the observed updates.

> * Searching and Versioning
>
> When I search and I have versioned resources, will it search all
> versions? ...or only the latest one? How is this handled?

The version histories of all versionable nodes are available in the
/jcr:system/jcr:versionStorage subtree. You can search for all past
versions in that subtree, or for the checked out versions in normal
workspace storage outside /jcr:system.
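
As a rough illustration of the two search scopes, the sketch below only
builds JCR 1.0 XPath query strings; it does not run them. It assumes
that past versions are matched through their nt:frozenNode snapshots,
and the /content path, the class name and the helper names are
hypothetical placeholders to adapt to your own content structure.

```java
/** Builds example XPath query strings; it does not execute them. */
public class VersionSearchQueries {

    /** Quotes a string literal for XPath by doubling single quotes. */
    static String quote(String text) {
        return "'" + text.replace("'", "''") + "'";
    }

    /** Full-text query over the frozen snapshots in version storage. */
    static String pastVersions(String term) {
        return "/jcr:root/jcr:system/jcr:versionStorage"
                + "//element(*, nt:frozenNode)"
                + "[jcr:contains(., " + quote(term) + ")]";
    }

    /** Full-text query over current content below a hypothetical /content. */
    static String currentContent(String term) {
        return "/jcr:root/content//*[jcr:contains(., " + quote(term) + ")]";
    }

    public static void main(String[] args) {
        System.out.println(pastVersions("jackrabbit"));
        System.out.println(currentContent("jackrabbit"));
    }
}
```

The resulting strings would normally be passed to
QueryManager.createQuery(statement, Query.XPATH). The search term is
inserted as-is apart from quote doubling, so any full-text operators
in it are left to the query engine to interpret.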

> I heard about InfoQ using jackrabbit. Could not find exact details
> about their infrastructure though.

Have you seen 
http://www.infoq.com/presentations/design-and-architecture-of-infoq
? I guess that's the best introduction there is to how they're set up.

> Someone else using it in bigger installations?

At Day we use Jackrabbit as the core of all our current products. We
do have some performance and scalability features that go beyond
what's there in Jackrabbit, but most of the customer cases you can
find on our web site are based on the clustering and bundle
persistence features that have already been in Jackrabbit for a
while.

BR,

Jukka Zitting
