On Tue, Aug 7, 2012 at 4:07 PM, Jukka Zitting <[email protected]> wrote:
> Hi,
>
> [Just throwing an idea around, no active plans for further work on this.]
>
> One of the biggest performance bottlenecks with current repository
> implementations is disk speed, especially seek times but also raw data
> transfer rate in many cases. To work around those limitations we've in
> Jackrabbit used various caching strategies that considerably
> complicate the codebase and still have trouble with cache misses and
> write-through performance.
>
> As an alternative to such designs, I was thinking of a microkernel
> implementation that would keep the *entire* tree structure in memory,
> i.e. only use the disk or another backend for binaries and possibly
> for periodic backup dumps. Fault tolerance against hardware failures
> or other restarts would be achieved by requiring a clustered
> deployment where all content is kept as copies on at least three
> separate physical servers. Redis (http://redis.io/) is a good example
> of the potential performance gains of such a design.
>
> To estimate how much memory such a model would need, I looked at the
> average bundle size of a vanilla CQ5 installation. There the average
> bundle (i.e. a node with all its properties and child node references)
> size is just 251 bytes. Even assuming larger bundles and some level of
> storage and index overhead it seems safe to assume up to about 1kB of
> memory per node on average. That would allow one to store some 1M
> nodes in each 1GB of memory.
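The 1M-nodes-per-GB figure above follows directly from the 1kB budget; here is a quick sketch of the arithmetic (the 251-byte average bundle size is from the mail, the ~4x overhead factor is just what the 1kB budget implies):

```java
public class NodeMemoryEstimate {
    public static void main(String[] args) {
        long avgBundleBytes = 251;           // measured average bundle size (vanilla CQ5)
        long budgetPerNode = 1024;           // conservative 1kB budget incl. storage/index overhead
        long oneGiB = 1024L * 1024 * 1024;

        long nodesPerGiB = oneGiB / budgetPerNode;
        double overheadFactor = budgetPerNode / (double) avgBundleBytes;

        System.out.println("Nodes per GiB at 1kB/node: " + nodesPerGiB);      // ~1M nodes
        System.out.println("Implied overhead factor over the 251B average: "
                + String.format("%.1f", overheadFactor) + "x");
    }
}
```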
>
> Assuming that all content is evenly spread across the cluster in a way
> that puts copies of each individual bundle on at least three different
> cluster nodes and that each cluster node additionally keeps a large
> cache of most frequently accessed content, a large repository with
> 100+M content nodes could easily run on a twelve-node cluster where
> each cluster node has 32GB RAM, a reasonable size for a modern server
> (also available from EC2 as m2.2xlarge). A mid-size repository with
> 10+M content nodes could run on a three- or four-node cluster with
> just 16GB RAM per cluster node (or m2.xlarge in EC2).
>
> I believe such a microkernel could set a pretty high bar on
> performance! The only major performance limit I foresee is the network
> overhead when writing (need to send updates to other cluster nodes)
> and during cache misses (need to retrieve data from other nodes), but
> the cache misses would only start affecting repositories that go
> beyond what fits in memory on a single server (i.e. the mid-size
> repository described above wouldn't yet be hit by that limit) and the
> write overhead could be amortized by allowing the nodes to temporarily
> diverge until they have a chance to sync up again in the background
> (as allowed by the MK contract).
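The cluster sizing above checks out under the same assumptions: 1kB per node, every bundle held on three servers. A sketch of the calculation (headroom for caches and process overhead is deliberately not modeled, which is why the clusters come out comfortably larger than the minimum):

```java
public class ClusterSizing {
    /** Total RAM (GiB, rounded up) to hold every bundle on three servers at ~1kB per node. */
    static long requiredGiB(long contentNodes, int replicas) {
        long bytes = contentNodes * 1024L * replicas;
        return (bytes + (1L << 30) - 1) >> 30;   // round up to whole GiB
    }

    public static void main(String[] args) {
        // Large repository: 100M nodes vs. a twelve-node cluster with 32GiB each
        System.out.println("100M nodes need " + requiredGiB(100_000_000L, 3)
                + " GiB; twelve 32GiB servers offer " + 12 * 32 + " GiB");
        // Mid-size repository: 10M nodes vs. a three-node cluster with 16GiB each
        System.out.println("10M nodes need " + requiredGiB(10_000_000L, 3)
                + " GiB; three 16GiB servers offer " + 3 * 16 + " GiB");
    }
}
```

In both cases the raw replicated data fits with room left over for the "large cache of most frequently accessed content" the mail mentions.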
I've been thinking about similar in-memory strategies ;-) especially about leveraging one of the existing in-memory data grid solutions (Hazelcast, Infinispan, etc.), which can take care of most of the clustering details as well as the long-term/backup/disk storage. I think this is also what ModeShape 3.0 is doing with Infinispan [1].

But I guess that would require a different architecture than the one now implemented in Oak, which has more of a distributed (git-like) structure with options to scale out via sharding (I might be wrong on this point, since I'm not yet that familiar with the architecture).

Regards,

Bart

[1] http://planet.jboss.org/post/modeshape_3_0_alpha1_is_here_and_it_rocks
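Whether done by a data grid or by the microkernel itself, the "copies on at least three separate physical servers" requirement boils down to a deterministic replica-placement function. A minimal sketch using rendezvous (highest-random-weight) hashing from the plain JDK; the class and method names are hypothetical, not any actual Hazelcast or Infinispan API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: pick which cluster nodes should hold a given bundle. */
public class ReplicaPlacement {

    /**
     * Rendezvous hashing: every server is ranked by hash(bundleId, server) and the
     * top N win. Deterministic, needs no central directory, and removing one server
     * only moves the bundles that server held.
     */
    static List<String> replicasFor(String bundleId, List<String> servers, int copies) {
        List<String> ranked = new ArrayList<>(servers);
        ranked.sort(Comparator
                .comparingInt((String s) -> (bundleId + "@" + s).hashCode())
                .reversed());
        return ranked.subList(0, Math.min(copies, ranked.size()));
    }

    public static void main(String[] args) {
        List<String> cluster = List.of("node-1", "node-2", "node-3", "node-4");
        // Same bundle id always maps to the same three servers:
        System.out.println(replicasFor("/content/site/en", cluster, 3));
    }
}
```

A data grid would hide this behind its partitioning service; the point of the sketch is only that three-way placement is cheap to compute on every cluster member independently.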
