Hi David, Hi Serge,
cool. i am currently trying to get at least a common .NET port
of the API put together in jackrabbit (just like markus did for PHP).
are you interested in helping with that?
i think a .NET client using the WebDAV JCR remoting could
be a very interesting option.
http://www.day.com/jsr170/server/JCR_Webdav_Protocol.zip
yes I am interested... is there already some code, or how do we begin?
filesystem vs database:
I see the advantages of both ways but think that database storage is
- easier to sell to a customer, because customers have trusted
databases for decades now
- easier to back up: there are many solutions out there, and the
databases at customer sites are already backed up - so no extra effort
- more scalable: databases have been tuned for large amounts of data
(especially small entities; we all know that BLOBs kill a DBMS)
I would be fine with filesystem storage if replication (fully
transactional across the cluster) is available. But this has to be
totally transparent to the JCR client.
I understand the deployment with multiple JCR repositories, each holding
a subset of data for a specific user group, plus some shared, replicated
data that does not change frequently. But to support this, you have to
group users, which is extremely hard, especially in our planned application.
There would be a hybrid solution too: store structure info and
attributes in a DBMS and BLOBs in the filesystem. The project "daisy"
uses exactly this approach. (http://new.cocoondev.org/daisy/index.html)
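To make the hybrid idea concrete, a rough sketch of how such a store could route values by size (everything here is invented for illustration - the maps merely stand in for real DBMS rows and filesystem writes, and the threshold is an arbitrary assumption):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hybrid store: small values go into the DBMS, large
// binaries go to the filesystem with only their location in the DB.
class HybridStore {
    private final Map<String, String> dbRows = new HashMap<String, String>();   // stands in for DBMS rows
    private final Map<String, byte[]> blobFiles = new HashMap<String, byte[]>(); // stands in for the filesystem

    private static final int BLOB_THRESHOLD = 4096; // bytes; tune per deployment

    void store(String id, byte[] value) {
        if (value.length >= BLOB_THRESHOLD) {
            String path = "blobs/" + id + ".bin";
            blobFiles.put(path, value);
            dbRows.put(id, "file:" + path);     // DB row only holds a pointer
        } else {
            dbRows.put(id, new String(value));  // small value stored inline
        }
    }

    boolean storedInDb(String id) {
        String v = dbRows.get(id);
        return v != null && !v.startsWith("file:");
    }
}
```

The point is that queries over structure and attributes stay in the DBMS, while the BLOBs that would hurt it live outside.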
Are there any efforts to make jackrabbit clustered for a load-sharing
scenario (no session failover at the repository layer)?
i think there are a couple of caches that need to be made
clusterable (or at least pluggable) in the jackrabbit core for
that to happen efficiently. it has to be done very carefully,
but it should not be too much work i think.
this is definitely on the roadmap and investigations in that
direction have already happened.
is there any information around about these investigations?
From what I have seen, making the cache implementation pluggable
would be a necessary first step. It then becomes possible to
use OSCache, JBossTreeCache or Tangosol Coherence, which all handle
clustered caches.
I have been thinking about the same approach. I like the plugin concept
because it lets you tweak jackrabbit to the situation at hand.
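To illustrate the pluggable-cache idea, a minimal sketch of what such an SPI might look like - none of these names exist in the actual jackrabbit codebase; a clustered deployment would swap the default implementation for an adapter over OSCache, JBossTreeCache or Coherence:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical cache SPI - the one extra method a clustered
// implementation needs is evict(), driven by remote invalidations.
interface ItemCache {
    Object get(String itemId);
    void put(String itemId, Object state);
    void evict(String itemId); // called on remote invalidation in a cluster
}

// Default in-process implementation with a crude LRU bound.
class SimpleItemCache implements ItemCache {
    private final Map<String, Object> map =
        new LinkedHashMap<String, Object>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, Object> e) {
                return size() > 1000; // arbitrary cap for the sketch
            }
        };
    public synchronized Object get(String itemId) { return map.get(itemId); }
    public synchronized void put(String itemId, Object state) { map.put(itemId, state); }
    public synchronized void evict(String itemId) { map.remove(itemId); }
}
```

With an interface like this in place, the cluster question reduces to writing one adapter per caching product.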
After reading a lot of code, I think the following changes should do it:
- extending ObservationManager to send and receive Events to
and from other nodes
maybe... personally i would like to have that functionality closer
to the core, to keep things as transactional as possible across
the cluster.
It's ok - the closer to the core, the more transparent the solution is
for other parts of jackrabbit. What would you recommend?
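As a thought experiment, the event forwarding between nodes could look roughly like this (ChangeEvent, ClusterTransport and ClusteredDispatcher are invented names, not jackrabbit classes; the real observation API is javax.jcr.observation.ObservationManager). The important detail is that remote events get applied but never re-broadcast, otherwise they would loop forever between nodes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical change notification carried between cluster nodes.
class ChangeEvent {
    final String path; final int type;
    ChangeEvent(String path, int type) { this.path = path; this.type = type; }
}

// Abstraction over the wire protocol (JMS, multicast, ...).
interface ClusterTransport {
    void broadcast(ChangeEvent e); // send to all other nodes
}

class ClusteredDispatcher {
    private final ClusterTransport transport;
    private final List<ChangeEvent> applied = new ArrayList<ChangeEvent>();

    ClusteredDispatcher(ClusterTransport transport) { this.transport = transport; }

    // after a local save(): apply locally, then tell the peers
    void dispatchLocal(ChangeEvent e) {
        applied.add(e);
        transport.broadcast(e);
    }

    // called by the transport when a peer broadcasts a change;
    // applied (e.g. cache invalidation) but NOT re-broadcast
    void dispatchRemote(ChangeEvent e) {
        applied.add(e);
    }

    int appliedCount() { return applied.size(); }
}
```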
- implementing/extending an ORM layer (Hibernate with shared caching for
performance). The persistence implementation should be aware of the
node types and allow a type-specific mapping to tables. That way we can
map nodetypes with many instances to their own tables while keeping
flexibility for new "simple" nodetypes.
i think that you may get a better performance impact by implementing
the shared cache at a higher layer of the jackrabbit architecture.
on a completely different note, some people would probably also like to
map nodetypes to tables for "aesthetic" reasons...
One quick note about the current ORM implementation: the current
implementation that I've worked on with Jackrabbit can be improved.
Feel free to have a look and contribute! But what David is saying
is true: for performance, the higher you can cache, the better!
I am glad that you already invested so much time in a base I can work
on. I like your solution but would prefer making the mapping
configurable on a per-NodeType basis. I just started working on this.
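The per-NodeType mapping could be as simple as a registry with a generic fallback table - all names here are invented for illustration, this is not the existing ORM code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: frequently instantiated nodetypes get their
// own table, everything else falls back to one generic item table.
class NodeTypeTableMapping {
    private static final String DEFAULT_TABLE = "JCR_ITEM";
    private final Map<String, String> tables = new HashMap<String, String>();

    // e.g. map("my:article", "ARTICLE") for a high-volume nodetype
    void map(String nodeType, String table) { tables.put(nodeType, table); }

    String tableFor(String nodeType) {
        String t = tables.get(nodeType);
        return t != null ? t : DEFAULT_TABLE; // "simple" types stay generic
    }
}
```

The registry itself would be populated from the repository configuration, so adding a new simple nodetype needs no schema change.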
What else should be synchronized between the nodes?
Did I overlook something?
i think this list sounds like a good start...
Can someone explain the decision-making process in the project to me?
How do we arrive at a proposal for these modifications?
cheers,
Walter