To all Jahia developers,
In Jahia 4.0.5, we did a complete code audit to check for clustering issues. As this was not the most fun exercise I've ever done in my life, I'd like to avoid having to do it again :) So here are some guidelines I would like everyone to respect from now on, to guarantee that we stay compatible with clustering and that you can offer clients this kind of deployment as an option for their installations.
Point 1 : Make ALL your data objects serializable
---------------------------------------------------
In Jahia 4.0.5 and up, caches and sessions require the objects inserted into them to be serializable. We're not quite there yet with session objects, but that is the direction we are heading in. The web applications will also need to be revised so that they comply with this new restriction. Basically all classes inserted into an HttpSession or a Cache MUST now implement the java.io.Serializable interface. Beware that this also means that objects within objects must be serializable too, or be excluded from serialization through custom readObject() and writeObject() methods. For cache objects this has already been done throughout Jahia, and session objects will be done in the near future.
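To make the "objects within objects" point concrete, here is a minimal sketch using only the standard java.io API. The class names (UserPreferences, LabelFormatter, SerializationDemo) are made up for illustration and are not Jahia classes. The non-serializable member is marked transient and rebuilt in readObject(), and the round trip through a byte array stands in for what a cache or session replication would do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper that is deliberately NOT Serializable.
class LabelFormatter {
    String format(String value) {
        return "[" + value + "]";
    }
}

// Hypothetical object destined for an HttpSession or a cache.
class UserPreferences implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String language;
    // transient: skipped by default serialization, rebuilt in readObject().
    private transient LabelFormatter formatter = new LabelFormatter();

    UserPreferences(String language) {
        this.language = language;
    }

    String describe() {
        return formatter.format(language);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes language, skips the transient field
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Field initializers do not run on deserialization, so restore it here.
        formatter = new LabelFormatter();
    }
}

class SerializationDemo {
    // Serializes and deserializes a value, as a replicated cache would.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Object is not serializable", e);
        }
    }

    public static void main(String[] args) {
        UserPreferences copy = roundTrip(new UserPreferences("fr"));
        System.out.println(copy.describe()); // prints "[fr]"
    }
}
```

Running a round trip like this in a unit test is a cheap way to catch a NotSerializableException before the object ever reaches a clustered cache.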
Point 2 : Cache objects should always be updated upon modification.
----------------------------------------------------------------------
I already mentioned this point in a previous email. Since we are now using a cache system that transmits both the KEY and the VALUE of the cache entry, the cache must know about value modifications. So basically the code should look something like this :
Object objectValue = cache.get(objectKey);
/* perform objectValue modifications */
cache.put(objectKey, objectValue);
This makes sure that all the nodes in the cluster will have an updated objectValue for the corresponding objectKey.
Point 3 : Beware of object relationships.
-----------------------------------------
One of the really tricky consequences of transmitting both the key and the value is that we have a problem in the following scenario. Let's say we have two classes : class1 and class2. class1 contains a variable of type class2. So we have
class1 -> class2
Now if we only cache class1 type objects, we have no problem. If we cache class2 objects we will have a problem if class1 objects are long lived, for example if they are cached. For simplicity, let's imagine that class1 and class2 are cached. If we modify class2 on node1, and send the value through the network, node2 will still have a class1 object POINTING TO THE OLD instance of class2. The new value of class2 is correctly inserted in node2's cache, but we also need to update the "parent" object. In Jahia 4.0.5 this is currently done by flushing the parent object manually when these cases are problematic.
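The scenario above can be reproduced in a few lines. This is only a sketch under stated assumptions: two plain HashMaps stand in for the caches of node1 and node2, a serialization round trip stands in for the network transfer, and Page and Template are hypothetical names playing the roles of class1 and class2. None of this is the actual Jahia cache API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Plays the role of class2.
class Template implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    Template(String name) { this.name = name; }
}

// Plays the role of class1: it holds a reference to a class2 object.
class Page implements Serializable {
    private static final long serialVersionUID = 1L;
    final Template template;
    Page(Template template) { this.template = template; }
}

class StaleReferenceDemo {
    // Stand-in for sending a value over the network to another node.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T transmit(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns { name seen through node2's Page, name in node2's cache }.
    static String[] demo() {
        Map<String, Object> node1 = new HashMap<>();
        Map<String, Object> node2 = new HashMap<>();

        // Node1 caches a template and a page pointing to it.
        Template t1 = new Template("old");
        node1.put("template", t1);
        node1.put("page", new Page(t1));

        // Node2 has its own instances; its page points to its cached template.
        Template t2 = transmit(t1);
        node2.put("template", t2);
        node2.put("page", new Page(t2));

        // Modify on node1, put it back, and replicate the new value to node2.
        t1.name = "new";
        node1.put("template", t1);
        node2.put("template", transmit(t1));

        String viaPage = ((Page) node2.get("page")).template.name;
        String viaCache = ((Template) node2.get("template")).name;

        // Workaround described above: flush the parent so it gets rebuilt.
        node2.remove("page");
        return new String[] { viaPage, viaCache };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("via page:  " + seen[0]); // prints "via page:  old"
        System.out.println("via cache: " + seen[1]); // prints "via cache: new"
    }
}
```

The cache entry on node2 is up to date, but the Page cached there still holds the Template instance it was built with, which is why the parent has to be flushed.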
This is a difficult problem that could possibly be solved by using Interceptors or other types of proxies, but please be aware of it when modifying Jahia code, as debugging this type of problem is tricky.
Point 4 : Avoid file storage as much as possible
-----------------------------------------------
File storage and access are a problem in a clustered environment, mostly because of locking issues and the need to share directories between nodes. If possible, always try to go through either the database or a network communication system (I'm currently looking at building a JGroups service in Jahia for this type of communication). Anyway the rule of thumb should be :
If I can avoid doing it with a file I should :)
I will be reviewing contributions for these types of problems, but I think that presenting this in the open is the best course of action. Also there might be some solutions I have overlooked and any feedback on these points would be welcome.
Regards, Serge Huber.
