Hi All,

As you know, there was some discussion on clustering Axis2 some time back. Quite a few ideas came up, especially from Rajith.

Chathura Ekanayake and I thought of bringing this discussion back to life by coming up with a proposal to enable context replication in Axis2. On top of this we could build mechanisms to support failure recovery and load balancing.

Sorry about the long email. We thought a fairly detailed description would give you a clear idea of our proposal.

The main point is to introduce an interface called ClusterManager. It defines a set of methods that an Axis2 instance working as a node in a cluster can use to share its state with the other nodes. Each implementation of ClusterManager will be built on a particular clustering technology.

At startup, Axis2 will check a property in axis2.xml to find out whether it is working as a node in a cluster. If it is, Axis2 will instantiate the ClusterManager object and call its methods whenever it needs to share information with other nodes in the cluster.

This proposal does not cover replicating OperationContexts and MessageContexts. The main reason is the cost that replicating them would incur (but I guess we can discuss that further). So here we only deal with the replication of the ConfigurationContext, ServiceGroupContexts and ServiceContexts.

Here is the ClusterManager interface that we came up with. I've explained the usage of each method after that.


------------------------------------------------------------------------------------------------

import org.apache.axis2.context.AbstractContext;
import org.apache.axis2.context.ConfigurationContext;
import org.apache.axis2.description.AxisDescription;

public interface ClusterManager {

    void init(ConfigurationContext context);
    void addContext(String contextId, String parentContextId, AbstractContext context);
    void removeContext(String contextId);
    void addProperty(String contextId, String propertyName, Object propertyValue);
    void removeProperty(String contextId, String propertyName);
    Object touchProperty(String contextId, String propertyName);
    void updateState();
    void farmDescription(AxisDescription axisDescription);
    void removeDescriptionFromFarm(String axisDescriptionName);

}

------------------------------------------------------------------------------------------------


We assume that every node in the cluster starts with the same configuration. So initially they will have similar axis2.xml files and similar repositories.

To keep the methods simple, we assume that every context comes with an id that uniquely identifies it within a given node. For a ServiceGroupContext this can be the serviceGroupContextId. For a ServiceContext this can be the combination of the service name and the id of the ServiceGroupContext it belongs to.
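To make the id scheme concrete, here is a minimal sketch of how such ids could be built. The ContextIds class, its method names and the ':' separator are only assumptions for illustration; none of them exist in Axis2.

```java
// Hypothetical helper for illustration only; not part of Axis2.
final class ContextIds {
    private ContextIds() {}

    // A ServiceGroupContext is identified by its own serviceGroupContextId.
    static String forServiceGroupContext(String serviceGroupContextId) {
        return serviceGroupContextId;
    }

    // A ServiceContext is identified by the id of its owning ServiceGroupContext
    // combined with the service name. The ':' separator is an assumption.
    static String forServiceContext(String serviceGroupContextId, String serviceName) {
        return serviceGroupContextId + ":" + serviceName;
    }
}
```

The same ServiceGroupContext id and service name always produce the same ServiceContext id, so it can be recomputed wherever those two values are known.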

The first method, 'init', performs the initialization of the current node. One of its tasks is to join the current node to the cluster.

When a new context gets added on a node, the 'addContext' method of that node's ClusterManager will be called. This method should inform the other nodes about the newly added context and ask them to add a similar context of their own.

Similarly, the 'removeContext' method will be called when a context gets removed from a node. The other nodes should be informed about this and asked to remove the context with the same id.
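The intended behaviour of 'addContext' and 'removeContext' can be sketched with an in-memory stand-in for the cluster. NodeSketch and its methods are hypothetical, and direct method calls take the place of whatever messaging a real clustering technology would use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a cluster node; not Axis2 code.
class NodeSketch {
    // contextId -> parentContextId (null for a top-level context)
    private final Map<String, String> contexts = new LinkedHashMap<>();
    private final List<NodeSketch> peers = new ArrayList<>();

    // Links two nodes so that each one replicates to the other.
    void join(NodeSketch other) {
        peers.add(other);
        other.peers.add(this);
    }

    // Called on the node where the context was created; mirrors it on every peer.
    void addContext(String contextId, String parentContextId) {
        contexts.put(contextId, parentContextId);
        for (NodeSketch peer : peers) {
            peer.contexts.put(contextId, parentContextId);
        }
    }

    // Called on the node where the context was removed; removes it on every peer.
    void removeContext(String contextId) {
        contexts.remove(contextId);
        for (NodeSketch peer : peers) {
            peer.contexts.remove(contextId);
        }
    }

    boolean hasContext(String contextId) {
        return contexts.containsKey(contextId);
    }
}
```

Peers are updated directly rather than through their own addContext, so a replicated change is not sent back to the node that originated it.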

We believe that the state of a context is represented by its property bag, so replicating the state simply means replicating the property bag of each context. The next three methods are used for this.

As their names imply, the 'addProperty' and 'removeProperty' methods should inform the other nodes to add and remove properties respectively.

When the state of a property is suspected to have changed, that property should be touched. The 'touchProperty' method is used for this. For example, when the user asks for a certain property from the property bag (using the getProperty method of the AbstractContext), there is a chance that the state of that property will be changed. Properties like that should be touched.

When the 'updateState' method is called, the state of all touched properties (as defined in the previous paragraph) should be updated on all other nodes of the cluster.
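The touch-then-update idea can be sketched as follows. TouchTrackingSketch and its Sender callback are hypothetical names, and it is only an assumption here that 'touchProperty' returns the current value of the property. A real implementation would send the updates through the underlying clustering technology.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Axis2 code: remembers touched properties and
// pushes their current values to the other nodes on updateState().
class TouchTrackingSketch {
    // Stand-in for the transport to the other nodes (an assumption).
    interface Sender {
        void send(String contextId, String propertyName, Object value);
    }

    private final Map<String, Map<String, Object>> bags = new HashMap<>();
    private final Set<List<String>> touched = new LinkedHashSet<>();
    private final Sender sender;

    TouchTrackingSketch(Sender sender) {
        this.sender = sender;
    }

    // Stores the property locally (replication of the add is left out of this sketch).
    void addProperty(String contextId, String propertyName, Object value) {
        bags.computeIfAbsent(contextId, k -> new HashMap<>()).put(propertyName, value);
    }

    // Marks the property as possibly modified and returns its current value.
    Object touchProperty(String contextId, String propertyName) {
        touched.add(Arrays.asList(contextId, propertyName));
        Map<String, Object> bag = bags.get(contextId);
        return bag == null ? null : bag.get(propertyName);
    }

    // Sends the current value of every touched property, then clears the touched set.
    void updateState() {
        for (List<String> key : touched) {
            Map<String, Object> bag = bags.get(key.get(0));
            if (bag != null && bag.containsKey(key.get(1))) {
                sender.send(key.get(0), key.get(1), bag.get(key.get(1)));
            }
        }
        touched.clear();
    }
}
```

Only touched properties are sent, so properties that were never read or modified cost nothing at update time.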

The last two methods support the farming concept in Axis2. Using these methods, users will be able to deploy and undeploy services on all the nodes in the cluster by performing the operation on a single node. For example, when the user adds a new service to node1, it will be farmed to all other nodes as well. Likewise, a service can be removed from any node, which will ask all other nodes to remove that service from their repositories.

Support for Load Balancing and failure detection
---------------------------------------------------------------------
As Synapse is looking into implementing several load balancing algorithms, we thought it would be better to use that feature than to rebuild it. So the head node in an Axis2 cluster will be an Axis2+Synapse combination.

To make this more efficient, we can add support for detecting node failures. For this, a special module will be engaged in the head node. This module will monitor the heartbeats sent by the other nodes to detect failures. Failed nodes will be removed from the URL list of the Synapse load balancer.
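A minimal sketch of the bookkeeping such a module might do is given below. HeartbeatMonitor, its methods and the timeout rule are all assumptions; it does not use any Axis2 or Synapse API, and the caller would be responsible for passing the failed URLs on to the load balancer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the head-node failure detector; not an Axis2 or Synapse API.
class HeartbeatMonitor {
    // node URL -> time (in milliseconds) of the last heartbeat received
    private final Map<String, Long> lastSeenMillis = new LinkedHashMap<>();
    private final long timeoutMillis;

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat arrives from the node at the given URL.
    void heartbeat(String nodeUrl, long nowMillis) {
        lastSeenMillis.put(nodeUrl, nowMillis);
    }

    // Drops every node whose last heartbeat is older than the timeout
    // and returns the URLs of the dropped nodes.
    List<String> removeFailed(long nowMillis) {
        List<String> failed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeenMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                failed.add(entry.getKey());
                it.remove();
            }
        }
        return failed;
    }

    // URLs that the load balancer may still route requests to.
    List<String> liveNodes() {
        return new ArrayList<>(lastSeenMillis.keySet());
    }
}
```

Time is passed in explicitly instead of being read from the system clock, which keeps the sketch easy to test.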

We believe many of the ideas above need more discussion. As far as I know, Rajith and some other Axis2 committers are coming to Sri Lanka to participate in ApacheCon Asia. Maybe we can discuss more at the Axis2 hackathon there.


Chamikara
