+1 .. basically we need to decide at what level to cluster for high
availability. The proposal is that we do it at operation boundaries ..
that is, if things fall apart in the midst of an operation, we can't do
anything about it. If you need that level of reliability then you
should be using Sandesha to give you WS-RM (or a transaction).

Anyway, that's the proposal .. it's certainly possible to pick a
different cut-off point, but I agree with Chamikara that it's not
possible to go all the way and still be efficient.

In any case, I don't believe the underlying architecture and the
abstraction that's proposed will change even if we decide to cluster
operation contexts as well.

Sanjiva.
On Fri, 2006-08-11 at 05:52 +0530, Chamikara Jayalath wrote:
> Hi Chinthaka,
>
> The main reason for not doing this was the cost it would incur.
> Think of a cluster with 5 machines, each getting 5 requests per
> second. We would be trying to replicate 25 MessageContexts every
> second. If all of these are InOnly requests, that would require the
> replication of 25 OperationContexts as well. So to support this kind
> of scenario we would have to replicate 50 contexts every second,
> which looks too costly.
>
> On the other hand, what would we lose if we declared that we do not
> replicate Operation and Message contexts? It would simply require
> that all messages belonging to the same MEP be routed to the same
> node. If that node fails, that MEP will be lost, which seems
> acceptable to me. This routing can be done by keeping a RelatesTo
> HashMap at the head node, as sketched below.
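
For concreteness, a minimal sketch of such a RelatesTo map at the head
node; the class and method names are made up for illustration and are
not part of the proposal:

import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sticky-routing table: a message whose RelatesTo points at
// an existing MEP goes back to the node already holding that MEP's state.
public class RelatesToRouter {

    // RelatesTo message id -> URL of the node handling that MEP
    private final Map<String, String> mepOwners =
            new ConcurrentHashMap<String, String>();

    // Record which node a new MEP was dispatched to.
    public void remember(String messageId, String nodeUrl) {
        mepOwners.put(messageId, nodeUrl);
    }

    // Route a follow-up message; fall back to the load balancer's pick
    // when the message starts a new MEP.
    public String route(String relatesTo, String balancerPick) {
        if (relatesTo == null) {
            return balancerPick;
        }
        String owner = mepOwners.get(relatesTo);
        return owner != null ? owner : balancerPick;
    }

    // Forget all MEPs owned by a failed node; those MEPs are lost, as
    // described above.
    public void nodeFailed(String nodeUrl) {
        mepOwners.values().removeAll(Collections.singleton(nodeUrl));
    }
}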
>
> Still, I am not saying outright that we should not replicate
> MessageContexts and OperationContexts. We just need to discuss this
> further and come to a decision. If needed, this can be facilitated by
> making some simple changes to the ClusterManager interface.
>
> Chamikara
>
>
> On 8/11/06, Eran Chinthaka <[EMAIL PROTECTED]> wrote:
> One simple question?
>
> Why did you leave out the replication of message contexts and
> operation contexts, while replicating ConfigurationContexts and the
> others? IIRC, the operation context map is maintained in the config
> context.
>
> Another point: think of an IN-IN-OUT MEP. You first route the first
> IN message to node A of the cluster. A while later, before the second
> IN message arrives, node A goes down. How can you cater for this
> without catering for the replication of operation contexts?
>
> You might argue back and say the IN-IN-OUT MEP is not the 90% case.
> If that's the case, take an RM scenario. From my very limited
> knowledge of RM: what if node A goes down having serviced the
> createSequence operation? Don't we need message context replication
> for that?
>
> Apologies if I'm missing something here.
>
> -- Chinthaka
>
> Chamikara Jayalath wrote:
> > Hi All,
> >
> > As you know, some time back there was some discussion on clustering
> > Axis2. Quite a few ideas came up, especially from Rajith.
> >
> > Chathura Ekanayake and I thought of bringing this discussion back to
> > life by coming up with a proposal to enable context replication in
> > Axis2. On top of this we could build mechanisms to support failure
> > recovery and load balancing.
> >
> > Sorry about the long email. We thought a fairly detailed description
> > would give you a clear idea of our proposal.
> >
> > The main point was to come up with an interface called
> > ClusterManager. This defines a set of methods that an Axis2 instance
> > working as a node in a cluster can use to share its state with the
> > other nodes. Each implementation of ClusterManager would be based on
> > a particular clustering technology.
> >
> > At startup, Axis2 will check a property in axis2.xml to find out
> > whether it is working as a node in a cluster. If it is, Axis2 will
> > instantiate the ClusterManager object and call its methods whenever
> > it needs to share information with other nodes in the cluster.
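
A rough sketch of how that startup check might look; the parameter name
"clusterManager" and the reflective loading are assumptions for
illustration, not something already decided:

import org.apache.axis2.context.ConfigurationContext;
import org.apache.axis2.description.Parameter;

public class ClusterStartup {

    // Called once while the ConfigurationContext is being built.
    public static void initClusteringIfConfigured(ConfigurationContext configCtx)
            throws Exception {
        // Hypothetical axis2.xml parameter naming the ClusterManager
        // implementation class for the chosen clustering technology.
        Parameter param =
                configCtx.getAxisConfiguration().getParameter("clusterManager");
        if (param == null) {
            return; // not running as a node in a cluster
        }
        String className = (String) param.getValue();
        ClusterManager manager =
                (ClusterManager) Class.forName(className).newInstance();
        manager.init(configCtx);
    }
}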
> >
> > This proposal doesn't worry about replicating OperationContexts and
> > MessageContexts. The main reason for this was the cost their
> > replication would produce (but I guess we can discuss that further).
> > So here we only worry about the replication of the
> > ConfigurationContext, ServiceGroupContexts and ServiceContexts.
> >
> > Here is the ClusterManager interface that we came up with. I've
> > explained the usage of each method below.
> >
> >
> > ----------------------------------------------------------------
> >
> > public interface ClusterManager {
> >
> >     void init(ConfigurationContext context);
> >     void addContext(String contextId, String parentContextId,
> >                     AbstractContext context);
> >     void removeContext(String contextId);
> >     void addProperty(String contextId, String propertyName,
> >                      Object propertyValue);
> >     void removeProperty(String contextId, String propertyName);
> >     Object touchProperty(String contextId, String propertyName);
> >     void updateState();
> >     void farmDescription(AxisDescription axisDescription);
> >     void removeDescriptionFromFarm(String axisDescriptionName);
> >
> > }
> >
> > ----------------------------------------------------------------
> >
> >
> > We assume that each node in the cluster is born with the same
> > configuration. So initially they will have identical axis2.xml files
> > and identical repositories.
> >
> > To keep the methods simple, we assumed that every context comes with
> > an id which can be used to uniquely identify it within a given node.
> > For a ServiceGroupContext this can be the serviceGroupContextId. For
> > a ServiceContext this can be the combination of the service name and
> > the id of the ServiceGroupContext it belongs to, as in the sketch
> > below.
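
A tiny sketch of that id scheme; the helper class and the separator
character are arbitrary choices for illustration:

// Hypothetical helpers for building the unique context ids described above.
public final class ContextIds {

    private ContextIds() {
    }

    // A ServiceGroupContext is identified by its serviceGroupContextId.
    public static String forServiceGroup(String serviceGroupContextId) {
        return serviceGroupContextId;
    }

    // A ServiceContext is identified by the id of the ServiceGroupContext
    // it belongs to, combined with the service name.
    public static String forService(String serviceGroupContextId,
                                    String serviceName) {
        return serviceGroupContextId + "/" + serviceName;
    }
}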
> >
> > The first method, init, does the initialization work for the current
> > node. One of its tasks will be to join the current node into the
> > cluster.
> >
> > When a new context gets added on a certain node, its addContext
> > method will be called. This method should inform the other nodes
> > about the newly added context and ask them to create equivalent
> > contexts of their own.
> >
> > Similarly, the removeContext method will be called when a context
> > gets removed from a certain node. The other nodes should be informed
> > about this and asked to remove the context with the same id. A
> > possible call pattern is sketched below.
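
Hypothetical call sites, just to make the contract concrete; how the
context id and the parent id are obtained is up to the integration code
and is not specified by the proposal:

import org.apache.axis2.context.ServiceGroupContext;

// Assumed glue code between Axis2's context hierarchy and the proposed
// ClusterManager; none of this is part of the interface itself.
public class ContextReplicationHooks {

    private final ClusterManager clusterManager;

    public ContextReplicationHooks(ClusterManager clusterManager) {
        this.clusterManager = clusterManager;
    }

    // Invoked right after a ServiceGroupContext is created on this node,
    // so the peers can create an equivalent context under the same id.
    public void serviceGroupContextAdded(ServiceGroupContext sgCtx) {
        // The parent of every ServiceGroupContext is the (single)
        // ConfigurationContext; the parent id string is an assumption.
        clusterManager.addContext(sgCtx.getId(), "ConfigurationContext", sgCtx);
    }

    // Invoked when the context times out or is explicitly removed.
    public void serviceGroupContextRemoved(ServiceGroupContext sgCtx) {
        clusterManager.removeContext(sgCtx.getId());
    }
}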
> >
> > We believe that the state of a context is represented by its
> > property bag. So replicating the state simply means replicating the
> > property bag of each context. The next three methods can be used for
> > this.
> >
> > As their names imply, the addProperty and removeProperty methods
> > should inform the other nodes to add and remove properties
> > respectively.
> >
> > When the state of a property in the bag is suspected to have
> > changed, that property should be touched. The touchProperty method
> > has to be used for this. For example, when the user asks for a
> > certain property from the property bag (using the getProperty method
> > of the abstract context), there is a chance that the state of that
> > property gets changed. Properties like that should be touched.
> >
> > When the updateState method is called, the state of all touched
> > properties (as defined in the previous paragraph) should be updated
> > on all the other nodes of the cluster, as in the sketch below.
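
The touch-then-update pattern in code; the context id and the property
name are made-up examples:

import org.apache.axis2.context.ServiceContext;

public class TouchExample {

    // One service invocation, bracketed by touch/updateState calls.
    public static void handleInvocation(ClusterManager clusterManager,
                                        ServiceContext serviceContext,
                                        String serviceContextId) {
        // Handing out a mutable property means the service may change its
        // state, so the property gets touched.
        Object cart = serviceContext.getProperty("shoppingCart");
        clusterManager.touchProperty(serviceContextId, "shoppingCart");

        // ... service logic may mutate 'cart' here ...

        // At a safe point, e.g. the end of the invocation, the state of
        // all touched properties is pushed to the other nodes in one go.
        clusterManager.updateState();
    }
}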
> >
> > The last two methods are used to support the farming concept in
> > Axis2. Using these methods, users will be able to deploy and
> > undeploy services on all the nodes in the cluster by performing the
> > operation on a single node. For example, when the user adds a new
> > service to node1, it will be farmed out to all the other nodes as
> > well. Likewise, removal of a service can be done from any node,
> > which will ask all the other nodes to remove that service from their
> > repositories. A sketch of the admin-side usage follows.
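
Possible admin-side usage of the two farming methods; the wrapper class
is hypothetical:

import org.apache.axis2.description.AxisService;

public class FarmAdmin {

    private final ClusterManager clusterManager;

    public FarmAdmin(ClusterManager clusterManager) {
        this.clusterManager = clusterManager;
    }

    // Deploy a service on this node and ask every other node to do the
    // same (an AxisService is an AxisDescription, so it can be farmed).
    public void deployEverywhere(AxisService service) {
        clusterManager.farmDescription(service);
    }

    // Undeploy from the whole farm via any single node.
    public void undeployEverywhere(String serviceName) {
        clusterManager.removeDescriptionFromFarm(serviceName);
    }
}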
> >
> > Support for Load Balancing and Failure Detection
> > ---------------------------------------------------------------------
> > As Synapse is looking into implementing several load balancing
> > algorithms, we thought it would be better to use that feature than
> > to rebuild it. So the head node in an Axis2 cluster will be an
> > Axis2+Synapse combination.
> >
> > To make this more efficient we can add support for detecting node
> > failures. For this, a special module will be engaged on the head
> > node. It will watch the heartbeats sent by the other nodes in order
> > to detect failures. Failed nodes will be removed from the URL list
> > of the Synapse load balancer, roughly as sketched below.
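
A minimal sketch of such a failure detector on the head node; the
LoadBalancer type here is a stand-in, not a Synapse API:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HeartbeatMonitor {

    // Stand-in for whatever maintains the load balancer's URL list.
    public interface LoadBalancer {
        void removeNode(String nodeUrl);
    }

    private final Map<String, Long> lastSeen =
            new ConcurrentHashMap<String, Long>();
    private final LoadBalancer balancer;
    private final long timeoutMillis;

    public HeartbeatMonitor(LoadBalancer balancer, long timeoutMillis) {
        this.balancer = balancer;
        this.timeoutMillis = timeoutMillis;
    }

    // Called by the module whenever a heartbeat arrives from a node.
    public void heartbeatReceived(String nodeUrl) {
        lastSeen.put(nodeUrl, Long.valueOf(System.currentTimeMillis()));
    }

    // Called periodically; drops nodes whose heartbeats have stopped.
    public void sweep() {
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (now - entry.getValue().longValue() > timeoutMillis) {
                lastSeen.remove(entry.getKey());
                balancer.removeNode(entry.getKey());
            }
        }
    }
}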
> >
> > We believe there has to be more discussion on many of the ideas
> > given above. As far as I know, Rajith and some other Axis2
> > committers are coming to Sri Lanka to participate in ApacheCon Asia.
> > Maybe we can discuss this further at the Axis2 hackathon there.
> >
> >
> > Chamikara
> >
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]