Re: [axis2] Clustering

Steve Loughran Fri, 22 Sep 2006 06:07:45 -0700

Rajith Attapattu wrote:

Hi All,


Chathura and Chamikara have posted the following proposal on the wiki.
http://wiki.apache.org/ws/FrontPage/Axis2/clustering_proposal

Next step is for us to figure out and document the demarcation points where
we would want to cluster.
Then we can start on an implementation.

Regards,

Rajith


Some observations

1. Try and use other people's work if you can. Getting clustering to work properly in the face of network failures is hard. We use a partition-aware tuple space for such things: Anubis [1,2].

2. Are you planning on saving state to the servlet context? It's usually the simplest way, as the app server vendor will have solved some of the problems.

3. Servlet state can be brittle against hot redeploy on a single machine if you update the implementations, and very brittle if you do a rolling cold redeploy across a cluster. Unless you can go offline, you need to do choreographed redeploys in which you partition the cluster and have the load balancer serve the old nodes until the new nodes are up, and then have it switched over.

4. Round robin sucks for performance if requests in the same session aren't biased towards the previous machine, if only for cache (including HDD and DB cache) reasons.

5. Round robin needs to use happyaxis.jsp or equivalent to decide where to route stuff. You cannot just rely on the presence of a machine as a liveness cue; you need to monitor the health of the operations.

6. Are your clusters going to be on the same site/network? What are the minimum network requirements, with WLAN at one end and InfiniBand at the other?

7. How are you going to stop system management scaling at O(nodes) or worse? It can be worse unless your diagnostics are good at tracking down which machine has a problem, believe me.

8. Testing all of this gets hard indeed. Sometimes we have to resort to mathematical proofs of correctness.
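To make points 4 and 5 concrete, here is a minimal sketch of round robin that is biased towards a session's previous node and that only routes to nodes reporting healthy. It is not taken from Axis2 or from the wiki proposal: the class name StickyRouter, the node strings and the health predicate are all hypothetical. In practice the predicate would be backed by something that polls happyaxis.jsp or an equivalent on each node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical sketch: session-biased round robin over healthy nodes only. */
class StickyRouter {
    private final List<String> nodes;        // backend identifiers, e.g. base URLs (assumed)
    private final Predicate<String> healthy; // assumed to reflect a real health probe
    private final Map<String, String> lastNode = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    StickyRouter(List<String> nodes, Predicate<String> healthy) {
        this.nodes = new ArrayList<>(nodes);
        this.healthy = healthy;
    }

    /** Returns the node for this session, or empty if no node is healthy. */
    Optional<String> route(String sessionId) {
        String previous = lastNode.get(sessionId);
        if (previous != null && healthy.test(previous)) {
            return Optional.of(previous);    // stay put so the node's caches stay warm
        }
        // Previous node is unknown or unhealthy: try each node once, in rotation.
        for (int i = 0; i < nodes.size(); i++) {
            int index = Math.floorMod(next.getAndIncrement(), nodes.size());
            String candidate = nodes.get(index);
            if (healthy.test(candidate)) {
                lastNode.put(sessionId, candidate);
                return Optional.of(candidate);
            }
        }
        lastNode.remove(sessionId);
        return Optional.empty();
    }
}
```

The sketch keeps the session map in one balancer's memory, so it sidesteps the hard part: what happens to that map when the balancer itself fails or the network partitions.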

Overall, you need to decide on your goals. Is it scalability or availability? Both can be done with clustering, but you need good awareness of the problems before you can get it right. A High Availability system will be robust against transient network failures, and may or may not support rolling redeployment. More to the point, an underlying design that is not robust against network outages is very hard to fix, and stops you doing fun things like downsizing or upsizing the nodes based on demand, or rerouting to different machines based on WS-A internals (*) and session info (i.e. per-customer and geographic selection).

The other thing is that achieving consistency of behaviour in your distributed system is hard. Whoever implements it needs to be able to argue about Lamport's papers on Byzantine generals, or Gray's experiences; otherwise they haven't got the background needed to get it right. My own skills in the area are limited, which is why I delegate. But I do know why it's hard.

1. Anubis is OSS, on our SourceForge project, so you could use it, but it is LGPL. While we are happy with you calling it from Apache code, I'm not sure that Apache is. If we can come up with an implementation-neutral API, we may be able to implement it, and so you could use it as your way of sharing state across a single-site cluster, preferably one with a decent Ethernet behind it.

2. I would think that a back-end neutral SOAP/HTTP load balancer with awareness of back-end availability and able to route on WS-A information is broadly useful to other SOAP stacks, including Axis 1.x and XFire. Maybe it should be a separate project with a JMX management API for live configuration. And before you do it, look at what exists in terms of HTTP load balancing in the rest of Apache. There's mod_proxy in Apache HTTPD, and there's Tomcat's own rule-based load balancer [3].


-Steve

[1,2] http://www.hpl.hp.com/techreports/2005/HPL-2005-72.html
http://www.smartfrog.org/releasedocs/smartfrogdoc/anubis/AnubisUserGuide.pdf

[3] http://tomcat.apache.org/tomcat-5.5-doc/balancer-howto.html#Using%20the%20balancer%20webapp

(*) Load balancing is one reason I don't like WS-A; you need to parse the doc to find the URL, unless the URL is the only thing you redirect on.
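To illustrate the cost in (*), here is a rough sketch of what a balancer has to do before it can route on WS-A: stream through the SOAP envelope until it finds the wsa:To header. It uses the standard StAX API (javax.xml.stream). The class name WsaToSniffer is hypothetical, and the namespace constant assumes the W3C WS-Addressing 1.0 namespace; messages using an older submission namespace would need a different value.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: pull the wsa:To address out of a SOAP message. */
class WsaToSniffer {
    // Assumption: WS-Addressing 1.0 namespace; adjust for other versions.
    static final String WSA_NS = "http://www.w3.org/2005/08/addressing";

    /** Returns the wsa:To value, or empty if none appears before the Body. */
    static Optional<String> findTo(String soapXml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newFactory();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTDs from the wire
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(soapXml));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if ("To".equals(name) && WSA_NS.equals(reader.getNamespaceURI())) {
                    return Optional.of(reader.getElementText().trim());
                }
                if ("Body".equals(name)) {
                    return Optional.empty(); // headers are over; stop reading early
                }
            }
            return Optional.empty();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("message is not well-formed XML", e);
        }
    }
}
```

Even with a streaming parser that stops at the Body, this is per-request XML parsing in the balancer, which routing on the HTTP URL alone avoids.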

