Richard Monson-Haefel wrote:
Here's my 2 cents ...
Let's keep it as simple as possible by using sticky sessions and single-node failover. In other words, every node has one buddy to which it replicates state, but each session is assigned to a single node, and only if that node fails does the session move to the buddy node.
The problem here is coordination with the LB. The LB needs to know that requests for session xxx are to be directed to the buddy, not the primary, from now on. Unless you have the luxury of writing your own LB (perhaps we can involve the mod_jk people), this is not so easy.
If both buddies fail, the session is lost. Buddies can form dynamically as nodes are brought online and should favor partnerships with nodes on different hardware, to avoid losing both buddies in a hardware failure.
I have already figured something like this out.
When a new node comes up, it announces itself. Each other node notices this and passes the existing list of nodes and the new one into an InsertionPolicy object. This is user-supplied and can, for example, ensure that if the user has two geographical sites and wants data to span them (to avoid losing data if a whole site goes down), nodes from site A are always juxtaposed with, and so buddied with, nodes from site B. It has to be a user-controlled policy, because we cannot foresee all its possible applications.
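To make the idea concrete, here is a rough sketch of what such a policy could look like. Only the name InsertionPolicy comes from the description above; the method signature, the SiteAlternatingPolicy class, and the "site/node" naming convention are assumptions made up for illustration. The point is that every node runs the same deterministic policy over the same list, so they all arrive at the same ordering.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hook: decides where a newly announced node sits in the ordered node list. */
interface InsertionPolicy {
    List<String> insert(List<String> existingNodes, String newNode);
}

/** Example policy: put the new node between two nodes from other sites when possible. */
class SiteAlternatingPolicy implements InsertionPolicy {
    // Assumption for this sketch only: node names look like "siteA/n1".
    static String siteOf(String node) {
        int slash = node.indexOf('/');
        return slash < 0 ? node : node.substring(0, slash);
    }

    @Override
    public List<String> insert(List<String> existingNodes, String newNode) {
        List<String> result = new ArrayList<>(existingNodes);
        String site = siteOf(newNode);
        int n = result.size();
        // Treat the list as a ring and look for a gap whose two neighbours
        // both belong to a different site than the new node.
        for (int i = 0; i < n; i++) {
            String prev = result.get(i);
            String next = result.get((i + 1) % n);
            if (!siteOf(prev).equals(site) && !siteOf(next).equals(site)) {
                result.add(i + 1, newNode);
                return result;
            }
        }
        // No suitable gap (or empty list): fall back to appending at the end.
        result.add(newNode);
        return result;
    }
}
```

Starting from siteA/n1, siteA/n2 and adding siteB/n1 then siteB/n2, this yields siteA/n1, siteB/n1, siteA/n2, siteB/n2, so neighbours always come from different sites.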
IMO, this is the most straightforward solution and the easiest to build and
maintain. It also avoids the chattiness of multi-node replication and the
single point of failure associated with shared store. The one problem with
this solution, IMO, is that you have to deploy clusters in twos - every node
needs to have a buddy if you want failover.
This isn't a problem:
1 buddies with 2, 2 with 3, 3 with 4 and 4 with 1.
Buddying ceases to be 'commutative' and becomes more of a 'backs-up-for' relationship.
It never matters how many nodes you have or how many buddies you want in a group.
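The ring arrangement above can be sketched in a few lines. The class and method names here (RingBuddies, buddiesOf) are hypothetical; the only thing taken from the scheme is that each node is backed up by the next node or nodes around the ring, wrapping at the end.

```java
import java.util.ArrayList;
import java.util.List;

class RingBuddies {
    /**
     * Returns the nodes that back up the node at the given index:
     * the next buddyCount nodes going round the ring.
     */
    static List<String> buddiesOf(List<String> ring, int index, int buddyCount) {
        List<String> buddies = new ArrayList<>();
        int n = ring.size();
        // A node cannot back itself up, so never return more than n - 1 buddies.
        int count = Math.min(buddyCount, n - 1);
        for (int k = 1; k <= count; k++) {
            buddies.add(ring.get((index + k) % n));
        }
        return buddies;
    }
}
```

With nodes 1 to 4 and one buddy each, node 4 (index 3) is backed up by node 1, and an odd number of nodes works just as well as an even one.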
Now we are back onto the load balancer... Loads are balanced using a round-robin approach at the start of a session.
mod_jk does most of what we need, until fail-over - I will send something to the list about this when I have more time tonight.
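For reference, the routing behaviour being discussed (round robin for new sessions, sticky afterwards, buddy on failure) could be modelled roughly as below. This is a toy model, not mod_jk's API or behaviour: StickyRouter, markFailed and route are invented names, and the buddy is assumed to be the next node in the ring.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class StickyRouter {
    private final List<String> ring;
    private final Set<String> failed = new HashSet<>();
    private final Map<String, Integer> sessionToIndex = new HashMap<>();
    private int nextIndex = 0; // round-robin cursor for new sessions

    StickyRouter(List<String> ring) { this.ring = ring; }

    void markFailed(String node) { failed.add(node); }

    String route(String sessionId) {
        Integer idx = sessionToIndex.get(sessionId);
        if (idx != null && failed.contains(ring.get(idx))) {
            // Primary is down: move the session to its buddy (next node in the ring).
            int buddy = (idx + 1) % ring.size();
            // If the buddy is down too, the state is lost; treat it as a new session.
            idx = failed.contains(ring.get(buddy)) ? null : buddy;
        }
        if (idx == null) {
            // New (or lost) session: round robin over the live nodes.
            idx = nextLiveFrom(nextIndex);
            nextIndex = (idx + 1) % ring.size();
        }
        sessionToIndex.put(sessionId, idx); // stick to this node from now on
        return ring.get(idx);
    }

    private int nextLiveFrom(int start) {
        for (int k = 0; k < ring.size(); k++) {
            int i = (start + k) % ring.size();
            if (!failed.contains(ring.get(i))) return i;
        }
        throw new IllegalStateException("no live nodes");
    }
}
```

The awkward part in practice is the branch that remaps the session to the buddy: the model keeps that table inside the router, whereas a real LB has to learn about the change somehow.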
Jules
-- /********************************** * Jules Gosnell * Partner * Core Developers Network (Europe) **********************************/
