Re: Replication using totem protocol

Jules Gosnell Thu, 02 Feb 2006 07:22:53 -0800

Andy Piper wrote:

At 09:25 AM 1/18/2006, Jules Gosnell wrote:
I haven't been able to convince myself to take the quorum approach because...
shared-something approach:
- the shared something is a Single Point of Failure (SPoF) - although you could use an HA something.
That's how WAS and WLS do it. Use an HA database, SAN or dual-ported SCSI. The latter is cheap. The former are probably already available to customers if they really care about availability.

Well, I guess we will have to consider making something along these lines available... - I guess we need a pluggable QuorumStrategy.
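
As a rough illustration of what a pluggable QuorumStrategy might look like, here is a minimal Python sketch. It is not WADI or ActiveCluster code: the method name `has_quorum`, the two example strategies, and the node names are all assumptions made up for this example.

```python
from abc import ABC, abstractmethod


class QuorumStrategy(ABC):
    """Hypothetical plug-in point: may this cluster fragment keep running?"""

    @abstractmethod
    def has_quorum(self, fragment, last_known_members):
        """Return True if `fragment` (a set of nodes) may continue."""


class MajorityQuorum(QuorumStrategy):
    """Shared-nothing: continue only with a strict majority of the old view."""

    def has_quorum(self, fragment, last_known_members):
        return 2 * len(fragment) > len(last_known_members)


class LockHolderQuorum(QuorumStrategy):
    """Shared-something: continue only if the fragment holds the shared lock."""

    def __init__(self, lock_holder):
        self.lock_holder = lock_holder  # node currently holding the lock

    def has_quorum(self, fragment, last_known_members):
        return self.lock_holder in fragment


members = {"a", "b", "c", "d"}
print(MajorityQuorum().has_quorum({"a", "b", "c"}, members))
print(MajorityQuorum().has_quorum({"a", "b"}, members))
print(LockHolderQuorum("d").has_quorum({"d"}, members))
```

Note that the majority strategy refuses both halves of an even split, while the lock-holder strategy lets a single node continue alone; these are the trade-offs discussed in the rest of this thread.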

- If the node holding the lock 'goes crazy', but does not die, the rest of the
This is generally why you use leases. Then your craziness is only believed for a fixed amount of time.


Understood.
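
For readers who have not met leases before, the idea can be sketched roughly as below. This is a single-process toy, assuming one trusted clock; the `Lease` class and its method names are hypothetical, and a real implementation would live on the HA shared resource and would have to allow for clock drift between nodes.

```python
import time


class Lease:
    """A lock that is valid only for `duration` seconds after each grant."""

    def __init__(self, duration, clock=time.monotonic):
        self.duration = duration
        self.clock = clock      # injectable so the sketch can be exercised
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, node):
        """Grant or renew the lease; refuse while another node's lease is live."""
        now = self.clock()
        if self.holder not in (None, node) and now < self.expires_at:
            return False
        self.holder = node
        self.expires_at = now + self.duration
        return True

    def is_held_by(self, node):
        """True only while `node` holds an unexpired lease."""
        return self.holder == node and self.clock() < self.expires_at
```

A holder that 'goes crazy' and stops renewing simply loses the lease once `duration` has elapsed, at which point another node can acquire it.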

cluster becomes a fragment - so it becomes an SPoF as well.
- used in isolation, it does not take into account that the lock may be held by the smallest cluster fragment
You generally solve this again with leases, i.e. a lock that is valid for some period.

I don't follow you here - but we have lost quite a bit of context. I think that I was saying that if the fragment that owned the shared-something was the smaller of the two, then 'freezing' the larger fragment would not be optimal - but, I guess you could use the shared-something to negotiate between the two fragments and decide which to freeze and which to allow to continue...


I don't see leases helping here - but maybe I have mistaken the context?
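
The negotiation idea above (use the shared-something to decide which fragment to freeze) might be sketched like this. It assumes every fragment can report its membership to the shared resource, which is a strong assumption; the function name `choose_survivor` and the tie-break rules are invented for illustration.

```python
def choose_survivor(fragments, lock_holder):
    """Return the fragment that continues; all others should freeze.

    fragments:   list of non-empty sets of node names, as reported to the
                 shared something.
    lock_holder: node that held the lock before the split (ties only).
    """
    largest = max(len(f) for f in fragments)
    candidates = [f for f in fragments if len(f) == largest]
    # Tie-break 1: prefer the fragment containing the previous lock holder.
    for fragment in candidates:
        if lock_holder in fragment:
            return fragment
    # Tie-break 2: deterministic fallback on the lowest node name.
    return min(candidates, key=min)


print(choose_survivor([{"a"}, {"b", "c", "d"}], lock_holder="a"))
```

Here the larger fragment wins even though the lock holder "a" sits in the smaller one, which is the case that holding the lock alone gets wrong.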

shared-nothing approach:
Nice in theory but tricky to implement well. Consensus works well here.
- I prefer this approach, but, as you have stated, if the two halves are equally sized...
- What if there are two concurrent fractures (does this happen?)
- ActiveCluster notifies you of one membership change at a time - so you would have to decide on an algorithm for 'chunking' node loss, so that you could decide when a fragmentation had occurred...
If you really want to do this reliably you have to assume that AC will send you bogus notifications. Ideally you want to achieve a consensus on membership to avoid this. It sounds like totem solves some of these issues.
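
One simple way to 'chunk' node loss, as raised above, is to group loss notifications that arrive close together in time. The sketch below is only that: the function name, the `(timestamp, node)` event shape and the `window` parameter are assumptions, it does not use the ActiveCluster API, and it does nothing about bogus notifications, which would still need a consensus step.

```python
def chunk_node_losses(events, window):
    """Group (timestamp, node) loss events into chunks.

    A new chunk starts whenever the gap since the previous event is
    greater than `window` seconds.
    """
    chunks = []
    last_time = None
    for timestamp, node in sorted(events):
        if last_time is None or timestamp - last_time > window:
            chunks.append(set())
        chunks[-1].add(node)
        last_time = timestamp
    return chunks


events = [(0.0, "a"), (0.2, "b"), (0.3, "c"), (5.0, "d")]
print(chunk_node_losses(events, window=1.0))
```

A chunk containing several nodes could then be treated as a candidate fragmentation, and a single-node chunk as an ordinary node failure.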

Totem does seem to have some advanced consensus stuff, which, I am ?assuming?, relies on its virtual synchrony. This stuff would probably be very useful under ActiveCluster to manage membership change and partition notifications, as it would, I understand, guarantee that every node received a consistent view of what was going on.

For the peer->peer messaging aspect of AC (1->1 and 1->all), I don't think VS is required. In fact it might be an unwelcome overhead. I don't know enough about the internals of AC and Totem to know if it would be possible to reuse Totem's VS/consensus stuff on-top-of/along-side AMQ's e.g. peer:// protocol stack and underneath AC's membership notification API, but it seems to me that ultimately the best solution would be a hybrid, that uses these approaches where needed and not where not...

Have I got the right end of the stick? Perhaps you can choose which messages are virtually synchronous and which are not in Totem? I am pretty sure, though, that it was using multicast, so it is not the best solution for 1->1 messaging....



Jules

andy




--
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
* Jules Gosnell
* Partner
* Core Developers Network (Europe)
*
*    www.coredevelopers.net
*
* Open Source Training & Support.
**********************************/
