Huang Zhen wrote:
Alan Robertson wrote:
Huang Zhen wrote:
Hi,
Some thoughts about our solution for split-site cluster:
1. the split-site cluster includes one or more sites.
2. a site means a set of nodes in one subnet.
No.
A site means a collection of machines in one location connected by
local area networks which can be made reliable for reasonable cost.
3. a subcluster means a set of nodes which can connect to each
other in one site.
This is defined on the http://linux-ha.org/ClusterConcepts page. This
page is a bit mathematical in its definitions, but it says what I mean
for it to say. If you have trouble understanding it, please let me know.
Here are the definitions:
At any given point in time, a cluster is divided into zero or more
subclusters (or partitions).
Each node has a view of subcluster membership, and acknowledges
membership in no more than one of these subclusters.
I think that I can understand it.
However, we need to deal with split-site.
Do you think one subcluster can include nodes from both sites?
Normally, you want only one subcluster - total. We are not doing
clusters of clusters here. Just one single cluster which spans sites.
Also known as a split-site cluster.
Having multiple subclusters is a sign of some kind of communications or
algorithm failure. It is a bad thing. You can't prevent this. On the
other hand, if you guarantee that you appoint only one subcluster as
primary (i.e., you only give quorum to one), then you can avoid problems
with resources which get damaged when they are managed by more than one
independent entity (subcluster) at a time.
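
To make the "only give quorum to one" point concrete, here is a minimal
sketch in Python. It assumes a strict-majority rule, which is only one
possible way to pick the primary subcluster and is not stated in this
thread. The function names and node names are hypothetical.

```python
def has_quorum(subcluster, all_nodes):
    """Strict majority: the subcluster holds more than half of all configured nodes."""
    return 2 * len(subcluster) > len(all_nodes)


def primary_subcluster(subclusters, all_nodes):
    """Return the one subcluster that has quorum, or None if none does.

    Subclusters are disjoint, so at most one of them can hold a strict
    majority of the configured nodes.
    """
    winners = [s for s in subclusters if has_quorum(s, all_nodes)]
    return winners[0] if winners else None


# Hypothetical five-node cluster spanning two sites (a* and b*).
all_nodes = {"a1", "a2", "a3", "b1", "b2"}
split = [{"a1", "a2", "a3"}, {"b1", "b2"}]
primary = primary_subcluster(split, all_nodes)
```

Under this assumed rule, an even split (say two nodes per site) leaves no
subcluster with quorum, so some other tie-breaker would be needed there.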
Or do you think the ha.cf on the node in this site should include the
nodes on the other site?
Absolutely. Or we can use autojoin (provided the routing permits mcast
across the sites).
In the case we only have two sites.
For the near-term future, yes.
Eventually, for those kinds of applications which have the right kind of
replication, having 3 or more sites works extremely well. But those
kinds of sites are MUCH rarer than the main site / business continuity
site kind of arrangement.
This kind of site shows up, for example, when providing coverage for the
Olympics - when you want to be able to shut down a site for maintenance,
and still have two sites for failover. In this case, all three are
active at the same time, updated in real time from some master Olympic
data source, with DNS tricks being played to keep them all busy, and /24
address ranges for each site, so that you can use BGP to take over the
address(es) of sites that have failed or are down for maintenance.
[In the real-world Internet, routing is never on a finer-grained basis
than /24 - it's just not allowed - by universal Internet policy].
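
As a rough illustration of the /24 rule of thumb above, here is a small
Python check using the standard ipaddress module. The function name is
hypothetical, the limit of 24 is taken from the statement above and applies
to IPv4 only, and the prefixes are documentation ranges rather than real
site addresses.

```python
import ipaddress

# Longest IPv4 prefix assumed to be routable between sites, per the text above.
MAX_PREFIX_LEN = 24


def is_coarse_enough(prefix):
    """Return True if the IPv4 prefix is a /24 or a larger block (shorter prefix)."""
    network = ipaddress.ip_network(prefix)
    return network.version == 4 and network.prefixlen <= MAX_PREFIX_LEN


site_block_ok = is_coarse_enough("192.0.2.0/24")       # a whole /24 per site
half_block_ok = is_coarse_enough("198.51.100.0/25")    # finer than /24
```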
Of course, one could always have a corporate intranet which had three
sites - with each one taking over for one of the others in sort of a
ring arrangement. Site A is backed up by B. Site B is backed up by C.
Site C is backed up by A. I have no idea how common this is...
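
The ring arrangement can be sketched in a few lines of Python. This is
only an illustration of the A -> B -> C -> A pattern described above; the
function name is hypothetical and nothing here reflects actual Heartbeat
configuration.

```python
def ring_backups(sites):
    """Map each site to the next site in the list, wrapping around at the end."""
    return {site: sites[(i + 1) % len(sites)] for i, site in enumerate(sites)}


# Site A is backed up by B, B by C, and C by A.
backups = ring_backups(["A", "B", "C"])
```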
--
Alan Robertson <[EMAIL PROTECTED]>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/