There is no safe way to do what you are trying to do.

If the resource is on cluster A and contact is lost between clusters A and B due to a network failure, how does cluster B know if the resource is still running on cluster A or not?

It has no way of knowing if cluster A is even up and running.

In that situation it cannot safely start the resource.


If the network is down and both clusters come up at the same time, without being able to contact each other, neither knows if the other is running the resource, so neither can safely start it.
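
To make the reasoning above concrete, here is a minimal Python sketch of the majority rule a partition would have to satisfy before starting the resource. It is an illustration only, not Pacemaker or Corosync code: the function name `may_start_resource` and the vote counts are assumptions chosen to match the 3+3 node layout discussed in this thread, and the seventh vote stands for the third-site tie-breaker suggested in the quoted message below.

```python
def may_start_resource(votes_visible: int, total_votes: int) -> bool:
    """Return True only if this partition sees a strict majority of all votes.

    Hypothetical helper for illustration; real clusters delegate this
    decision to their quorum layer.
    """
    return 2 * votes_visible > total_votes


# Two sites of three nodes each, link between them down:
# each side sees only its own 3 of 6 votes, so neither may start the resource.
site_a_alone = may_start_resource(3, 6)

# Same failure with an assumed tie-breaker vote at a third location
# that only one side can still reach: that side sees 4 of 7 votes.
side_with_tiebreaker = may_start_resource(4, 7)
side_without_tiebreaker = may_start_resource(3, 7)

print(site_a_alone, side_with_tiebreaker, side_without_tiebreaker)
```

With exactly half the votes, a site cannot distinguish "the other site is down" from "the link is down", which is why neither side can act on its own.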



On 8/4/21 3:27 PM, Antony Stone wrote:
On Wednesday 04 August 2021 at 20:57:49, Strahil Nikolov wrote:

That's why you need a qdisk at a third location, so you will have 7 votes in
total. When the 3 nodes in city A die, all resources will be started on the
remaining 3 nodes.

I think I have not explained this properly.

I have three nodes in city A which run resources which have to run in city A.
They are based on IP addresses which are only valid on the network in city A.

I have three nodes in city B which run resources which have to run in city B.
They are based on IP addresses which are only valid on the network in city B.

I have redundant routing between my upstream provider, and cities A and B, so
that I only _need_ resources to be running in one of the two cities for
everything to work as required.  City A can go completely offline and not run
its resources, and everything I need continues to work via city B.

I now have an additional requirement to run a single resource at either city A
or city B but not both.

As soon as I connect the clusters at city A and city B, and apply the location
constraints and weighting rules you have suggested:

1. everything works, including the single resource at either city A or city B,
so long as both clusters are operational.

2. as soon as one cluster fails (all three of its nodes become
unavailable), then the other cluster stops running all its resources as well.
This is even with quorum=2.

This means I have lost the redundancy between my two clusters, which is based
on the expectation that only one cluster will fail at a time.  If the failure
of one automatically _causes_ the failure of the other, I have no high
availability any more.

What I require is for cluster A to continue running its own resources, plus
the single resource which can run anywhere, in the event that cluster B fails.

In other words, I need the exact same outcome as I have at present if cluster
B fails (its resources stop, cluster A is unaffected), except that cluster A
continues to run the single resource which I need just a single instance of.

It is impossible for the nodes at city A to run the resources which should be
running at city B, partly because some of them are identical ("Asterisk" as a
resource, for example, is already running at city A), and partly because some
of them are bound to the networking arrangements (I cannot set a floating IP
address which belongs in city A on a machine which exists in city B - it just
doesn't work).

Therefore if adding a seventh node at a third location would try to start
_all_ resources in city A if city B goes down, it is not a working solution.
If city B goes down then I simply do not want its resources to be running
anywhere, just the same as I have now with the two independent clusters.


Thanks,


Antony.


_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
