Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-07 Thread Strahil Nikolov via Users
>Because Asterisk at cityA is bound to a floating IP address, which is held
>on one of the three machines in cityA. I can't run Asterisk on all
>three machines there because only one of them has the IP address.
That's not true. You can use a cloned IP resource with 'globally-unique=true',
which runs the IP everywhere; the cluster determines which node should respond
(controlled via iptables) and the others never reply.
It's quite useful for reducing failover time.
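For anyone searching the archives, such a clone can be sketched in crmsh along these lines. This is a hedged sketch only: the resource names and address are illustrative, and it relies on the IPaddr2 agent's CLUSTERIP/iptables mode, which is enabled by setting clusterip_hash.

```
primitive shared-IP IPaddr2 \
    params ip=192.168.32.250 cidr_netmask=24 clusterip_hash=sourceip-sourceport \
    op monitor interval=10 timeout=20
clone shared-IP-clone shared-IP \
    meta clone-max=3 clone-node-max=3 globally-unique=true
```

With globally-unique=true each clone instance is distinct, and the CLUSTERIP hash decides which node answers a given client, so a failover only reassigns hash buckets instead of moving the address between nodes.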

Best Regards,
Strahil Nikolov
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Antony Stone
On Friday 06 August 2021 at 15:12:57, Andrei Borzenkov wrote:

> On Fri, Aug 6, 2021 at 3:42 PM Antony Stone wrote:
> > On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote:
> > > 
> > > If connectivity between (any two) sites is lost you may end up with
> > > one of A or B going out of quorum.
> > 
> > Agreed.
> > 
> > > While this will stop active resources and restart them on another site,
> > 
> > No.  Resources do not start on the "wrong" site because of:
> > location pref_A GroupA rule -inf: site ne cityA
> > location pref_B GroupB rule -inf: site ne cityB
> > 
> > The resources in GroupA either run in cityA or they do not run at all.
> 
> Where did I say anything about group A or B? You have single resource
> that can migrate between sites
> 
> location no_pref Anywhere rule -inf: site ne cityA and site ne cityB

In fact that rule turns out to be unnecessary, because of:

colocation Ast 100: Anywhere [ GroupA GroupB ]

(apologies for the typo the first time I posted that, corrected in my previous 
reply to this one).

This ensures that the "Anywhere" resource group runs either on the machine 
which is running the "GroupA" group or the one which is running the "GroupB" 
group.  This is an added bonus which I find useful, so that only one machine at 
each site is running all the resources at that site.

> I have no idea what "Asterisk in cityA" means because I see only one
> resource named Asterisk which is not restricted to a single site
> according to your configuration.

Ah, I see the confusion.  I used Asterisk as a simple resource in my example, 
as the thing I wanted to run just once, somewhere.

In fact, for the real setup, where GroupA and GroupB each comprise 10 
resources, and the Anywhere group comprises two, Asterisk is one of the 10 
resources which do run at both sites.

> The only resource that allegedly can migrate between sites in
> configuration you have shown so far is Asterisk.

Yes, in my example documented here.

> Now you say this resource never migrates between sites.

Yes, for my real configuration, which contains 10 resources (one of which is 
Asterisk) in each of GroupA and GroupB, and is therefore over-complicated to 
quote as a proof-of-concept here.

> I'm not sure how helpful this will be to anyone reading archives because I
> completely lost all track of what you tried to achieve.

That can be expressed very simply:

1. A group of resources named GroupA which either run in cityA or do not run 
at all.

2. A group of resources named GroupB which either run in cityB or do not run 
at all.

3. A group of resources named Anywhere which run in either cityA or cityB but 
not both.
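Using the constraint syntax already quoted in this thread, those three requirements correspond one-to-one to:

```
# 1. GroupA runs in cityA or not at all
location pref_A GroupA rule -inf: site ne cityA
# 2. GroupB runs in cityB or not at all
location pref_B GroupB rule -inf: site ne cityB
# 3. Anywhere runs once, on whichever node hosts GroupA or GroupB
colocation Ast 100: Anywhere [ GroupA GroupB ]
```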


Antony.

-- 
Numerous psychological studies over the years have demonstrated that the 
majority of people genuinely believe they are not like the majority of people.

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Andrei Borzenkov
On Fri, Aug 6, 2021 at 3:42 PM Antony Stone
 wrote:
>
> On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote:
>
> > On Thu, Aug 5, 2021 at 3:44 PM Antony Stone wrote:
> > >
> > > For anyone interested in the detail of how to do this (without needing
> > > booth), here is my cluster.conf file, as in "crm configure load replace
> > > cluster.conf":
> > >
> > > 
> > > node tom attribute site=cityA
> > > node dick attribute site=cityA
> > > node harry attribute site=cityA
> > >
> > > node fred attribute site=cityB
> > > node george attribute site=cityB
> > > node ron attribute site=cityB
> > >
> > > primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta
> > > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> > > on-fail=restart
> > > primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta
> > > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> > > on-fail=restart
> > > primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60
> > > op monitor interval=5 timeout=20 on-fail=restart
> > >
> > > group GroupA A-float meta resource-stickiness=100
> > > group GroupB B-float meta resource-stickiness=100
> > > group Anywhere Asterisk meta resource-stickiness=100
> > >
> > > location pref_A GroupA rule -inf: site ne cityA
> > > location pref_B GroupB rule -inf: site ne cityB
> > > location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
> > >
> > > colocation Ast 100: Anywhere [ cityA cityB ]
> >
> > You define a resource set, but there are no resources cityA or cityB,
> > at least you do not show them. So it is not quite clear what this
> > colocation does.
>
> Apologies - I had used different names in my test setup, and converted them to
> cityA etc for the sake of continuity in this discussion.
>
> That should be:
>
> colocation Ast 100: Anywhere [ GroupA GroupB ]
>
> > > property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop
> >
> > If connectivity between (any two) sites is lost you may end up with
> > one of A or B going out of quorum.
>
> Agreed.
>
> > While this will stop active resources and restart them on another site,
>
> No.  Resources do not start on the "wrong" site because of:
>
> location pref_A GroupA rule -inf: site ne cityA
> location pref_B GroupB rule -inf: site ne cityB
>
> The resources in GroupA either run in cityA or they do not run at all.
>

Where did I say anything about group A or B? You have single resource
that can migrate between sites

location no_pref Anywhere rule -inf: site ne cityA and site ne cityB

> > there is no coordination between stopping and starting so for some time
> > resources will be active on both sites. It is up to you to evaluate whether
> > this matters.
>
> Any resource which tried to start at the wrong site would simply fail, because
> the IP addresses involved do not work at the "other" site.
>
> > If this matters your solution does not protect against it.
> >
> > If this does not matter, the usual response is - why do you need a
> > cluster in the first place? Why not simply always run asterisk on both
> > sites all the time?
>
> Because Asterisk at cityA is bound to a floating IP address, which is held on
> one of the three machines in cityA.  I can't run Asterisk on all three
> machines there because only one of them has the IP address.
>

I have no idea what "Asterisk in cityA" means because I see only one
resource named Asterisk which is not restricted to a single site
according to your configuration.

> Asterisk _does_ normally run on both sites all the time, but only on one
> machine at each site.
>

The only resource that allegedly can migrate between sites in
configuration you have shown so far is Asterisk. Now you say this
resource never migrates between sites. I'm not sure how helpful this
will be to anyone reading archives because I completely lost all track
of what you tried to achieve.

> > > start-failure-is-fatal=false cluster-recheck-interval=60s
> > > 
> > >
> > > Of course, the group definitions are not needed for single resources, but
> > > I shall in practice be using multiple resources which do need groups, so
> > > I wanted to ensure I was creating something which would work with that.
> >
> > > I have tested it by:
> > ...
> > >  - causing a network failure at one city (so it simply disappears without
> > > stopping corosync neatly): the other city continues its resources (plus
> > > the "anywhere" resource), the isolated city stops
> >
> > If the site is completely isolated it probably does not matter whether
> > anything is active there. It is partial connectivity loss where it
> > becomes interesting.
>
> Agreed, however my testing shows that resources which I want running in cityA
> are either running there or they're not (they never move to cityB or cityC),
> similarly for cityB, and the resources I want just a single instance of are
> doing just that, and on the same machine at cityA or cityB as the local
> resources are running on.

Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Antony Stone
On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote:

> On Thu, Aug 5, 2021 at 3:44 PM Antony Stone wrote:
> > 
> > For anyone interested in the detail of how to do this (without needing
> > booth), here is my cluster.conf file, as in "crm configure load replace
> > cluster.conf":
> > 
> > 
> > node tom attribute site=cityA
> > node dick attribute site=cityA
> > node harry attribute site=cityA
> > 
> > node fred attribute site=cityB
> > node george attribute site=cityB
> > node ron attribute site=cityB
> > 
> > primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta
> > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> > on-fail=restart
> > primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta
> > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> > on-fail=restart
> > primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60
> > op monitor interval=5 timeout=20 on-fail=restart
> > 
> > group GroupA A-float meta resource-stickiness=100
> > group GroupB B-float meta resource-stickiness=100
> > group Anywhere Asterisk meta resource-stickiness=100
> > 
> > location pref_A GroupA rule -inf: site ne cityA
> > location pref_B GroupB rule -inf: site ne cityB
> > location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
> > 
> > colocation Ast 100: Anywhere [ cityA cityB ]
> 
> You define a resource set, but there are no resources cityA or cityB,
> at least you do not show them. So it is not quite clear what this
> colocation does.

Apologies - I had used different names in my test setup, and converted them to 
cityA etc for the sake of continuity in this discussion.

That should be:

colocation Ast 100: Anywhere [ GroupA GroupB ]

> > property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop
> 
> If connectivity between (any two) sites is lost you may end up with
> one of A or B going out of quorum.

Agreed.

> While this will stop active resources and restart them on another site,

No.  Resources do not start on the "wrong" site because of:

location pref_A GroupA rule -inf: site ne cityA
location pref_B GroupB rule -inf: site ne cityB

The resources in GroupA either run in cityA or they do not run at all.

> there is no coordination between stopping and starting so for some time
> resources will be active on both sites. It is up to you to evaluate whether
> this matters.

Any resource which tried to start at the wrong site would simply fail, because 
the IP addresses involved do not work at the "other" site.

> If this matters your solution does not protect against it.
> 
> If this does not matter, the usual response is - why do you need a
> cluster in the first place? Why not simply always run asterisk on both
> sites all the time?

Because Asterisk at cityA is bound to a floating IP address, which is held on 
one of the three machines in cityA.  I can't run Asterisk on all three 
machines there because only one of them has the IP address.

Asterisk _does_ normally run on both sites all the time, but only on one 
machine at each site.

> > start-failure-is-fatal=false cluster-recheck-interval=60s
> > 
> > 
> > Of course, the group definitions are not needed for single resources, but
> > I shall in practice be using multiple resources which do need groups, so
> > I wanted to ensure I was creating something which would work with that.
> 
> > I have tested it by:
> ...
> >  - causing a network failure at one city (so it simply disappears without
> > stopping corosync neatly): the other city continues its resources (plus
> > the "anywhere" resource), the isolated city stops
> 
> If the site is completely isolated it probably does not matter whether
> anything is active there. It is partial connectivity loss where it
> becomes interesting.

Agreed, however my testing shows that resources which I want running in cityA 
are either running there or they're not (they never move to cityB or cityC), 
similarly for cityB, and the resources I want just a single instance of are 
doing just that, and on the same machine at cityA or cityB as the local 
resources are running on.


Thanks for the feedback,


Antony.

-- 
"Measuring average network latency is about as useful as measuring the mean 
temperature of patients in a hospital."

 - Stéphane Bortzmeyer

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Andrei Borzenkov
On Thu, Aug 5, 2021 at 3:44 PM Antony Stone
 wrote:
>
> On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote:
>
> > On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:
> > >
> > > Have you ever tried to find out why this happens? (Talking about logs)
> >
> > Not in detail, no, but just in case there's a chance of getting this
> > working as suggested simply using location constraints, I shall look
> > further.
>
> I now have a working solution - thank you to everyone who has helped.
>
> The answer to the problem above was simple - with a 6-node cluster, 3 votes is
> not quorum.
>
> I added a 7th node (in "city C") and adjusted the location constraints to
> ensure that cluster A resources run in city A, cluster B resources run in city
> B, and the "anywhere" resource runs in either city A or city B.
>
> I've even added a colocation constraint to ensure that the "anywhere" resource
> runs on the same machine in either city A or city B as is running the local
> resources there (which wasn't a strict requirement, but is very useful).
>
> For anyone interested in the detail of how to do this (without needing booth),
> here is my cluster.conf file, as in "crm configure load replace cluster.conf":
>
> 
> node tom attribute site=cityA
> node dick attribute site=cityA
> node harry attribute site=cityA
>
> node fred attribute site=cityB
> node george attribute site=cityB
> node ron attribute site=cityB
>
> primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta
> migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
> primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta
> migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
> primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 op
> monitor interval=5 timeout=20 on-fail=restart
>
> group GroupA A-float meta resource-stickiness=100
> group GroupB B-float meta resource-stickiness=100
> group Anywhere Asterisk meta resource-stickiness=100
>
> location pref_A GroupA rule -inf: site ne cityA
> location pref_B GroupB rule -inf: site ne cityB
> location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
>
> colocation Ast 100: Anywhere [ cityA cityB ]
>

You define a resource set, but there are no resources cityA or cityB,
at least you do not show them. So it is not quite clear what this
colocation does.

> property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop

If connectivity between (any two) sites is lost you may end up with
one of A or B going out of quorum. While this will stop active
resources and restart them on another site, there is no coordination
between stopping and starting so for some time resources will be
active on both sites. It is up to you to evaluate whether this
matters.

If this matters your solution does not protect against it.

If this does not matter, the usual response is - why do you need a
cluster in the first place? Why not simply always run asterisk on both
sites all the time?


> start-failure-is-fatal=false cluster-recheck-interval=60s
> 
>
> Of course, the group definitions are not needed for single resources, but I
> shall in practice be using multiple resources which do need groups, so I
> wanted to ensure I was creating something which would work with that.
>
> I have tested it by:
>
...
>  - causing a network failure at one city (so it simply disappears without
> stopping corosync neatly): the other city continues its resources (plus the
> "anywhere" resource), the isolated city stops
>

If the site is completely isolated it probably does not matter whether
anything is active there. It is partial connectivity loss where it
becomes interesting.


Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 15:44:18, Ulrich Windl wrote:

> Hi!
> 
> Nice to hear. What could be "interesting" is how stable the WAN-type of
> corosync communication works.

Well, between cityA and cityB it should be pretty good, because these are two 
data centres on opposite sides of England run by the same hosting provider 
(with private dark fibre between them, not dependent on the Internet).

> If it's not that stable, the cluster could try to fence nodes rather
> frequently. OK, you disabled fencing; maybe it works without.

I'm going to find out :)

> Did you tune the parameters?

No:

a) I only just got it working today :)

b) I got it working on a bunch of VMs in my own personal hosting environment; 
I haven't tried it in the real data centres yet.

At the moment I regard it as a Proof of Concept to show that the design works.


Antony.

-- 
Heisenberg, Gödel, and Chomsky walk in to a bar.
Heisenberg says, "Clearly this is a joke, but how can we work out if it's 
funny or not?"
Gödel replies, "We can't know that because we're inside the joke."
Chomsky says, "Of course it's funny. You're just saying it wrong."

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote:

> On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:
> > 
> > Have you ever tried to find out why this happens? (Talking about logs)
> 
> Not in detail, no, but just in case there's a chance of getting this
> working as suggested simply using location constraints, I shall look
> further.

I now have a working solution - thank you to everyone who has helped.

The answer to the problem above was simple - with a 6-node cluster, 3 votes is 
not quorum.

I added a 7th node (in "city C") and adjusted the location constraints to 
ensure that cluster A resources run in city A, cluster B resources run in city 
B, and the "anywhere" resource runs in either city A or city B.
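The arithmetic, for anyone reading along (standard corosync majority quorum, assuming one vote per node):

```
# quorum = floor(total_votes / 2) + 1
#
# 6 nodes: quorum = 4 -> a 3-node site on its own can never be quorate
# 7 nodes: quorum = 4 -> a 3-node site plus the cityC node is quorate
```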

I've even added a colocation constraint to ensure that the "anywhere" resource 
runs on the same machine in either city A or city B as is running the local 
resources there (which wasn't a strict requirement, but is very useful).

For anyone interested in the detail of how to do this (without needing booth), 
here is my cluster.conf file, as in "crm configure load replace cluster.conf":


node tom attribute site=cityA
node dick attribute site=cityA
node harry attribute site=cityA

node fred attribute site=cityB
node george attribute site=cityB
node ron attribute site=cityB

primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta 
migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta 
migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 op 
monitor interval=5 timeout=20 on-fail=restart

group GroupA A-float meta resource-stickiness=100
group GroupB B-float meta resource-stickiness=100
group Anywhere Asterisk meta resource-stickiness=100

location pref_A GroupA rule -inf: site ne cityA
location pref_B GroupB rule -inf: site ne cityB
location no_pref Anywhere rule -inf: site ne cityA and site ne cityB

colocation Ast 100: Anywhere [ cityA cityB ]

property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop 
start-failure-is-fatal=false cluster-recheck-interval=60s


Of course, the group definitions are not needed for single resources, but I 
shall in practice be using multiple resources which do need groups, so I 
wanted to ensure I was creating something which would work with that.

I have tested it by:

 - bringing up one node at a time: as soon as any 4 nodes are running, all 
possible resources are running

 - bringing up 5 or more nodes: all resources run

 - taking down one node at a time to a maximum of three nodes offline: if at 
least one node in a given city is running, the resources at that city are 
running

 - turning off (using "halt", so that corosync dies nicely) all three nodes in 
a city simultaneously: that city's resources stop running, the other city 
continues working, as well as the "anywhere" resource

 - causing a network failure at one city (so it simply disappears without 
stopping corosync neatly): the other city continues its resources (plus the 
"anywhere" resource), the isolated city stops

For me, this is the solution I wanted, and in fact it's even slightly better 
than the previous two isolated 3-node clusters I had, because I can now have 
resources running on a single active node in cityA (provided it can see at 
least 3 other nodes in cityB or cityC), which wasn't possible before.


Once again, thanks to everyone who has helped me to achieve this result :)


Antony.

-- 
"The future is already here.   It's just not evenly distributed yet."

 - William Gibson

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-04 Thread Jan Friesse

On 03/08/2021 10:40, Antony Stone wrote:

On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:


Here is the example I had promised:

pcs node attribute server1 city=LA
pcs node attribute server2 city=NY

# Don't run on any node that is not in LA
pcs constraint location DummyRes1 rule score=-INFINITY city ne LA

#Don't run on any node that is not in NY
pcs constraint location DummyRes2 rule score=-INFINITY city ne NY

The idea is that if you add a node and you forget to specify the attribute
with the name 'city', DummyRes1 & DummyRes2 won't be started on it.

For resources that do not have a constraint based on the city -> they will
run everywhere unless you specify a colocation constraint between the
resources.


Excellent - thanks.  I happen to use crmsh rather than pcs, but I've adapted
the above and got it working.
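(For the archives, a rough crmsh equivalent of Strahil's pcs commands might look like the following. The constraint names are illustrative, and the rule syntax matches the crm constraints quoted later in this thread:)

```
crm node attribute server1 set city LA
crm node attribute server2 set city NY
crm configure location loc-DummyRes1 DummyRes1 rule -inf: city ne LA
crm configure location loc-DummyRes2 DummyRes2 rule -inf: city ne NY
```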

Unfortunately, there is a problem.

My current setup is:

One 3-machine cluster in city A running a bunch of resources between them, the
most important of which for this discussion is Asterisk telephony.

One 3-machine cluster in city B doing exactly the same thing.

The two clusters have no knowledge of each other.

I have high-availability routing between my clusters and my upstream telephony
provider, such that a call can be handled by Cluster A or Cluster B, and if
one is unavailable, the call gets routed to the other.

Thus, a total failure of Cluster A means I still get phone calls, via Cluster
B.


To implement the above "one resource which can run anywhere, but only a single
instance", I joined together clusters A and B, and placed the corresponding
location constraints on the resources I want only at A and the ones I want
only at B.  I then added the resource with no location constraint, and it runs
anywhere, just once.

So far, so good.


The problem is:

With the two independent clusters, if two machines in city A fail, then
Cluster A fails completely (no quorum), and Cluster B continues working.  That
means I still get phone calls.

With the new setup, if two machines in city A fail, then _both_ clusters stop
working and I have no functional resources anywhere.


So, my question now is:

How can I have a 3-machine Cluster A running local resources, and a 3-machine
Cluster B running local resources, plus one resource running on either Cluster
A or Cluster B, but without a failure of one cluster causing _everything_ to
stop?


Yes, it's called geo-clustering (multi-site) - 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_high_availability_clusters/assembly_configuring-multisite-cluster-configuring-and-managing-high-availability-clusters


(ignore doc being for RHEL, other distributions with booth will work 
same way)
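As a hedged illustration of what that looks like in practice (addresses, ticket name and loss-policy are examples, not a tested configuration): each site stays an independent Pacemaker cluster, a booth arbitrator at a third location grants a ticket to exactly one site, and the "run anywhere" resources are bound to that ticket:

```
# /etc/booth/booth.conf - identical on both sites and the arbitrator
transport = UDP
port = 9929
site = 192.168.32.1        # booth IP at cityA
site = 192.168.42.1        # booth IP at cityB
arbitrator = 192.168.52.1  # third location, breaks ties
ticket = "ticket-anywhere"

# in each site's Pacemaker configuration (crmsh):
rsc_ticket ast-ticket ticket-anywhere: Anywhere loss-policy=stop
```

If a site loses the ticket, its ticket-bound resources are stopped before the other site is granted the ticket, which avoids the uncoordinated stop/start window discussed above.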


Regards,
  Honza




Thanks,


Antony.





Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-03 Thread Andrei Borzenkov
On Tue, Aug 3, 2021 at 11:40 AM Antony Stone
 wrote:
>
> To implement the above "one resource which can run anywhere, but only a single
> instance", I joined together clusters A and B, and placed the corresponding
> location constraints on the resources I want only at A and the ones I want
> only at B.  I then added the resource with no location constraint, and it runs
> anywhere, just once.
>
> So far, so good.
>
>
> The problem is:
>
> With the two independent clusters, if two machines in city A fail, then
> Cluster A fails completely (no quorum), and Cluster B continues working.  That
> means I still get phone calls.
>
> With the new setup, if two machines in city A fail, then _both_ clusters stop
> working and I have no functional resources anywhere.
>

You need to provide more details. All resources running on remaining
nodes should continue to run.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-03 Thread Klaus Wenninger
On Tue, Aug 3, 2021 at 10:41 AM Antony Stone 
wrote:

> On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:
>
> > Here is the example I had promised:
> >
> > pcs node attribute server1 city=LA
> > pcs node attribute server2 city=NY
> >
> > # Don't run on any node that is not in LA
> > pcs constraint location DummyRes1 rule score=-INFINITY city ne LA
> >
> > #Don't run on any node that is not in NY
> > pcs constraint location DummyRes2 rule score=-INFINITY city ne NY
> >
> > The idea is that if you add a node and you forget to specify the
> attribute
> > with the name 'city', DummyRes1 & DummyRes2 won't be started on it.
> >
> > For resources that do not have a constraint based on the city -> they
> will
> > run everywhere unless you specify a colocation constraint between the
> > resources.
>
> Excellent - thanks.  I happen to use crmsh rather than pcs, but I've
> adapted
> the above and got it working.
>
> Unfortunately, there is a problem.
>
> My current setup is:
>
> One 3-machine cluster in city A running a bunch of resources between them,
> the
> most important of which for this discussion is Asterisk telephony.
>
> One 3-machine cluster in city B doing exactly the same thing.
>
> The two clusters have no knowledge of each other.
>
> I have high-availability routing between my clusters and my upstream
> telephony
> provider, such that a call can be handled by Cluster A or Cluster B, and
> if
> one is unavailable, the call gets routed to the other.
>
> Thus, a total failure of Cluster A means I still get phone calls, via
> Cluster
> B.
>
>
> To implement the above "one resource which can run anywhere, but only a
> single
> instance", I joined together clusters A and B, and placed the
> corresponding
> location constraints on the resources I want only at A and the ones I want
> only at B.  I then added the resource with no location constraint, and it
> runs
> anywhere, just once.
>
> So far, so good.
>
>
> The problem is:
>
> With the two independent clusters, if two machines in city A fail, then
> Cluster A fails completely (no quorum), and Cluster B continues working.
> That
> means I still get phone calls.
>
> With the new setup, if two machines in city A fail, then _both_ clusters
> stop
> working and I have no functional resources anywhere.
>
Why that? If you are talking about quorum, a 4-node partition in a 6-node
cluster should be quorate.
Not saying the config is ideal though. Even node number ...
And when city A doesn't see city B you end up with 2 3-node partitions
that aren't quorate without additional measures.
Did you consider booth? Might really be a better match for your problem.

Klaus

>
>
> So, my question now is:
>
> How can I have a 3-machine Cluster A running local resources, and a
> 3-machine
> Cluster B running local resources, plus one resource running on either
> Cluster
> A or Cluster B, but without a failure of one cluster causing _everything_
> to
> stop?
>
>
> Thanks,
>
>
> Antony.
>
> --
> One tequila, two tequila, three tequila, floor.
>
>Please reply to the
> list;
>  please *don't* CC
> me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-03 Thread Vladislav Bogdanov

Hi
You probably want to look at booth and tickets for a geo-clustering solution.


On August 3, 2021 11:40:54 AM Antony Stone  
wrote:



On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:


Here is the example I had promised:

pcs node attribute server1 city=LA
pcs node attribute server2 city=NY

# Don't run on any node that is not in LA
pcs constraint location DummyRes1 rule score=-INFINITY city ne LA

#Don't run on any node that is not in NY
pcs constraint location DummyRes2 rule score=-INFINITY city ne NY

The idea is that if you add a node and you forget to specify the attribute
with the name 'city', DummyRes1 & DummyRes2 won't be started on it.

For resources that do not have a constraint based on the city -> they will
run everywhere unless you specify a colocation constraint between the
resources.


Excellent - thanks.  I happen to use crmsh rather than pcs, but I've adapted
the above and got it working.

Unfortunately, there is a problem.

My current setup is:

One 3-machine cluster in city A running a bunch of resources between them, the
most important of which for this discussion is Asterisk telephony.

One 3-machine cluster in city B doing exactly the same thing.

The two clusters have no knowledge of each other.

I have high-availability routing between my clusters and my upstream telephony
provider, such that a call can be handled by Cluster A or Cluster B, and if
one is unavailable, the call gets routed to the other.

Thus, a total failure of Cluster A means I still get phone calls, via Cluster
B.


To implement the above "one resource which can run anywhere, but only a single
instance", I joined together clusters A and B, and placed the corresponding
location constraints on the resources I want only at A and the ones I want
only at B.  I then added the resource with no location constraint, and it runs
anywhere, just once.

So far, so good.


The problem is:

With the two independent clusters, if two machines in city A fail, then
Cluster A fails completely (no quorum), and Cluster B continues working.  That
means I still get phone calls.

With the new setup, if two machines in city A fail, then _both_ clusters stop
working and I have no functional resources anywhere.


So, my question now is:

How can I have a 3-machine Cluster A running local resources, and a 3-machine
Cluster B running local resources, plus one resource running on either Cluster
A or Cluster B, but without a failure of one cluster causing _everything_ to
stop?


Thanks,


Antony.

--
One tequila, two tequila, three tequila, floor.

  Please reply to the list;
please *don't* CC me.




Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-03 Thread Antony Stone
On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:

> Here is the example I had promised:
>
> pcs node attribute server1 city=LA
> pcs node attribute server2 city=NY
>
> # Don't run on any node that is not in LA
> pcs constraint location DummyRes1 rule score=-INFINITY city ne LA
> 
> #Don't run on any node that is not in NY
> pcs constraint location DummyRes2 rule score=-INFINITY city ne NY
>
> The idea is that if you add a node and you forget to specify the attribute
> with the name 'city' , DummyRes1 & DummyRes2 won't be started on it.
> 
> For resources that do not have a constraint based on the city -> they will
> run everywhere unless you specify a colocation constraint between the
> resources.

Excellent - thanks.  I happen to use crmsh rather than pcs, but I've adapted 
the above and got it working.
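A crmsh translation of the quoted pcs commands might look roughly like the sketch below. The constraint IDs are invented for illustration, and the syntax should be checked against your crmsh version:

```
# Hypothetical crmsh equivalents of the pcs example quoted above
crm node attribute server1 set city LA
crm node attribute server2 set city NY
# Don't run on any node that is not in LA / NY:
crm configure location only-LA DummyRes1 rule -inf: city ne LA
crm configure location only-NY DummyRes2 rule -inf: city ne NY
```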

Unfortunately, there is a problem.

My current setup is:

One 3-machine cluster in city A running a bunch of resources between them, the 
most important of which for this discussion is Asterisk telephony.

One 3-machine cluster in city B doing exactly the same thing.

The two clusters have no knowledge of each other.

I have high-availability routing between my clusters and my upstream telephony 
provider, such that a call can be handled by Cluster A or Cluster B, and if 
one is unavailable, the call gets routed to the other.

Thus, a total failure of Cluster A means I still get phone calls, via Cluster 
B.


To implement the above "one resource which can run anywhere, but only a single 
instance", I joined together clusters A and B, and placed the corresponding 
location constraints on the resources I want only at A and the ones I want 
only at B.  I then added the resource with no location constraint, and it runs 
anywhere, just once.

So far, so good.


The problem is:

With the two independent clusters, if two machines in city A fail, then 
Cluster A fails completely (no quorum), and Cluster B continues working.  That 
means I still get phone calls.

With the new setup, if two machines in city A fail, then _both_ clusters stop 
working and I have no functional resources anywhere.


So, my question now is:

How can I have a 3-machine Cluster A running local resources, and a 3-machine 
Cluster B running local resources, plus one resource running on either Cluster 
A or Cluster B, but without a failure of one cluster causing _everything_ to 
stop?
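The configuration this thread eventually converges on (quoted in the later follow-up nearer the top of this archive) is roughly the following, in crmsh terms; resource and attribute names are as quoted there:

```
# Keep each city's group strictly at its own site...
location pref_A GroupA rule -inf: site ne cityA
location pref_B GroupB rule -inf: site ne cityB
# ...and run the shared group on whichever node hosts GroupA or GroupB:
colocation Ast 100: Anywhere [ GroupA GroupB ]
```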


Thanks,


Antony.

-- 
One tequila, two tequila, three tequila, floor.

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-11 Thread Strahil Nikolov
Here is the example I had promised:
pcs node attribute server1 city=LA
pcs node attribute server2 city=NY

# Don't run on any node that is not in LA
pcs constraint location DummyRes1 rule score=-INFINITY city ne LA

# Don't run on any node that is not in NY
pcs constraint location DummyRes2 rule score=-INFINITY city ne NY
The idea is that if you add a node and you forget to specify the attribute with
the name 'city', DummyRes1 & DummyRes2 won't be started on it.

For resources that do not have a constraint based on the city -> they will run 
everywhere unless you specify a colocation constraint between the resources.
Best Regards,
Strahil Nikolov

On Mon, May 10, 2021 at 17:53, Antony Stone wrote:

On Monday 10 May 2021 at 16:49:07, Strahil Nikolov wrote:

> You can use  node attributes to define in which  city is each host and then
> use a location constraint to control in which city to run/not run the
> resources. I will try to provide an example tomorrow.

Thank you - that would be helpful.

I did think that a location constraint could be a way to do this, but I wasn't 
sure how to label three machines in one cluster as a "single location".

Any pointers most welcome :)

>  On Mon, May 10, 2021 at 15:52, Antony Stone wrote:
> >  On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:
> > On 5/10/21 2:32 PM, Antony Stone wrote:
> > > Hi.
> > > 
> > > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the
> > > following way:
> > > 
> > > I have two separate clusters of three machines each, one in a data
> > > centre in city A, and one in a data centre in city B.
> > > 
> > > Several of the resources being managed by these clusters are based on
> > > floating IP addresses, which are tied to the data centre, therefore the
> > > resources in city A can run on any of the three machines there (alfa,
> > > bravo and charlie), but cannot run on any machine in city B (delta,
> > > echo and foxtrot).
> > > 
> > > I now have a need to create a couple of additional resources which can
> > > operate from anywhere, so I'm wondering if there is a way to configure
> > > corosync / pacemaker so that:
> > > 
> > > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > > and Z between them.
> > > 
> > > Machines delta, echo and foxtrot live in city B and manage resources U,
> > > V and W between them.
> > > 
> > > All of alfa to foxtrot are also in a "super-cluster" managing
> > > resources P and Q, so these two can be running on any of the 6
> > > machines.
> > > 
> > > 
> > > I hope the question is clear.  Is there an answer :) ?
> > 
> > Sounds like a use-case for https://github.com/ClusterLabs/booth
> 
> Interesting - hadn't come across that feature before.
> 
> Thanks - I'll look into further documentation.
> 
> If anyone else has any other suggestions I'm happy to see whether something
> else might work better for my setup.
> 
> 
> Antony.

-- 
90% of networking problems are routing problems.
9 of the remaining 10% are routing problems in the other direction.
The remaining 1% might be something else, but check the routing anyway.

                                                  Please reply to the list;
                                                        please *don't* CC me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Strahil Nikolov
You can use  node attributes to define in which  city is each host and then use 
a location constraint to control in which city to run/not run the resources.
I will try to provide an example tomorrow.
Best Regards,
Strahil Nikolov

On Mon, May 10, 2021 at 15:52, Antony Stone wrote:

On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:

> On 5/10/21 2:32 PM, Antony Stone wrote:
> > Hi.
> > 
> > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following
> > way:
> > 
> > I have two separate clusters of three machines each, one in a data centre
> > in city A, and one in a data centre in city B.
> > 
> > Several of the resources being managed by these clusters are based on
> > floating IP addresses, which are tied to the data centre, therefore the
> > resources in city A can run on any of the three machines there (alfa,
> > bravo and charlie), but cannot run on any machine in city B (delta, echo
> > and foxtrot).
> > 
> > I now have a need to create a couple of additional resources which can
> > operate from anywhere, so I'm wondering if there is a way to configure
> > corosync / pacemaker so that:
> > 
> > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > and Z between them.
> > 
> > Machines delta, echo and foxtrot live in city B and manage resources U, V
> > and W between them.
> > 
> > All of alfa to foxtrot are also in a "super-cluster" managing resources
> > P and Q, so these two can be running on any of the 6 machines.
> > 
> > 
> > I hope the question is clear.  Is there an answer :) ?
> 
> Sounds like a use-case for https://github.com/ClusterLabs/booth

Interesting - hadn't come across that feature before.

Thanks - I'll look into further documentation.

If anyone else has any other suggestions I'm happy to see whether something 
else might work better for my setup.


Antony.

-- 
What do you get when you cross a joke with a rhetorical question?

                                                  Please reply to the list;
                                                        please *don't* CC me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Antony Stone
On Monday 10 May 2021 at 16:49:07, Strahil Nikolov wrote:

> You can use  node attributes to define in which  city is each host and then
> use a location constraint to control in which city to run/not run the
> resources. I will try to provide an example tomorrow.

Thank you - that would be helpful.

I did think that a location constraint could be a way to do this, but I wasn't 
sure how to label three machines in one cluster as a "single location".

Any pointers most welcome :)

>   On Mon, May 10, 2021 at 15:52, Antony Stone wrote:
> >   On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:
> > On 5/10/21 2:32 PM, Antony Stone wrote:
> > > Hi.
> > > 
> > > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the
> > > following way:
> > > 
> > > I have two separate clusters of three machines each, one in a data
> > > centre in city A, and one in a data centre in city B.
> > > 
> > > Several of the resources being managed by these clusters are based on
> > > floating IP addresses, which are tied to the data centre, therefore the
> > > resources in city A can run on any of the three machines there (alfa,
> > > bravo and charlie), but cannot run on any machine in city B (delta,
> > > echo and foxtrot).
> > > 
> > > I now have a need to create a couple of additional resources which can
> > > operate from anywhere, so I'm wondering if there is a way to configure
> > > corosync / pacemaker so that:
> > > 
> > > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > > and Z between them.
> > > 
> > > Machines delta, echo and foxtrot live in city B and manage resources U,
> > > V and W between them.
> > > 
> > > All of alpha to foxtrot are also in a "super-cluster" managing
> > > resources P and Q, so these two can be running on any of the 6
> > > machines.
> > > 
> > > 
> > > I hope the question is clear.  Is there an answer :) ?
> > 
> > Sounds like a use-case for https://github.com/ClusterLabs/booth
> 
> Interesting - hadn't come across that feature before.
> 
> Thanks - I'll look into further documentation.
> 
> If anyone else has any other suggestions I'm happy to see whether something
> else might work better for my setup.
> 
> 
> Antony.

-- 
90% of networking problems are routing problems.
9 of the remaining 10% are routing problems in the other direction.
The remaining 1% might be something else, but check the routing anyway.

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Antony Stone
On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:

> On 5/10/21 2:32 PM, Antony Stone wrote:
> > Hi.
> > 
> > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following
> > way:
> > 
> > I have two separate clusters of three machines each, one in a data centre
> > in city A, and one in a data centre in city B.
> > 
> > Several of the resources being managed by these clusters are based on
> > floating IP addresses, which are tied to the data centre, therefore the
> > resources in city A can run on any of the three machines there (alfa,
> > bravo and charlie), but cannot run on any machine in city B (delta, echo
> > and foxtrot).
> > 
> > I now have a need to create a couple of additional resources which can
> > operate from anywhere, so I'm wondering if there is a way to configure
> > corosync / pacemaker so that:
> > 
> > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > and Z between them.
> > 
> > Machines delta, echo and foxtrot live in city B and manage resources U, V
> > and W between them.
> > 
> > All of alfa to foxtrot are also in a "super-cluster" managing resources
> > P and Q, so these two can be running on any of the 6 machines.
> > 
> > 
> > I hope the question is clear.  Is there an answer :) ?
> 
> Sounds like a use-case for https://github.com/ClusterLabs/booth

Interesting - hadn't come across that feature before.

Thanks - I'll look into further documentation.

If anyone else has any other suggestions I'm happy to see whether something 
else might work better for my setup.


Antony.

-- 
What do you get when you cross a joke with a rhetorical question?

   Please reply to the list;
 please *don't* CC me.


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Klaus Wenninger

On 5/10/21 2:32 PM, Antony Stone wrote:

Hi.

I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following way:

I have two separate clusters of three machines each, one in a data centre in
city A, and one in a data centre in city B.

Several of the resources being managed by these clusters are based on floating
IP addresses, which are tied to the data centre, therefore the resources in
city A can run on any of the three machines there (alfa, bravo and charlie),
but cannot run on any machine in city B (delta, echo and foxtrot).

I now have a need to create a couple of additional resources which can operate
from anywhere, so I'm wondering if there is a way to configure corosync /
pacemaker so that:

Machines alfa, bravo, charlie live in city A and manage resources X, Y and Z
between them.

Machines delta, echo and foxtrot live in city B and manage resources U, V and
W between them.

All of alfa to foxtrot are also in a "super-cluster" managing resources P and
Q, so these two can be running on any of the 6 machines.


I hope the question is clear.  Is there an answer :) ?

Sounds like a use-case for https://github.com/ClusterLabs/booth

Klaus



Thanks,


Antony.





[ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Antony Stone
Hi.

I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following way:

I have two separate clusters of three machines each, one in a data centre in 
city A, and one in a data centre in city B.

Several of the resources being managed by these clusters are based on floating 
IP addresses, which are tied to the data centre, therefore the resources in 
city A can run on any of the three machines there (alfa, bravo and charlie), 
but cannot run on any machine in city B (delta, echo and foxtrot).

I now have a need to create a couple of additional resources which can operate 
from anywhere, so I'm wondering if there is a way to configure corosync / 
pacemaker so that:

Machines alfa, bravo, charlie live in city A and manage resources X, Y and Z 
between them.

Machines delta, echo and foxtrot live in city B and manage resources U, V and 
W between them.

All of alfa to foxtrot are also in a "super-cluster" managing resources P and 
Q, so these two can be running on any of the 6 machines.


I hope the question is clear.  Is there an answer :) ?


Thanks,


Antony.

-- 
Ramdisk is not an installation procedure.

   Please reply to the list;
 please *don't* CC me.