Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
> Because Asterisk at cityA is bound to a floating IP address, which is held
> on one of the three machines in cityA. I can't run Asterisk on all three
> machines there because only one of them has the IP address.

That's not true. You can use a cloned IP resource with 'globally-unique=true', which runs the IP everywhere; the cluster determines which node responds (controlled via iptables) and the others never reply. It's quite useful for reducing failover time.

Best Regards,
Strahil Nikolov

___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
On Friday 06 August 2021 at 15:12:57, Andrei Borzenkov wrote:

> On Fri, Aug 6, 2021 at 3:42 PM Antony Stone wrote:
> > On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote:
> > >
> > > If connectivity between (any two) sites is lost you may end up with
> > > one of A or B going out of quorum.
> >
> > Agreed.
> >
> > > While this will stop active resources and restart them on another site,
> >
> > No. Resources do not start on the "wrong" site because of:
> >
> > location pref_A GroupA rule -inf: site ne cityA
> > location pref_B GroupB rule -inf: site ne cityB
> >
> > The resources in GroupA either run in cityA or they do not run at all.
>
> Where did I say anything about group A or B? You have single resource
> that can migrate between sites
>
> location no_pref Anywhere rule -inf: site ne cityA and site ne cityB

In fact that rule turns out to be unnecessary, because of:

colocation Ast 100: Anywhere [ GroupA GroupB ]

(apologies for the typo the first time I posted that, corrected in my previous reply to this one).

This ensures that the "Anywhere" resource group runs either on the machine which is running the "GroupA" group or the one which is running the "GroupB" group. This is an added bonus which I find useful, so that only one machine at each site is running all the resources at that site.

> I have no idea what "Asterisk in cityA" means because I see only one
> resource named Asterisk which is not restricted to a single site
> according to your configuration.

Ah, I see the confusion. I used Asterisk as a simple resource in my example, as the thing I wanted to run just once, somewhere.

In fact, for the real setup, where GroupA and GroupB each comprise 10 resources, and the Anywhere group comprises two, Asterisk is one of the 10 resources which do run at both sites.

> The only resource that allegedly can migrate between sites in
> configuration you have shown so far is Asterisk.

Yes, in my example documented here.
> Now you say this resource never migrates between sites.

Yes, for my real configuration, which contains 10 resources (one of which is Asterisk) in each of GroupA and GroupB, and is therefore too complicated to quote as a proof-of-concept here.

> I'm not sure how helpful this will be to anyone reading archives because I
> completely lost all track of what you tried to achieve.

That can be expressed very simply:

1. A group of resources named GroupA which either run in cityA or do not run at all.

2. A group of resources named GroupB which either run in cityB or do not run at all.

3. A group of resources named Anywhere which run in either cityA or cityB but not both.

Antony.

--
Numerous psychological studies over the years have demonstrated that the majority of people genuinely believe they are not like the majority of people.

Please reply to the list; please *don't* CC me.
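The three requirements above, together with the location rules and the colocation set quoted earlier in this message, can be sketched as a small Python model. This is only an illustration of the intended outcome, not of how Pacemaker actually computes scores. The node names come from the posted configuration; "node7" and its site value "cityC" are hypothetical stand-ins for the unnamed seventh node, and the function names are made up for this sketch.

```python
# Toy model of the location rules; this is NOT how Pacemaker scores nodes.
# Node names come from the posted config. "node7" and its site value are
# hypothetical stand-ins for the unnamed seventh node in city C.
NODE_SITE = {
    "tom": "cityA", "dick": "cityA", "harry": "cityA",
    "fred": "cityB", "george": "cityB", "ron": "cityB",
    "node7": "cityC",
}

# Each rule returns True when a node with that site is banned (score -inf).
RULES = {
    "GroupA": lambda site: site != "cityA",
    "GroupB": lambda site: site != "cityB",
    "Anywhere": lambda site: site != "cityA" and site != "cityB",
}


def eligible_nodes(group):
    """Return the sorted names of nodes on which the group may run."""
    banned = RULES[group]
    return sorted(name for name, site in NODE_SITE.items() if not banned(site))


def anywhere_candidates(group_a_node, group_b_node):
    """Model the colocation set: Anywhere follows whichever of GroupA or
    GroupB is currently running. None means that group is stopped."""
    return [node for node in (group_a_node, group_b_node) if node is not None]


for group in RULES:
    print(group, eligible_nodes(group))

# With GroupA on tom and GroupB stopped, Anywhere can only follow GroupA.
print(anywhere_candidates("tom", None))
```

In this model the seventh node is never eligible for any group, which matches the description of it as a pure tie-breaker for quorum.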
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
On Fri, Aug 6, 2021 at 3:42 PM Antony Stone wrote: > > On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote: > > > On Thu, Aug 5, 2021 at 3:44 PM Antony Stone wrote: > > > > > > For anyone interested in the detail of how to do this (without needing > > > booth), here is my cluster.conf file, as in "crm configure load replace > > > cluster.conf": > > > > > > > > > node tom attribute site=cityA > > > node dick attribute site=cityA > > > node harry attribute site=cityA > > > > > > node fred attribute site=cityB > > > node george attribute site=cityB > > > node ron attribute site=cityB > > > > > > primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta > > > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 > > > on- fail=restart > > > primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta > > > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 > > > on- fail=restart > > > primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 > > > op monitor interval=5 timeout=20 on-fail=restart > > > > > > group GroupA A-float4 resource-stickiness=100 > > > group GroupB B-float4 resource-stickiness=100 > > > group Anywhere Asterisk resource-stickiness=100 > > > > > > location pref_A GroupA rule -inf: site ne cityA > > > location pref_B GroupB rule -inf: site ne cityB > > > location no_pref Anywhere rule -inf: site ne cityA and site ne cityB > > > > > > colocation Ast 100: Anywhere [ cityA cityB ] > > > > You define a resource set, but there are no resources cityA or cityB, > > at least you do not show them. So it is not quite clear what this > > colocation does. > > Apologies - I had used different names in my test setup, and converted them to > cityA etc for the sake of continuity in this discussion. 
> > That should be: > > colocation Ast 100: Anywhere [ GroupA GroupB ] > > > > property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop > > > > If connectivity between (any two) sites is lost you may end up with > > one of A or B going out of quorum. > > Agreed. > > > While this will stop active resources and restart them on another site, > > No. Resources do not start on the "wrong" site because of: > > location pref_A GroupA rule -inf: site ne cityA > location pref_B GroupB rule -inf: site ne cityB > > The resources in GroupA either run in cityA or they do not run at all. > Where did I say anything about group A or B? You have single resource that can migrate between sites location no_pref Anywhere rule -inf: site ne cityA and site ne cityB > > there is no coordination between stopping and starting so for some time > > resources will be active on both sites. It is up to you to evaluate whether > > this matters. > > Any resource which tried to start at the wrong site would simply fail, because > the IP addresses involved do not work at the "other" site. > > > If this matters your solution does not protect against it. > > > > If this does not matter, the usual response is - why do you need a > > cluster in the first place? Why not simply always run asterisk on both > > sites all the time? > > Because Asterisk at cityA is bound to a floating IP address, which is held on > one of the three machines in cityA. I can't run Asterisk on all three > machines there because only one of them has the IP address. > I have no idea what "Asterisk in cityA'' means because I see only one resource named Asterisk which is not restricted to a single site according to your configuration. > Asterisk _does_ normally run on both sites all the time, but only on one > machine at each site. > The only resource that allegedly can migrate between sites in configuration you have shown so far is Asterisk. Now you say this resource never migrates between sites. 
I'm not sure how helpful this will be to anyone reading archives because I completely lost all track of what you tried to achieve. > > > start-failure-is-fatal=false cluster-recheck-interval=60s > > > > > > > > > Of course, the group definitions are not needed for single resources, but > > > I shall in practice be using multiple resources which do need groups, so > > > I wanted to ensure I was creating something which would work with that. > > > > > I have tested it by: > > ... > > > - causing a network failure at one city (so it simply disappears without > > > stopping corosync neatly): the other city continues its resources (plus > > > the "anywhere" resource), the isolated city stops > > > > If the site is completely isolated it probably does not matter whether > > anything is active there. It is partial connectivity loss where it > > becomes interesting. > > Agreed, however my testing shows that resources which I want running in cityA > are either running there or they're not (they never move to cityB or cityC), > similarly for cityB, and the resources I want just a single instance of are > doing just that, and on the
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote:

> On Thu, Aug 5, 2021 at 3:44 PM Antony Stone wrote:
> >
> > For anyone interested in the detail of how to do this (without needing
> > booth), here is my cluster.conf file, as in "crm configure load replace
> > cluster.conf":
> >
> > node tom attribute site=cityA
> > node dick attribute site=cityA
> > node harry attribute site=cityA
> >
> > node fred attribute site=cityB
> > node george attribute site=cityB
> > node ron attribute site=cityB
> >
> > primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta
> > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> > on-fail=restart
> > primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta
> > migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> > on-fail=restart
> > primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60
> > op monitor interval=5 timeout=20 on-fail=restart
> >
> > group GroupA A-float4 resource-stickiness=100
> > group GroupB B-float4 resource-stickiness=100
> > group Anywhere Asterisk resource-stickiness=100
> >
> > location pref_A GroupA rule -inf: site ne cityA
> > location pref_B GroupB rule -inf: site ne cityB
> > location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
> >
> > colocation Ast 100: Anywhere [ cityA cityB ]
>
> You define a resource set, but there are no resources cityA or cityB,
> at least you do not show them. So it is not quite clear what this
> colocation does.

Apologies - I had used different names in my test setup, and converted them to cityA etc for the sake of continuity in this discussion.

That should be:

colocation Ast 100: Anywhere [ GroupA GroupB ]

> > property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop
>
> If connectivity between (any two) sites is lost you may end up with
> one of A or B going out of quorum.

Agreed.

> While this will stop active resources and restart them on another site,

No. Resources do not start on the "wrong" site because of:

location pref_A GroupA rule -inf: site ne cityA
location pref_B GroupB rule -inf: site ne cityB

The resources in GroupA either run in cityA or they do not run at all.

> there is no coordination between stopping and starting so for some time
> resources will be active on both sites. It is up to you to evaluate whether
> this matters.

Any resource which tried to start at the wrong site would simply fail, because the IP addresses involved do not work at the "other" site.

> If this matters your solution does not protect against it.
>
> If this does not matter, the usual response is - why do you need a
> cluster in the first place? Why not simply always run asterisk on both
> sites all the time?

Because Asterisk at cityA is bound to a floating IP address, which is held on one of the three machines in cityA. I can't run Asterisk on all three machines there because only one of them has the IP address.

Asterisk _does_ normally run on both sites all the time, but only on one machine at each site.

> > start-failure-is-fatal=false cluster-recheck-interval=60s
> >
> > Of course, the group definitions are not needed for single resources, but
> > I shall in practice be using multiple resources which do need groups, so
> > I wanted to ensure I was creating something which would work with that.
> >
> > I have tested it by:
> ...
> > - causing a network failure at one city (so it simply disappears without
> > stopping corosync neatly): the other city continues its resources (plus
> > the "anywhere" resource), the isolated city stops
>
> If the site is completely isolated it probably does not matter whether
> anything is active there. It is partial connectivity loss where it
> becomes interesting.

Agreed, however my testing shows that resources which I want running in cityA are either running there or they're not (they never move to cityB or cityC), similarly for cityB, and the resources I want just a single instance of are doing just that, and on the same machine at cityA or cityB as the local resources are running on.

Thanks for the feedback,

Antony.

--
"Measuring average network latency is about as useful as measuring the mean temperature of patients in a hospital."

 - Stéphane Bortzmeyer

Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
On Thu, Aug 5, 2021 at 3:44 PM Antony Stone wrote:
>
> On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote:
>
> > On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:
> > >
> > > Have you ever tried to find out why this happens? (Talking about logs)
> >
> > Not in detail, no, but just in case there's a chance of getting this
> > working as suggested simply using location constraints, I shall look
> > further.
>
> I now have a working solution - thank you to everyone who has helped.
>
> The answer to the problem above was simple - with a 6-node cluster, 3 votes is
> not quorum.
>
> I added a 7th node (in "city C") and adjusted the location constraints to
> ensure that cluster A resources run in city A, cluster B resources run in city
> B, and the "anywhere" resource runs in either city A or city B.
>
> I've even added a colocation constraint to ensure that the "anywhere" resource
> runs on the same machine in either city A or city B as is running the local
> resources there (which wasn't a strict requirement, but is very useful).
>
> For anyone interested in the detail of how to do this (without needing booth),
> here is my cluster.conf file, as in "crm configure load replace cluster.conf":
>
> node tom attribute site=cityA
> node dick attribute site=cityA
> node harry attribute site=cityA
>
> node fred attribute site=cityB
> node george attribute site=cityB
> node ron attribute site=cityB
>
> primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta
> migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> on-fail=restart
> primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta
> migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20
> on-fail=restart
> primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 op
> monitor interval=5 timeout=20 on-fail=restart
>
> group GroupA A-float4 resource-stickiness=100
> group GroupB B-float4 resource-stickiness=100
> group Anywhere Asterisk resource-stickiness=100
>
> location pref_A GroupA rule -inf: site ne cityA
> location pref_B GroupB rule -inf: site ne cityB
> location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
>
> colocation Ast 100: Anywhere [ cityA cityB ]
>

You define a resource set, but there are no resources cityA or cityB, at least you do not show them. So it is not quite clear what this colocation does.

> property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop

If connectivity between (any two) sites is lost you may end up with one of A or B going out of quorum. While this will stop active resources and restart them on another site, there is no coordination between stopping and starting so for some time resources will be active on both sites. It is up to you to evaluate whether this matters.

If this matters your solution does not protect against it.

If this does not matter, the usual response is - why do you need a cluster in the first place? Why not simply always run asterisk on both sites all the time?

> start-failure-is-fatal=false cluster-recheck-interval=60s
>
> Of course, the group definitions are not needed for single resources, but I
> shall in practice be using multiple resources which do need groups, so I
> wanted to ensure I was creating something which would work with that.
>
> I have tested it by:
>
...
> - causing a network failure at one city (so it simply disappears without
> stopping corosync neatly): the other city continues its resources (plus the
> "anywhere" resource), the isolated city stops
>

If the site is completely isolated it probably does not matter whether anything is active there. It is partial connectivity loss where it becomes interesting.
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
On Thursday 05 August 2021 at 15:44:18, Ulrich Windl wrote:

> Hi!
>
> Nice to hear. What could be "interesting" is how stable the WAN-type of
> corosync communication works.

Well, between cityA and cityB it should be pretty good, because these are two data centres on opposite sides of England run by the same hosting provider (with private dark fibre between them, not dependent on the Internet).

> If it's not that stable, the cluster could try to fence nodes rather
> frequently. OK, you disabled fencing; maybe it works without.

I'm going to find out :)

> Did you tune the parameters?

No:

a) I only just got it working today :)

b) I got it working on a bunch of VMs in my own personal hosting environment; I haven't tried it in the real data centres yet.

At the moment I regard it as a Proof of Concept to show that the design works.

Antony.

--
Heisenberg, Gödel, and Chomsky walk in to a bar.
Heisenberg says, "Clearly this is a joke, but how can we work out if it's funny or not?"
Gödel replies, "We can't know that because we're inside the joke."
Chomsky says, "Of course it's funny. You're just saying it wrong."

Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote:

> On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:
> >
> > Have you ever tried to find out why this happens? (Talking about logs)
>
> Not in detail, no, but just in case there's a chance of getting this
> working as suggested simply using location constraints, I shall look
> further.

I now have a working solution - thank you to everyone who has helped.

The answer to the problem above was simple - with a 6-node cluster, 3 votes is not quorum.

I added a 7th node (in "city C") and adjusted the location constraints to ensure that cluster A resources run in city A, cluster B resources run in city B, and the "anywhere" resource runs in either city A or city B.

I've even added a colocation constraint to ensure that the "anywhere" resource runs on the same machine in either city A or city B as is running the local resources there (which wasn't a strict requirement, but is very useful).

For anyone interested in the detail of how to do this (without needing booth), here is my cluster.conf file, as in "crm configure load replace cluster.conf":

node tom attribute site=cityA
node dick attribute site=cityA
node harry attribute site=cityA

node fred attribute site=cityB
node george attribute site=cityB
node ron attribute site=cityB

primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart

group GroupA A-float4 resource-stickiness=100
group GroupB B-float4 resource-stickiness=100
group Anywhere Asterisk resource-stickiness=100

location pref_A GroupA rule -inf: site ne cityA
location pref_B GroupB rule -inf: site ne cityB
location no_pref Anywhere rule -inf: site ne cityA and site ne cityB

colocation Ast 100: Anywhere [ cityA cityB ]

property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop start-failure-is-fatal=false cluster-recheck-interval=60s

Of course, the group definitions are not needed for single resources, but I shall in practice be using multiple resources which do need groups, so I wanted to ensure I was creating something which would work with that.

I have tested it by:

- bringing up one node at a time: as soon as any 4 nodes are running, all possible resources are running

- bringing up 5 or more nodes: all resources run

- taking down one node at a time to a maximum of three nodes offline: if at least one node in a given city is running, the resources at that city are running

- turning off (using "halt", so that corosync dies nicely) all three nodes in a city simultaneously: that city's resources stop running, the other city continues working, as well as the "anywhere" resource

- causing a network failure at one city (so it simply disappears without stopping corosync neatly): the other city continues its resources (plus the "anywhere" resource), the isolated city stops

For me, this is the solution I wanted, and in fact it's even slightly better than the previous two isolated 3-node clusters I had, because I can now have resources running on a single active node in cityA (provided it can see at least 3 other nodes in cityB or cityC), which wasn't possible before.

Once again, thanks to everyone who has helped me to achieve this result :)

Antony.

--
"The future is already here. It's just not evenly distributed yet."

 - William Gibson

Please reply to the list; please *don't* CC me.
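The quorum arithmetic behind this result can be sketched in a few lines of Python. This is a minimal illustration, assuming one vote per node and a plain majority rule with no special votequorum options enabled; the function names are invented for this sketch and are not part of corosync or Pacemaker.

```python
def votes_needed(expected_votes: int) -> int:
    """Votes required for a strict majority, assuming one vote per node."""
    return expected_votes // 2 + 1


def has_quorum(online_nodes: int, expected_votes: int) -> bool:
    """True when the nodes that can see each other hold a majority."""
    return online_nodes >= votes_needed(expected_votes)


# Compare the original 6-node cluster with the 7-node one.
for expected in (6, 7):
    print(f"{expected} nodes: quorum needs {votes_needed(expected)} votes")

# Walk through the "bring up one node at a time" test on 7 nodes.
for online in range(1, 8):
    state = "quorate" if has_quorum(online, 7) else "no quorum"
    print(f"{online} of 7 nodes online: {state}")
```

Under these assumptions both cluster sizes need 4 votes, which is why three surviving nodes out of six were not enough, while a single cityA node that can see three other nodes is.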
Re: [ClusterLabs] Sub-clusters / super-clusters?
On 03/08/2021 10:40, Antony Stone wrote:
> On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:
>
> > Here is the example I had promised:
> >
> > pcs node attribute server1 city=LA
> > pcs node attribute server2 city=NY
> >
> > # Don't run on any node that is not in LA
> > pcs constraint location DummyRes1 rule score=-INFINITY city ne LA
> >
> > # Don't run on any node that is not in NY
> > pcs constraint location DummyRes2 rule score=-INFINITY city ne NY
> >
> > The idea is that if you add a node and you forget to specify the attribute
> > with the name 'city' , DummyRes1 & DummyRes2 won't be started on it.
> >
> > For resources that do not have a constraint based on the city -> they will
> > run everywhere unless you specify a colocation constraint between the
> > resources.
>
> Excellent - thanks. I happen to use crmsh rather than pcs, but I've adapted
> the above and got it working.
>
> Unfortunately, there is a problem.
>
> My current setup is:
>
> One 3-machine cluster in city A running a bunch of resources between them, the
> most important of which for this discussion is Asterisk telephony.
>
> One 3-machine cluster in city B doing exactly the same thing.
>
> The two clusters have no knowledge of each other.
>
> I have high-availability routing between my clusters and my upstream telephony
> provider, such that a call can be handled by Cluster A or Cluster B, and if
> one is unavailable, the call gets routed to the other.
>
> Thus, a total failure of Cluster A means I still get phone calls, via Cluster
> B.
>
> To implement the above "one resource which can run anywhere, but only a single
> instance", I joined together clusters A and B, and placed the corresponding
> location constraints on the resources I want only at A and the ones I want
> only at B. I then added the resource with no location constraint, and it runs
> anywhere, just once.
>
> So far, so good.
>
> The problem is:
>
> With the two independent clusters, if two machines in city A fail, then
> Cluster A fails completely (no quorum), and Cluster B continues working. That
> means I still get phone calls.
>
> With the new setup, if two machines in city A fail, then _both_ clusters stop
> working and I have no functional resources anywhere.
>
> So, my question now is:
>
> How can I have a 3-machine Cluster A running local resources, and a 3-machine
> Cluster B running local resources, plus one resource running on either Cluster
> A or Cluster B, but without a failure of one cluster causing _everything_ to
> stop?

Yes, it's called geo-clustering (multi-site) - https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_high_availability_clusters/assembly_configuring-multisite-cluster-configuring-and-managing-high-availability-clusters

(ignore that the doc is for RHEL; other distributions with booth will work the same way)

Regards,
Honza

> Thanks,
>
> Antony.
Re: [ClusterLabs] Sub-clusters / super-clusters?
On Tue, Aug 3, 2021 at 11:40 AM Antony Stone wrote:
>
> To implement the above "one resource which can run anywhere, but only a single
> instance", I joined together clusters A and B, and placed the corresponding
> location constraints on the resources I want only at A and the ones I want
> only at B. I then added the resource with no location constraint, and it runs
> anywhere, just once.
>
> So far, so good.
>
> The problem is:
>
> With the two independent clusters, if two machines in city A fail, then
> Cluster A fails completely (no quorum), and Cluster B continues working. That
> means I still get phone calls.
>
> With the new setup, if two machines in city A fail, then _both_ clusters stop
> working and I have no functional resources anywhere.
>

You need to provide more details. All resources running on remaining nodes should continue to run.
Re: [ClusterLabs] Sub-clusters / super-clusters?
On Tue, Aug 3, 2021 at 10:41 AM Antony Stone wrote: > On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote: > > > Here is the example I had promised: > > > > pcs node attribute server1 city=LA > > pcs node attribute server2 city=NY > > > > # Don't run on any node that is not in LA > > pcs constraint location DummyRes1 rule score=-INFINITY city ne LA > > > > #Don't run on any node that is not in NY > > pcs constraint location DummyRes2 rule score=-INFINITY city ne NY > > > > The idea is that if you add a node and you forget to specify the > attribute > > with the name 'city' , DummyRes1 & DummyRes2 won't be started on it. > > > > For resources that do not have a constraint based on the city -> they > will > > run everywhere unless you specify a colocation constraint between the > > resources. > > Excellent - thanks. I happen to use crmsh rather than pcs, but I've > adapted > the above and got it working. > > Unfortunately, there is a problem. > > My current setup is: > > One 3-machine cluster in city A running a bunch of resources between them, > the > most important of which for this discussion is Asterisk telephony. > > One 3-machine cluster in city B doing exactly the same thing. > > The two clusters have no knowledge of each other. > > I have high-availability routing between my clusters and my upstream > telephony > provider, such that a call can be handled by Cluster A or Cluster B, and > if > one is unavailable, the call gets routed to the other. > > Thus, a total failure of Cluster A means I still get phone calls, via > Cluster > B. > > > To implement the above "one resource which can run anywhere, but only a > single > instance", I joined together clusters A and B, and placed the > corresponding > location constraints on the resources I want only at A and the ones I want > only at B. I then added the resource with no location constraint, and it > runs > anywhere, just once. > > So far, so good. 
>
> The problem is:
>
> With the two independent clusters, if two machines in city A fail, then
> Cluster A fails completely (no quorum), and Cluster B continues working.
> That means I still get phone calls.
>
> With the new setup, if two machines in city A fail, then _both_ clusters
> stop working and I have no functional resources anywhere.
>

Why that? If you are talking about quorum, a 4-node partition in a 6-node cluster should be quorate. Not saying the config is ideal though. Even node number ...
And when city A doesn't see city B you end up with two 3-node partitions that aren't quorate without additional measures.

Did you consider booth? Might really be a better match for your problem.

Klaus

> So, my question now is:
>
> How can I have a 3-machine Cluster A running local resources, and a
> 3-machine Cluster B running local resources, plus one resource running on
> either Cluster A or Cluster B, but without a failure of one cluster causing
> _everything_ to stop?
>
> Thanks,
>
> Antony.
>
> --
> One tequila, two tequila, three tequila, floor.
>
> Please reply to the list; please *don't* CC me.
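The partition cases mentioned in this reply can be checked with a short Python sketch. It assumes one vote per node and a simple majority rule, and it ignores any additional quorum options; the helper names are hypothetical and exist only for this illustration.

```python
def votes_needed(total_votes):
    """Strict majority of the configured votes, one vote per node."""
    return total_votes // 2 + 1


def quorate_partitions(partition_sizes, total_votes):
    """For each partition size, report whether that partition has quorum."""
    needed = votes_needed(total_votes)
    return [size >= needed for size in partition_sizes]


# 6-node cluster, two nodes in city A have failed: the four survivors
# still see each other and form a single partition.
print(quorate_partitions([4], 6))

# 6-node cluster, city A and city B cannot see each other.
print(quorate_partitions([3, 3], 6))

# 7-node cluster with a tie-breaker node, city A isolated from the rest.
print(quorate_partitions([3, 4], 7))
```

The even split is the case where neither side can continue, which is why an odd number of votes, or an arbitrator as used by booth, is usually suggested for two sites.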
Re: [ClusterLabs] Sub-clusters / super-clusters?
Hi You probably want to look at booth and tickets for a geo-clustering solution. On August 3, 2021 11:40:54 AM Antony Stone wrote: On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote: Here is the example I had promised: pcs node attribute server1 city=LA pcs node attribute server2 city=NY # Don't run on any node that is not in LA pcs constraint location DummyRes1 rule score=-INFINITY city ne LA #Don't run on any node that is not in NY pcs constraint location DummyRes2 rule score=-INFINITY city ne NY The idea is that if you add a node and you forget to specify the attribute with the name 'city' , DummyRes1 & DummyRes2 won't be started on it. For resources that do not have a constraint based on the city -> they will run everywhere unless you specify a colocation constraint between the resources. Excellent - thanks. I happen to use crmsh rather than pcs, but I've adapted the above and got it working. Unfortunately, there is a problem. My current setup is: One 3-machine cluster in city A running a bunch of resources between them, the most important of which for this discussion is Asterisk telephony. One 3-machine cluster in city B doing exactly the same thing. The two clusters have no knowledge of each other. I have high-availability routing between my clusters and my upstream telephony provider, such that a call can be handled by Cluster A or Cluster B, and if one is unavailable, the call gets routed to the other. Thus, a total failure of Cluster A means I still get phone calls, via Cluster B. To implement the above "one resource which can run anywhere, but only a single instance", I joined together clusters A and B, and placed the corresponding location constraints on the resources I want only at A and the ones I want only at B. I then added the resource with no location constraint, and it runs anywhere, just once. So far, so good. 
The problem is: With the two independent clusters, if two machines in city A fail, then Cluster A fails completely (no quorum), and Cluster B continues working. That means I still get phone calls. With the new setup, if two machines in city A fail, then _both_ clusters stop working and I have no functional resources anywhere. So, my question now is: How can I have a 3-machine Cluster A running local resources, and a 3-machine Cluster B running local resources, plus one resource running on either Cluster A or Cluster B, but without a failure of one cluster causing _everything_ to stop? Thanks, Antony. -- One tequila, two tequila, three tequila, floor. Please reply to the list; please *don't* CC me. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Sub-clusters / super-clusters?
On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:

> Here is the example I had promised:
>
> pcs node attribute server1 city=LA
> pcs node attribute server2 city=NY
>
> # Don't run on any node that is not in LA
> pcs constraint location DummyRes1 rule score=-INFINITY city ne LA
>
> # Don't run on any node that is not in NY
> pcs constraint location DummyRes2 rule score=-INFINITY city ne NY
>
> The idea is that if you add a node and you forget to specify the attribute
> with the name 'city', DummyRes1 & DummyRes2 won't be started on it.
>
> For resources that do not have a constraint based on the city -> they will
> run everywhere unless you specify a colocation constraint between the
> resources.

Excellent - thanks. I happen to use crmsh rather than pcs, but I've adapted the above and got it working.

Unfortunately, there is a problem.

My current setup is:

One 3-machine cluster in city A running a bunch of resources between them, the most important of which for this discussion is Asterisk telephony.

One 3-machine cluster in city B doing exactly the same thing.

The two clusters have no knowledge of each other.

I have high-availability routing between my clusters and my upstream telephony provider, such that a call can be handled by Cluster A or Cluster B, and if one is unavailable, the call gets routed to the other.

Thus, a total failure of Cluster A means I still get phone calls, via Cluster B.

To implement the above "one resource which can run anywhere, but only a single instance", I joined together clusters A and B, and placed the corresponding location constraints on the resources I want only at A and the ones I want only at B. I then added the resource with no location constraint, and it runs anywhere, just once.

So far, so good.

The problem is:

With the two independent clusters, if two machines in city A fail, then Cluster A fails completely (no quorum), and Cluster B continues working. That means I still get phone calls.

With the new setup, if two machines in city A fail, then _both_ clusters stop working and I have no functional resources anywhere.

So, my question now is:

How can I have a 3-machine Cluster A running local resources, and a 3-machine Cluster B running local resources, plus one resource running on either Cluster A or Cluster B, but without a failure of one cluster causing _everything_ to stop?

Thanks,

Antony.

--
One tequila, two tequila, three tequila, floor.

Please reply to the list; please *don't* CC me.
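As a rough check on the quorum behaviour described in this message, here is a minimal Python sketch of simple-majority voting. It assumes one vote per node and no special votequorum options, because the thread does not show the corosync configuration; the function names are made up for illustration.

```python
def quorum_threshold(total_votes: int) -> int:
    """Votes needed for a simple majority: floor(n / 2) + 1."""
    return total_votes // 2 + 1


def is_quorate(total_votes: int, failed: int) -> bool:
    """True if the surviving nodes still hold a majority of all votes."""
    return (total_votes - failed) >= quorum_threshold(total_votes)


# Independent 3-node cluster in city A, two machines down.
print(is_quorate(3, 2))      # False

# Merged 6-node cluster (assumption: one vote per node).
print(quorum_threshold(6))   # 4
print(is_quorate(6, 2))      # True
print(is_quorate(6, 3))      # False: a whole site lost
```

Under these assumptions a 3-node cluster that loses two nodes is inquorate, which matches the independent-cluster case. A merged 6-node cluster needs 4 votes, so by this arithmetic alone it would stay quorate after two failures and lose quorum only when three nodes (for example a whole site) are gone. The total outage reported for two failures may therefore depend on configuration details not shown in the thread.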
Re: [ClusterLabs] Sub-clusters / super-clusters?
Here is the example I had promised:

pcs node attribute server1 city=LA
pcs node attribute server2 city=NY

# Don't run on any node that is not in LA
pcs constraint location DummyRes1 rule score=-INFINITY city ne LA

# Don't run on any node that is not in NY
pcs constraint location DummyRes2 rule score=-INFINITY city ne NY

The idea is that if you add a node and you forget to specify the attribute with the name 'city', DummyRes1 & DummyRes2 won't be started on it.

For resources that do not have a constraint based on the city -> they will run everywhere unless you specify a colocation constraint between the resources.

Best Regards,
Strahil Nikolov

On Mon, May 10, 2021 at 17:53, Antony Stone wrote:

> On Monday 10 May 2021 at 16:49:07, Strahil Nikolov wrote:
>
> > You can use node attributes to define in which city is each host and then
> > use a location constraint to control in which city to run/not run the
> > resources. I will try to provide an example tomorrow.
>
> Thank you - that would be helpful.
>
> I did think that a location constraint could be a way to do this, but I wasn't sure how to label three machines in one cluster as a "single location".
>
> Any pointers most welcome :)
>
> > On Mon, May 10, 2021 at 15:52, Antony Stone wrote:
> > > On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:
> > > On 5/10/21 2:32 PM, Antony Stone wrote:
> > > > Hi.
> > > >
> > > > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the
> > > > following way:
> > > >
> > > > I have two separate clusters of three machines each, one in a data
> > > > centre in city A, and one in a data centre in city B.
> > > >
> > > > Several of the resources being managed by these clusters are based on
> > > > floating IP addresses, which are tied to the data centre, therefore the
> > > > resources in city A can run on any of the three machines there (alfa,
> > > > bravo and charlie), but cannot run on any machine in city B (delta,
> > > > echo and foxtrot).
> > > >
> > > > I now have a need to create a couple of additional resources which can
> > > > operate from anywhere, so I'm wondering if there is a way to configure
> > > > corosync / pacemaker so that:
> > > >
> > > > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > > > and Z between them.
> > > >
> > > > Machines delta, echo and foxtrot live in city B and manage resources U,
> > > > V and W between them.
> > > >
> > > > All of alpha to foxtrot are also in a "super-cluster" managing
> > > > resources P and Q, so these two can be running on any of the 6
> > > > machines.
> > > >
> > > >
> > > > I hope the question is clear. Is there an answer :) ?
> > >
> > > Sounds like a use-case for https://github.com/ClusterLabs/booth
> >
> > Interesting - hadn't come across that feature before.
> >
> > Thanks - I'll look into further documentation.
> >
> > If anyone else has any other suggestions I'm happy to see whether something
> > else might work better for my setup.
> >
> >
> > Antony.
>
> --
> 90% of networking problems are routing problems.
> 9 of the remaining 10% are routing problems in the other direction.
> The remaining 1% might be something else, but check the routing anyway.
>
> Please reply to the list; please *don't* CC me.
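The effect of the two location rules in the example at the top of this message can be modelled in a few lines of Python. This is only an illustrative sketch, not Pacemaker's actual scoring code: `rule_score`, `allowed_nodes` and the attribute-less node `server3` are hypothetical, and treating an unset attribute as "not equal" is an assumption taken from the behaviour the message describes.

```python
NEG_INF = float("-inf")

# Node attributes as set by "pcs node attribute"; server3 is a hypothetical
# node that was added without the 'city' attribute.
NODES = {
    "server1": {"city": "LA"},
    "server2": {"city": "NY"},
    "server3": {},
}


def rule_score(attrs, name, value):
    """Score from a rule of the form 'score=-INFINITY <name> ne <value>'."""
    # Assumption: a missing attribute compares as "not equal".
    return NEG_INF if attrs.get(name) != value else 0.0


def allowed_nodes(name, value):
    """Nodes on which a resource carrying that rule may still run."""
    return [node for node, attrs in NODES.items()
            if rule_score(attrs, name, value) != NEG_INF]


print(allowed_nodes("city", "LA"))  # DummyRes1 -> ['server1']
print(allowed_nodes("city", "NY"))  # DummyRes2 -> ['server2']
```

In this model the node without a `city` attribute is excluded by both rules, which is the "forgot to set the attribute" case mentioned in the message.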
Re: [ClusterLabs] Sub-clusters / super-clusters?
You can use node attributes to define in which city is each host and then use a location constraint to control in which city to run/not run the resources. I will try to provide an example tomorrow.

Best Regards,
Strahil Nikolov

On Mon, May 10, 2021 at 15:52, Antony Stone wrote:

> On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:
> > On 5/10/21 2:32 PM, Antony Stone wrote:
> > > Hi.
> > >
> > > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following
> > > way:
> > >
> > > I have two separate clusters of three machines each, one in a data centre
> > > in city A, and one in a data centre in city B.
> > >
> > > Several of the resources being managed by these clusters are based on
> > > floating IP addresses, which are tied to the data centre, therefore the
> > > resources in city A can run on any of the three machines there (alfa,
> > > bravo and charlie), but cannot run on any machine in city B (delta, echo
> > > and foxtrot).
> > >
> > > I now have a need to create a couple of additional resources which can
> > > operate from anywhere, so I'm wondering if there is a way to configure
> > > corosync / pacemaker so that:
> > >
> > > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > > and Z between them.
> > >
> > > Machines delta, echo and foxtrot live in city B and manage resources U, V
> > > and W between them.
> > >
> > > All of alpha to foxtrot are also in a "super-cluster" managing resources
> > > P and Q, so these two can be running on any of the 6 machines.
> > >
> > >
> > > I hope the question is clear. Is there an answer :) ?
> >
> > Sounds like a use-case for https://github.com/ClusterLabs/booth
>
> Interesting - hadn't come across that feature before.
>
> Thanks - I'll look into further documentation.
>
> If anyone else has any other suggestions I'm happy to see whether something else might work better for my setup.
>
>
> Antony.
>
> --
> What do you get when you cross a joke with a rhetorical question?
>
> Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Sub-clusters / super-clusters?
On Monday 10 May 2021 at 16:49:07, Strahil Nikolov wrote:

> You can use node attributes to define in which city is each host and then
> use a location constraint to control in which city to run/not run the
> resources. I will try to provide an example tomorrow.

Thank you - that would be helpful.

I did think that a location constraint could be a way to do this, but I wasn't sure how to label three machines in one cluster as a "single location".

Any pointers most welcome :)

> On Mon, May 10, 2021 at 15:52, Antony Stone wrote:
> > On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:
> > On 5/10/21 2:32 PM, Antony Stone wrote:
> > > Hi.
> > >
> > > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the
> > > following way:
> > >
> > > I have two separate clusters of three machines each, one in a data
> > > centre in city A, and one in a data centre in city B.
> > >
> > > Several of the resources being managed by these clusters are based on
> > > floating IP addresses, which are tied to the data centre, therefore the
> > > resources in city A can run on any of the three machines there (alfa,
> > > bravo and charlie), but cannot run on any machine in city B (delta,
> > > echo and foxtrot).
> > >
> > > I now have a need to create a couple of additional resources which can
> > > operate from anywhere, so I'm wondering if there is a way to configure
> > > corosync / pacemaker so that:
> > >
> > > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > > and Z between them.
> > >
> > > Machines delta, echo and foxtrot live in city B and manage resources U,
> > > V and W between them.
> > >
> > > All of alpha to foxtrot are also in a "super-cluster" managing
> > > resources P and Q, so these two can be running on any of the 6
> > > machines.
> > >
> > >
> > > I hope the question is clear. Is there an answer :) ?
> >
> > Sounds like a use-case for https://github.com/ClusterLabs/booth
>
> Interesting - hadn't come across that feature before.
>
> Thanks - I'll look into further documentation.
>
> If anyone else has any other suggestions I'm happy to see whether something
> else might work better for my setup.
>
>
> Antony.

--
90% of networking problems are routing problems.
9 of the remaining 10% are routing problems in the other direction.
The remaining 1% might be something else, but check the routing anyway.

Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Sub-clusters / super-clusters?
On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote:

> On 5/10/21 2:32 PM, Antony Stone wrote:
> > Hi.
> >
> > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following
> > way:
> >
> > I have two separate clusters of three machines each, one in a data centre
> > in city A, and one in a data centre in city B.
> >
> > Several of the resources being managed by these clusters are based on
> > floating IP addresses, which are tied to the data centre, therefore the
> > resources in city A can run on any of the three machines there (alfa,
> > bravo and charlie), but cannot run on any machine in city B (delta, echo
> > and foxtrot).
> >
> > I now have a need to create a couple of additional resources which can
> > operate from anywhere, so I'm wondering if there is a way to configure
> > corosync / pacemaker so that:
> >
> > Machines alfa, bravo, charlie live in city A and manage resources X, Y
> > and Z between them.
> >
> > Machines delta, echo and foxtrot live in city B and manage resources U, V
> > and W between them.
> >
> > All of alpha to foxtrot are also in a "super-cluster" managing resources
> > P and Q, so these two can be running on any of the 6 machines.
> >
> >
> > I hope the question is clear. Is there an answer :) ?
>
> Sounds like a use-case for https://github.com/ClusterLabs/booth

Interesting - hadn't come across that feature before.

Thanks - I'll look into further documentation.

If anyone else has any other suggestions I'm happy to see whether something else might work better for my setup.


Antony.

--
What do you get when you cross a joke with a rhetorical question?

Please reply to the list; please *don't* CC me.
Re: [ClusterLabs] Sub-clusters / super-clusters?
On 5/10/21 2:32 PM, Antony Stone wrote:

> Hi.
>
> I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following way:
>
> I have two separate clusters of three machines each, one in a data centre in city A, and one in a data centre in city B.
>
> Several of the resources being managed by these clusters are based on floating IP addresses, which are tied to the data centre, therefore the resources in city A can run on any of the three machines there (alfa, bravo and charlie), but cannot run on any machine in city B (delta, echo and foxtrot).
>
> I now have a need to create a couple of additional resources which can operate from anywhere, so I'm wondering if there is a way to configure corosync / pacemaker so that:
>
> Machines alfa, bravo, charlie live in city A and manage resources X, Y and Z between them.
>
> Machines delta, echo and foxtrot live in city B and manage resources U, V and W between them.
>
> All of alpha to foxtrot are also in a "super-cluster" managing resources P and Q, so these two can be running on any of the 6 machines.
>
> I hope the question is clear. Is there an answer :) ?

Sounds like a use-case for https://github.com/ClusterLabs/booth

Klaus

> Thanks,
>
> Antony.
[ClusterLabs] Sub-clusters / super-clusters?
Hi.

I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following way:

I have two separate clusters of three machines each, one in a data centre in city A, and one in a data centre in city B.

Several of the resources being managed by these clusters are based on floating IP addresses, which are tied to the data centre, therefore the resources in city A can run on any of the three machines there (alfa, bravo and charlie), but cannot run on any machine in city B (delta, echo and foxtrot).

I now have a need to create a couple of additional resources which can operate from anywhere, so I'm wondering if there is a way to configure corosync / pacemaker so that:

Machines alfa, bravo, charlie live in city A and manage resources X, Y and Z between them.

Machines delta, echo and foxtrot live in city B and manage resources U, V and W between them.

All of alpha to foxtrot are also in a "super-cluster" managing resources P and Q, so these two can be running on any of the 6 machines.

I hope the question is clear. Is there an answer :) ?

Thanks,

Antony.

--
Ramdisk is not an installation procedure.

Please reply to the list; please *don't* CC me.