Re: [ClusterLabs] ubsubscribe

2024-02-12 Thread Antony Stone
On Monday 12 February 2024 at 16:42:06, Bob Marčan via Users wrote: > It should be in the body, not in the subject. According to the headers, it should be in the subject, but not sent to the list address: List-Id: Cluster Labs - All topics related to open-source clustering welcomed

Re: [ClusterLabs] Limit the number of resources starting/stoping in parallel possible?

2023-09-18 Thread Antony Stone
On Monday 18 September 2023 at 16:24:02, Knauf Steffen wrote: > Hi, > > we have multiple Cluster (2 node + quorum setup) with more then 100 > Resources ( 10 x VIP + 90 Microservices) per Node. If the Resources are > stopped/started at the same time the Server is under heavy load, which may >

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-08 Thread Antony Stone
On Thursday 07 September 2023 at 22:06:25, Damiano Giuliani wrote: > Everything seems quite clear to me. > > But, having single VIP makes a multimaster replica quite useless. Why? > im thinking about using pacemaker to create a cloned VIP binded to a cloned > HA proxy which is health-checking

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-07 Thread Antony Stone
On Wednesday 06 September 2023 at 17:01:24, Damiano Giuliani wrote: > Everything is clear now. > So the point is to use pacemaker and create the floating vip and bind it to > sqlproxy to health check and route the traffic to the available and healthy > galera nodes. Good summary. > It could be

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-06 Thread Antony Stone
On Wednesday 06 September 2023 at 13:58:51, Damiano Giuliani wrote: > What I miss is how my application can support the connection on a multi > master where 3 ips are available simultaneously. > > JDBCmysql driver or similar support a list name/ip of clustered nodes? > Galera provide a unique

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-06 Thread Antony Stone
On Wednesday 06 September 2023 at 12:50:40, Damiano Giuliani wrote: > Looking at some Galera cluster designs on web seems a couple of server > proxy are placed in front. You can certainly do it that way, although some people simply have a floating virtual IP across the 3 nodes, and clients

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-06 Thread Antony Stone
On Wednesday 06 September 2023 at 12:10:23, Damiano Giuliani wrote: > Thanks for helping me. > > I'm going to know more about Galera. > What I don't like is seems I need many nodes, at least 3 for the cluster > and then at least 2 other nodes for proxy. You didn't mention anything about wanting

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-06 Thread Antony Stone
On Wednesday 06 September 2023 at 11:23:54, Damiano Giuliani wrote: > Thanks for helping. > > Because I still don't know which version will be provided, probably MySQL > enterprise or community. I believe both support Galera replication. > I was wondering about pacemaker because I know quite

Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-06 Thread Antony Stone
On Tuesday 05 September 2023 at 22:20:36, Damiano Giuliani wrote: > Hi guys, I'm about to figure out how setup a pacemaker cluster for MySQL > replication. Why do you need pacemaker? Why not just set up several machines and configure Galera to handle DB replication between them? If you

[ClusterLabs] Load balancing, of a sort

2023-01-25 Thread Antony Stone
Hi. I have a corosync / pacemaker 3-node cluster with a resource group which can run on any node in the cluster. Every night a cron job on the node which is running the resources performs "crm_standby -v on" followed a short while later by "crm_standby -v off" in order to force the resources

Re: [ClusterLabs] Stonith external/ssh "device"?

2022-12-21 Thread Antony Stone
On Wednesday 21 December 2022 at 17:19:34, Antony Stone wrote: > > pacemaker-fenced[3262]: notice: Operation reboot of nodeB by > > for pacemaker-controld.26852@nodeA.93b391b2: No such device > pacemaker-controld[3264]: notice: Peer nodeB was not terminated (reboot)

Re: [ClusterLabs] Stonith external/ssh "device"?

2022-12-21 Thread Antony Stone
On Wednesday 21 December 2022 at 16:59:16, Antony Stone wrote: > Hi. > > I'm implementing fencing on a 7-node cluster as described recently: > https://lists.clusterlabs.org/pipermail/users/2022-December/030714.html > > I'm using external/ssh for the time being, and it works

[ClusterLabs] Stonith external/ssh "device"?

2022-12-21 Thread Antony Stone
Hi. I'm implementing fencing on a 7-node cluster as described recently: https://lists.clusterlabs.org/pipermail/users/2022-December/030714.html I'm using external/ssh for the time being, and it works if I test it using: stonith -t external/ssh -p "nodeA nodeB nodeC" -T reset nodeB However,

Re: [ClusterLabs] Stonith

2022-12-19 Thread Antony Stone
On Monday 19 December 2022 at 13:55:45, Andrei Borzenkov wrote: > On Mon, Dec 19, 2022 at 3:44 PM Antony Stone > > wrote: > > So, do I simply create one stonith resource for each server, and rely on > > some other random server to invoke it when needed? > > Y

[ClusterLabs] Stonith

2022-12-19 Thread Antony Stone
Hi. I have a 7-node corosync / pacemaker cluster which is working nicely as a proof-of-concept. Three machines are in data centre 1, three are in data centre 2, and one machine is in data centre 3. I'm using location constraints to run one set of resources on any of the machines in DC1,

Re: [ClusterLabs] QDevice not found after reboot but appears after cluster restart

2022-07-28 Thread Antony Stone
On Thursday 28 July 2022 at 22:17:01, john tillman wrote: > I have a two cluster setup with a qdevice. 'pcs quorum status' from a > cluster node shows the qdevice casting a vote. On the qdevice node > 'corosync-qnetd-tool -s' says I have 2 connected clients and 1 cluster. > The vote count looks

Re: [ClusterLabs] Question regarding the security of corosync

2022-06-21 Thread Antony Stone
On Friday 17 June 2022 at 11:39:14, Mario Freytag wrote: > I’d like to ask about the security of corosync. We’re using a Proxmox HA > setup in our testing environment and need to confirm it’s compliance with > PCI guidelines. > > We have a few questions: > > Is the communication encrypted? >

Re: [ClusterLabs] ethernet link up/down - ?

2022-02-07 Thread Antony Stone
On Monday 07 February 2022 at 20:09:02, lejeczek via Users wrote: > Hi guys > > How do you guys go about doing link up/down as a resource? I apply or remove addresses on the interface, using "IPaddr2" and "IPv6addr", which I know is not the same thing. Why do you separately want to control

Re: [ClusterLabs] no systemd - ?

2021-12-18 Thread Antony Stone
On Saturday 18 December 2021 at 16:46:55, lejeczek via Users wrote: > hi guys > > I've always been RHE/Fedora user and memories of times before 'systemd' > almost completely vacated my brain - nowadays, is it possible to have HA > without systemd would you know and if so, then how would that

Re: [ClusterLabs] 8 node cluster

2021-09-07 Thread Antony Stone
On Tuesday 07 September 2021 at 19:37:33, M N S H SNGHL wrote: > I am looking for some suggestions here. I have created an 8 node HA cluster > on my SuSE hosts. An even number of nodes is never a good idea. > 1) The resources should work fine even if 7 nodes go down, which means > surviving

Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Antony Stone
On Friday 06 August 2021 at 15:12:57, Andrei Borzenkov wrote: > On Fri, Aug 6, 2021 at 3:42 PM Antony Stone wrote: > > On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote: > > > > > > If connectivity between (any two) sites is lost you may end up with

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Antony Stone
On Friday 06 August 2021 at 14:47:03, Ulrich Windl wrote: > Antony Stone schrieb am 06.08.2021 um 14:41 > > > location pref_A GroupA rule -inf: site ne cityA > > location pref_B GroupB rule -inf: site ne cityB > > I'm wondering whether the first is equivalent to

Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread Antony Stone
On Friday 06 August 2021 at 14:14:09, Andrei Borzenkov wrote: > On Thu, Aug 5, 2021 at 3:44 PM Antony Stone wrote: > > > > For anyone interested in the detail of how to do this (without needing > > booth), here is my cluster.conf file, as in "crm configure loa

Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 15:44:18, Ulrich Windl wrote: > Hi! > > Nice to hear. What could be "interesting" is how stable the WAN-type of > corosync communication works. Well, between cityA and cityB it should be pretty good, because these are two data centres on opposite sides of England

Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote: > On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote: > > > > Have you ever tried to find out why this happens? (Talking about logs) > > Not in detail, no, but just in case there's a chance of g

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 07:43:30, Andrei Borzenkov wrote: > On 05.08.2021 00:01, Antony Stone wrote: > > > > Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters. > > > > Requirement 4 is the one I'm stuck with how to implement. >

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-05 Thread Antony Stone
On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote: > Antony Stone schrieb am 04.08.2021 um 21:27: > > > > As soon as I connect the clusters at city A and city B, and apply the > > location constraints and weighting rules you have suggested: > > >

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Antony Stone
On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote: > There is no safe way to do what you are trying to do. > > If the resource is on cluster A and contact is lost between clusters A > and B due to a network failure, how does cluster B know if the resource > is still running on

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Antony Stone
On Wednesday 04 August 2021 at 20:57:49, Strahil Nikolov wrote: > That's why you need a qdisk at a 3-rd location, so you will have 7 votes in > total. When 3 nodes in cityA die, all resources will be started on the > remaining 3 nodes. I think I have not explained this properly. I have three
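The vote counting behind the "7 votes in total" suggestion can be sketched in a few lines of Python. This is only an illustration of simple majority quorum, not anything taken from corosync itself; the function names `quorum_threshold` and `is_quorate` are made up for the example, and it assumes every node and the third-site tie-breaker carry one vote each.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def is_quorate(votes_present: int, total_votes: int) -> bool:
    """True if the surviving partition holds a strict majority of all votes."""
    return votes_present >= quorum_threshold(total_votes)


# Six nodes and no tie-breaker: losing one whole site leaves 3 of 6 votes.
print(is_quorate(3, 6))  # False

# Six nodes plus one vote at a third site: 3 surviving nodes + 1 = 4 of 7.
print(is_quorate(4, 7))  # True
```

With an even number of votes, a clean split down the middle leaves neither half with a majority, which is why the extra vote at a third location matters.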

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Antony Stone
On Wednesday 04 August 2021 at 16:07:39, Andrei Borzenkov wrote: > On Wed, Aug 4, 2021 at 5:03 PM Antony Stone wrote: > > On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote: > > > On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote: > > > > On Tues

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Antony Stone
On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote: > On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote: > > On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users wrote: > > > Won't something like this work ? Each node in LA will have same score > >

Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-04 Thread Antony Stone
On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users wrote: > Won't something like this work ? Each node in LA will have same score of > 5000, while other cities will be -5000. > > pcs constraint location DummyRes1 rule score=5000 city eq LA > pcs constraint location DummyRes1 rule

Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-03 Thread Antony Stone
On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote: > Here is the example I had promised: > > pcs node attribute server1 city=LA > pcs node attribute server2 city=NY > > # Don't run on any node that is not in LA > pcs constraint location DummyRes1 rule score=-INFINITY city ne LA > >
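As a rough mental model of the rule quoted above (score=-INFINITY where city ne LA), here is a small Python sketch. It is not how pacemaker is implemented; the helper names `rule_score` and `allowed_nodes` are hypothetical, and it assumes every node has the `city` attribute set, as in the two `pcs node attribute` commands in the example.

```python
NEG_INFINITY = float("-inf")

# Node attributes as set by the two "pcs node attribute" commands.
node_attrs = {
    "server1": {"city": "LA"},
    "server2": {"city": "NY"},
}


def rule_score(attrs: dict, attribute: str, value: str) -> float:
    """Model of 'rule score=-INFINITY <attribute> ne <value>' for one node."""
    return NEG_INFINITY if attrs[attribute] != value else 0.0


def allowed_nodes(nodes: dict, attribute: str, value: str) -> list:
    """Nodes where the rule does not ban the resource."""
    return [
        name
        for name, attrs in nodes.items()
        if rule_score(attrs, attribute, value) != NEG_INFINITY
    ]


print(allowed_nodes(node_attrs, "city", "LA"))  # ['server1']
```

Labelling three machines with the same attribute value is what makes them behave as a "single location" for the purposes of such a rule.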

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread Antony Stone
On Monday 07 June 2021 at 21:49:45, Eric Robinson wrote: > > -Original Message- > > From: kgail...@redhat.com > > Sent: Monday, June 7, 2021 2:39 PM > > To: Strahil Nikolov ; Cluster Labs - All topics > > related to open-source clustering welcomed ; Eric > > Robinson > > Subject: Re:

Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Antony Stone
That would be helpful. I did think that a location constraint could be a way to do this, but I wasn't sure how to label three machines in one cluster as a "single location". Any pointers most welcome :) > On Mon, May 10, 2021 at 15:52, Antony Stone wrote: > > On Monday 10 May 2021

Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Antony Stone
On Monday 10 May 2021 at 14:41:37, Klaus Wenninger wrote: > On 5/10/21 2:32 PM, Antony Stone wrote: > > Hi. > > > > I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following > > way: > > > > I have two separate clusters of three machines

[ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Antony Stone
Hi. I'm using corosync 3.0.1 and pacemaker 2.0.1, currently in the following way: I have two separate clusters of three machines each, one in a data centre in city A, and one in a data centre in city B. Several of the resources being managed by these clusters are based on floating IP

Re: [ClusterLabs] Question about ping nodes

2021-04-17 Thread Antony Stone
On Saturday 17 April 2021 at 21:41:16, Piotr Kandziora wrote: > Hi, > > Hope some guru will advise here ;) > > I've got two nodes cluster with some resource placement dependent on ping > node visibility ( > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/ht >

Re: [ClusterLabs] Single-node automated startup question

2021-04-14 Thread Antony Stone
On Wednesday 14 April 2021 at 19:33:39, Strahil Nikolov wrote: > What about a small form factor device to serve as a quorum maker ? > Best Regards,Strahil Nikolov If you're going to take that approach, why not a virtual machine or two, hosted inside the physical machine which is your single

Re: [ClusterLabs] Antw: [EXT] Re: Why my node1 couldn't back to the clustering chain?

2021-04-09 Thread Antony Stone
On Friday 09 April 2021 at 11:06:14, Ulrich Windl wrote: > # lscpu > CPU(s): 144 > # free -h > Mem: 754Gi Nice :) No doubt Jason would like to connect 8 of these together in a cluster... Antony. -- Numerous psychological studies over the years have demonstrated that

Re: [ClusterLabs] Why my node1 couldn't back to the clustering chain?

2021-04-09 Thread Antony Stone
On Friday 09 April 2021 at 10:34:33, Jason Long wrote: > Thanks. > I meant was a Cheat sheet. I don't understand that sentence. > Yes, something like rendering a 3D movie or... . The Corosync and Pacemaker > are not OK for it? What kind of clustering using for rendering? Beowulf > cluster?

Re: [ClusterLabs] Why my node1 couldn't back to the clustering chain?

2021-04-09 Thread Antony Stone
On Friday 09 April 2021 at 08:58:39, Jason Long wrote: > Thank you so much for your great answers. > As the final questions: Really :) ? > 1- Which commands are useful to monitoring and managing my pacemaker > cluster? Some people prefer https://crmsh.github.io/documentation/ and some people

Re: [ClusterLabs] Why my node1 couldn't back to the clustering chain?

2021-04-08 Thread Antony Stone
On Thursday 08 April 2021 at 21:33:48, Jason Long wrote: > Yes, I just wanted to know. In clustering, when a node is down and > go online again, then the cluster will not use it again until another node > fails. Am I right? Think of it like this: You can have as many nodes in your cluster as

Re: [ClusterLabs] Why my node1 couldn't back to the clustering chain?

2021-04-08 Thread Antony Stone
On Thursday 08 April 2021 at 21:33:48, Jason Long wrote: > Yes, I just wanted to know. In clustering, when a node is down and > go online again, then the cluster will not use it again until another node > fails. Am I right? In general, yes - unless you have specified a location constraint for

Re: [ClusterLabs] Why my node1 couldn't back to the clustering chain?

2021-04-08 Thread Antony Stone
On Thursday 08 April 2021 at 21:24:02, Jason Long wrote: > Thanks. > Thus, my cluster uses Node1 when Node2 is down? Judging from your previous emails, you have a two node cluster. What else is it going to use? Antony. -- Anything that improbable is effectively impossible. - Murray

Re: [ClusterLabs] Why my node1 couldn't back to the clustering chain?

2021-04-08 Thread Antony Stone
On Thursday 08 April 2021 at 16:55:47, Ken Gaillot wrote: > On Thu, 2021-04-08 at 14:32 +, Jason Long wrote: > > Why, when node1 is back, then web server still on node2? Why not > > switched? > > By default, there are no preferences as to where a resource should run. > The cluster is free to

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: cluster-recheck-interval and failure-timeout

2021-04-07 Thread Antony Stone
On Wednesday 07 April 2021 at 10:40:54, Ulrich Windl wrote: > >>> Ken Gaillot schrieb am 06.04.2021 um 15:58 > > On Tue, 2021-04-06 at 09:15 +0200, Ulrich Windl wrote: > >> Sorry I don't get it: If you have a timestamp for each failure- > >> timeout, what's so hard to put all the fail counts

Re: [ClusterLabs] failure-timeout not working in corosync 2.0.1

2021-03-31 Thread Antony Stone
On Wednesday 31 March 2021 at 23:09:38, Antony Stone wrote: > On Wednesday 31 March 2021 at 22:53:53, Reid Wahl wrote: > > Hi, Antony. failure-timeout should be a resource meta attribute, not an > > attribute of the monitor operation. At least I'm not aware of it being >

Re: [ClusterLabs] failure-timeout not working in corosync 2.0.1

2021-03-31 Thread Antony Stone
On Wednesday 31 March 2021 at 23:11:50, Reid Wahl wrote: > Maybe Pacemaker-1 was looser in its handling of resource meta attributes vs > operation meta attributes. Good question. Returning to my suspicion that it's more likely me that simply did something wrong, what command can I use to find

Re: [ClusterLabs] failure-timeout not working in corosync 2.0.1

2021-03-31 Thread Antony Stone
On Wednesday 31 March 2021 at 22:53:53, Reid Wahl wrote: > Hi, Antony. failure-timeout should be a resource meta attribute, not an > attribute of the monitor operation. At least I'm not aware of it being > configurable per-operation -- maybe it is. Can't check at the moment :) Okay, I'll try

[ClusterLabs] failure-timeout not working in corosync 2.0.1

2021-03-31 Thread Antony Stone
Hi. I've pared my configuration down to almost a bare minimum to demonstrate the problem I'm having. I have two questions: 1. What command can I use to find out what pacemaker thinks my cluster.cib file really means? I know what I put in it, but I want to see what pacemaker has understood

Re: [ClusterLabs] cluster-recheck-interval and failure-timeout

2021-03-31 Thread Antony Stone
On Wednesday 31 March 2021 at 16:58:30, Antony Stone wrote: > I'm only interested in the most recent failure. I'm saying that once that > failure is more than "failure-timeout" seconds old, I want the fact that > the resource failed to be forgotten, so that it can be restarted

Re: [ClusterLabs] cluster-recheck-interval and failure-timeout

2021-03-31 Thread Antony Stone
On Wednesday 31 March 2021 at 15:48:15, Ken Gaillot wrote: > On Wed, 2021-03-31 at 14:32 +0200, Antony Stone wrote: > > > > So, what am I misunderstanding about "failure-timeout", and what > > configuration setting do I need to use to tell pacemaker that "pro

Re: [ClusterLabs] cluster-recheck-interval and failure-timeout

2021-03-31 Thread Antony Stone
On Wednesday 31 March 2021 at 15:48:15, Ken Gaillot wrote: > On Wed, 2021-03-31 at 14:32 +0200, Antony Stone wrote: > > > So, what am I misunderstanding about "failure-timeout", and what > > configuration setting do I need to use to tell pacemaker that "provid

[ClusterLabs] cluster-recheck-interval and failure-timeout

2021-03-31 Thread Antony Stone
Hi. I'm trying to understand what looks to me like incorrect behaviour between cluster-recheck-interval and failure-timeout, under pacemaker 2.0.1 I have three machines in a corosync (3.0.1 if it matters) cluster, managing 12 resources in a single group. I'm following documentation from:

Re: [ClusterLabs] Antw: [EXT] Re: ocf-tester always claims failure, even with built-in resource agents?

2021-03-29 Thread Antony Stone
On Monday 29 March 2021 at 09:03:10, Ulrich Windl wrote: > >> So, that would be an extra parameter to the resource definition in > >> cluster.cib? > >> > >> Change: > >> > >> primitive Asterisk asterisk meta migration-threshold=3 op monitor > >> interval=5 timeout=30 on-fail=restart

Re: [ClusterLabs] ocf-tester always claims failure, even with built-in resource agents?

2021-03-26 Thread Antony Stone
On Friday 26 March 2021 at 18:31:51, Ken Gaillot wrote: > On Fri, 2021-03-26 at 19:59 +0300, Andrei Borzenkov wrote: > > On 26.03.2021 17:28, Antony Stone wrote: > > > > > > So far all is well and good, my cluster synchronises, starts the > > > resources,

Re: [ClusterLabs] ocf-tester always claims failure, even with built-in resource agents?

2021-03-26 Thread Antony Stone
On Friday 26 March 2021 at 17:59:07, Andrei Borzenkov wrote: > On 26.03.2021 17:28, Antony Stone wrote: > > # ocf-tester -n Asterisk /usr/lib/ocf/resource.d/heartbeat/asterisk > > Beginning tests for /usr/lib/ocf/resource.d/heartbeat/asterisk... > > /usr/sbin/ocf-teste

[ClusterLabs] ocf-tester always claims failure, even with built-in resource agents?

2021-03-26 Thread Antony Stone
Hi. I've just signed up to the list. I've been using corosync and pacemaker for several years, mostly under Debian 9, which means: corosync 2.4.2 pacemaker 1.1.16 I've recently upgraded a test cluster to Debian 10, which gives me: corosync 3.0.1 pacemaker