[ClusterLabs] Pacemaker 2.1.1-rc2 now available

2021-08-06 Thread kgaillot
Hi all, We were able to squeeze in a few more minor bugfixes, so I decided to do an rc2. This will likely be the last release candidate, with the final release in a week or two. Source code for the second release candidate for Pacemaker version 2.1.1 is now available at:

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Sub‑clusters / super‑clusters - working :)

2021-08-06 Thread kgaillot
On Fri, 2021-08-06 at 15:48 +0200, Ulrich Windl wrote: > > > > Andrei Borzenkov wrote on 06.08.2021 at > > > > 15:14 in > message > : > > On Fri, Aug 6, 2021 at 3:47 PM Ulrich Windl > > wrote: > > > > > > Antony Stone wrote on > > > > > > 06.08.2021 at > > > 14:41 in > > > message

Re: [ClusterLabs] Q: Migration before fencing?

2021-08-06 Thread kgaillot
On Fri, 2021-08-06 at 14:53 +0200, Ulrich Windl wrote: > Hi! > > I had this unexpected behavior this morning: > A VM resource failed to stop and the cluster node (hypervisor) was > fenced. > However there were VMs running that could have been live-migrated. > That wasn't even tried. > > On

Re: [ClusterLabs] Antw: Re: [EXT] Re: Two node cluster without fencing and no split brain?

2021-07-28 Thread kgaillot
On Wed, 2021-07-28 at 08:51 -0400, john tillman wrote: > > Technically you could give one vote to one node and zero to the > > other. > > If they lose contact only the server with one vote would make > > quorum. > > The downside is that if the server with 1 vote goes down the entire > >

[ClusterLabs] Pacemaker 2.1.1-rc1 now available

2021-07-27 Thread kgaillot
Hi all, Source code for the first (and possibly only) release candidate for Pacemaker version 2.1.1 is now available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.1-rc1 The main goal of this release is to fix a couple of regressions introduced in 2.1.0 in the

Re: [ClusterLabs] Antw: Re: [EXT] Re: Two node cluster without fencing and no split brain?

2021-07-26 Thread kgaillot
On Mon, 2021-07-26 at 12:21 -0400, john tillman wrote: > > On Mon, Jul 26, 2021 at 4:53 PM john tillman > > wrote: > > > > > > > > > > Maybe explain how it should work: > > > > > If the two nodes cannot reach each other, but each can reach > > > > > the ping > > > > > node, > > > > > which node

Re: [ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-26 Thread kgaillot
On Mon, 2021-07-26 at 12:25 -0400, Digimer wrote: > On 2021-07-26 9:54 a.m., kgail...@redhat.com wrote: > > On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote: > > > After a LOT of hassle, I finally got it updated, but OMG it was > > > painful. > > > > > > I degraded the cluster (unsure if needed),

Re: [ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-26 Thread kgaillot
On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote: > After a LOT of hassle, I finally got it updated, but OMG it was > painful. > > I degraded the cluster (unsure if needed), set maintenance mode, > deleted > the stonith levels, deleted the stonith devices, recreated them with > the > updated

Re: [ClusterLabs] [EXT] Re: Two node cluster without fencing and no split brain?

2021-07-22 Thread kgaillot
On Thu, 2021-07-22 at 10:48 -0400, john tillman wrote: > There was a lot of discussion on this topic which might have > overshadowed > this question so I will ask it again in case someone missed it. > > It comes from a post (see below) that we were pointed to here by > Andrei: > > Is there

Re: [ClusterLabs] Antw: [EXT] VIP monitor Timed Out

2021-07-20 Thread kgaillot
On Tue, 2021-07-20 at 09:51 +, PASERO Florent wrote: > Hi, > > Once or twice a week, we have a 'Timed out' on our VIP. > > The last : > Cluster Summary: > * Stack: corosync > * Current DC: server07 (version 2.0.5-9.el8_4.1-ba59be7122) - > partition with quorum > * Last updated: Tue

[ClusterLabs] FYI: regression in crm_attribute/crm_master/crm_failcount in 2.1.0

2021-07-16 Thread kgaillot
Hi all, Thanks to PAF users for testing and finding a regression in crm_attribute (and its crm_master and crm_failcount wrappers) in the recent 2.1.0 release. (CLBZ#5480) (1) Setting a negative attribute value doesn't work (e.g. "crm_attribute -n myattr -v -1000") (2) --get-value (a deprecated

Re: [ClusterLabs] pcs stonith update problems

2021-07-16 Thread kgaillot
On Thu, 2021-07-15 at 18:02 -0400, Digimer wrote: > Hi all, > > I've got a predicament... I want to update a stonith resource to > remove an argument. Specifically, when a resource moves nodes, I want to > change the stonith delay to favour the new host. This involves adding > the 'delay="x"'

Re: [ClusterLabs] @ maillist Admins - DMARC (yahoo)

2021-07-14 Thread kgaillot
On Wed, 2021-07-14 at 15:28 +0100, lejeczek wrote: > > On 12/07/2021 15:50, kgail...@redhat.com wrote: > > On Sat, 2021-07-10 at 12:34 +0100, lejeczek wrote: > > > Hi Admins(of this mailing list) > > > > > > Could you please fix in DMARC(s) so those of us who are on > > > Yahoo would be able to

Re: [ClusterLabs] pcs update resource command not working

2021-07-14 Thread kgaillot
On Wed, 2021-07-14 at 01:27 +, S Sathish S wrote: > Hi Tomas, > > Thanks for the response. > > As you said, specifying an empty value for an option is a syntax for > removing the option. Yes, this means there is no way to set an option > to an empty string value using pcs. > > But our

Re: [ClusterLabs] @ maillist Admins - DMARC (yahoo)

2021-07-13 Thread kgaillot
On Tue, 2021-07-13 at 14:46 +0300, Andrei Borzenkov wrote: > On Mon, Jul 12, 2021 at 5:50 PM wrote: > > > > On Sat, 2021-07-10 at 12:34 +0100, lejeczek wrote: > > > Hi Admins(of this mailing list) > > > > > > Could you please fix in DMARC(s) so those of us who are on > > > Yahoo would be able

Re: [ClusterLabs] Antw: [EXT] Re: @ maillist Admins ‑ DMARC (yahoo)

2021-07-13 Thread kgaillot
On Tue, 2021-07-13 at 10:23 +0200, Ulrich Windl wrote: > > > > wrote on 12.07.2021 at 16:50 in > > > > message > > <08471514b28d1e3f6859707f5951f07887336865.ca...@redhat.com>: > > On Sat, 2021‑07‑10 at 12:34 +0100, lejeczek wrote: > > > Hi Admins(of this mailing list) > > > > > > Could you

Re: [ClusterLabs] Q: Prevent non-live VM migration

2021-07-12 Thread kgaillot
On Mon, 2021-07-12 at 08:35 +0200, Ulrich Windl wrote: > Hi! > > We had some problem in the cluster that prevented live migration of > VMs. As a consequence the cluster migrated the VMs using stop/start. > I wonder: Is there a way to prevent stop/start migration if live- > migration failed? The

Re: [ClusterLabs] @ maillist Admins - DMARC (yahoo)

2021-07-12 Thread kgaillot
On Sat, 2021-07-10 at 12:34 +0100, lejeczek wrote: > Hi Admins(of this mailing list) > > Could you please fix in DMARC(s) so those of us who are on > Yahoo would be able to receive own emails/thread. > > many thanks, L. I suppose we should do something, since this is likely to be more of an

Re: [ClusterLabs] pcs update resource command not working

2021-07-09 Thread kgaillot
On Fri, 2021-07-09 at 05:29 +, S Sathish S wrote: > Hi Team, > > We have found the cause of this problem: as per the changelog below, the pcs > resource update command doesn’t support empty meta_attributes > anymore. > > https://github.com/ClusterLabs/pcs/blob/0.9.169/CHANGELOG.md > pcs resource

Re: [ClusterLabs] issue with awscli profile for AWS resource agents

2021-07-09 Thread kgaillot
On Thu, 2021-07-08 at 13:18 +, Aaron Kennedy wrote: > > Hello, > > I am trying to use AWS resource agents such as ‘awsvip’ and ‘awseip’ > but my awscli profile “could not be found” > > [ec2-user@ip-172-31-43-116 ~]$ sudo pcs resource debug-start --full > privip

Re: [ClusterLabs] Antw: [EXT] Pacemaker alerts log duplication.

2021-07-09 Thread kgaillot
On Thu, 2021-07-08 at 10:00 +0200, Ulrich Windl wrote: > > > > Amol Shinde wrote on 08.07.2021 at > > > > 08:58 in > > message > < > mw3pr20mb3385122ebac1ab9282c3f91ae9...@mw3pr20mb3385.namprd20.prod.outlook.com > > > > > Hello everyone!!! > > Hope you are doing well. > > I need some help

Re: [ClusterLabs] VirtualDomain restart caused fencing.

2021-06-30 Thread kgaillot
On Wed, 2021-06-30 at 08:40 -0700, Matthew Schumacher wrote: > Hello, > > I'm not sure how to fix this, but calling 'crm resource restart vm- > name' this morning caused an entire node to get fenced, kicking the > stool out from under a number of VMs. > > Looking at VirtualDomain it looks like

Re: [ClusterLabs] nova-compute_monitor_10000 on 'node-xxx ' not running

2021-06-25 Thread kgaillot
On Fri, 2021-06-25 at 14:41 +0800, luckydog xf wrote: > 1. deleted recorded failures. > crm_failcount -V -D -r nova-compute -N remote-db8-ca-3a-69-50-34 -n > monitor -I 1 > > 2. cleanup resource status > crm resource cleanup nova-compute remote-db8-ca-3a-69-50-34 force > > Problem resolved.

Re: [ClusterLabs] A systemd resource monitor is still in progress: re-scheduling

2021-06-14 Thread kgaillot
On Sun, 2021-06-13 at 22:43 +0800, Acewind wrote: > Dear guys, > I'm using pacemaker-1.1.20 to construct an openstack HA system. > After I stop the cluster, the pcs monitor operation is always in > progress for the cinder-volume & cinder-scheduler services. But the > systemd service is active and

Re: [ClusterLabs] Pacemaker 2.1.0 final release now available

2021-06-09 Thread kgaillot
I had generated the docs from a host with older versions of some of the doc tools. I regenerated them from a newer host. Some tables still have issues, but long lines are now wrapped. On Wed, 2021-06-09 at 12:40 -0500, kgail...@redhat.com wrote: > On Wed, 2021-06-09 at 12:29 +0300, Andrei

Re: [ClusterLabs] Pacemaker 2.1.0 final release now available

2021-06-09 Thread kgaillot
On Wed, 2021-06-09 at 12:29 +0300, Andrei Borzenkov wrote: > On Wed, Jun 9, 2021 at 12:24 AM wrote: > > > > Hi all, > > > > Pacemaker 2.1.0 has officially been released, with source code > > available at: > > > > > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.0 > >

[ClusterLabs] Pacemaker 2.1.0 final release now available

2021-06-08 Thread kgaillot
Hi all, Pacemaker 2.1.0 has officially been released, with source code available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.0 Highlights include OCF Resource Agent API 1.1 compatibility, noncritical resources, and new build-time options. The Pacemaker documentation

Re: [ClusterLabs] Colocating a Virtual IP address with multiple resources

2021-06-07 Thread kgaillot
On Mon, 2021-06-07 at 20:37 +, Abithan Kumarasamy wrote: > Hello Team, > > We have been recently experimenting with some resource model options > to fulfil the following scenario. We would like to collocate a > virtual IP resource with multiple db resources. When the virtual IP > fails over

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread kgaillot
On Sun, 2021-06-06 at 08:26 +, Strahil Nikolov wrote: > Based on the constraint rules you have mentioned, failure of mysql > should not cause a failover to another node. For better insight, you > have to be able to reproduce the issue and share the logs with the > community. By default,

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-07 Thread kgaillot
On Sat, 2021-06-05 at 20:33 +, Eric Robinson wrote: > > -Original Message- > > From: Users On Behalf Of > > kgail...@redhat.com > > Sent: Friday, June 4, 2021 4:49 PM > > To: Cluster Labs - All topics related to open-source clustering > > welcomed > > > > Subject: Re: [ClusterLabs]

Re: [ClusterLabs] One Failed Resource = Failover the Cluster?

2021-06-04 Thread kgaillot
On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote: > Sometimes it seems like Pacemaker fails over an entire cluster when > only one resource has failed, even though no other resources are > dependent on it. Is that expected behavior? > > For example, suppose I have the following colocation

[ClusterLabs] Pacemaker 2.1.0-rc3 now available

2021-06-01 Thread kgaillot
Hi all, Source code for the third (and likely final) release candidate for Pacemaker version 2.1.0 is now available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.0-rc3 This fixes a couple of minor regressions found in crm_resource and crm_verify in rc1, and makes a few

Re: [ClusterLabs] Antw: [EXT] no-quorum-policy=stop never executed, pacemaker stuck in election/integration, corosync running in "new membership" cycles with itself

2021-06-01 Thread kgaillot
On Tue, 2021-06-01 at 13:18 +0200, Ulrich Windl wrote: > Hi! > > I can't answer, but I doubt the usefulness of "no-quorum- > policy=stop": > If nodes lose quorum, they try to stop all resources, but "remain" > in the > cluster (will respond to network queries, if any arrive). > If one of those

Re: [ClusterLabs] Pacemaker Cluster help

2021-06-01 Thread kgaillot
On Thu, 2021-05-27 at 20:46 +0300, Andrei Borzenkov wrote: > On 27.05.2021 15:36, Nathan Mazarelo wrote: > > Is there a way to have pacemaker resource groups failover if all > > floating IP resources are unavailable? > > > > I want to have multiple floating IPs in a resource group that will >

Re: [ClusterLabs] #clusterlabs IRC channel

2021-05-26 Thread kgaillot
Without further comments, we've gone ahead with Libera.Chat as the new home of #clusterlabs. There is a new wiki page with the channel details: https://wiki.clusterlabs.org/wiki/ClusterLabs_IRC_channel so we can just point to that in documentation and such. That way, the info only needs to be

Re: [ClusterLabs] Antw: [EXT] Coming in Pacemaker 2.1.0: OCF resource agent path

2021-05-20 Thread kgaillot
On Thu, 2021-05-20 at 08:10 +0200, Ulrich Windl wrote: > > > > wrote on 19.05.2021 at 23:33 in > > > > message > > <34dfdacadc85a258661a9fcd81b39c153ad1cf02.ca...@redhat.com>: > > Hi all, > > > > We squeezed one more feature into the Pacemaker 2.1.0 release for > > rc2: > > the ability to

Re: [ClusterLabs] Coming in Pacemaker 2.1.0: OCF resource agent path

2021-05-20 Thread kgaillot
On Thu, 2021-05-20 at 07:53 +0300, Andrei Borzenkov wrote: > On 20.05.2021 00:33, kgail...@redhat.com wrote: > > Hi all, > > > > We squeezed one more feature into the Pacemaker 2.1.0 release for > > rc2: > > the ability to search multiple directories for OCF resource agents. > > > > Previously,

[ClusterLabs] Pacemaker 2.1.0-rc2 now available

2021-05-19 Thread kgaillot
Hi all, Source code for the second release candidate for Pacemaker version 2.1.0 is now available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.0-rc2 Changes since rc1 include new OCF-related build options discussed in an earlier message, compatibility with

[ClusterLabs] Coming in Pacemaker 2.1.0: OCF resource agent path

2021-05-19 Thread kgaillot
Hi all, We squeezed one more feature into the Pacemaker 2.1.0 release for rc2: the ability to search multiple directories for OCF resource agents. Previously, the OCF root (typically /usr/lib/ocf) could be specified at build time via ./configure --with-ocfdir, and all resource agents had to be

[ClusterLabs] #clusterlabs IRC channel

2021-05-19 Thread kgaillot
Hello all, The ClusterLabs community has long used a #clusterlabs IRC channel on the popular IRC server freenode.net. As you may have heard, freenode recently had a mass exodus of staff and channels after a corporate buy-out that was perceived as threatening the user community's values. Many

Re: [ClusterLabs] Problem with the cluster becoming mostly unresponsive

2021-05-14 Thread kgaillot
On Fri, 2021-05-14 at 15:04 -0400, Digimer wrote: > Hi all, > > I've run into an issue a couple of times now, and I'm not really > sure > what's causing it. I've got a RHEL 8 cluster that, after a while, > will > show one or more resources as 'FAILED'. When I try to do a cleanup, > it > marks