Re: [ClusterLabs] Pcsd port change after cluster setup
The interesting part is that after repeating the process (update the file, stop & start pcsd, and pcs host auth) everything is working fine, including the web UI. Best Regards, Strahil Nikolov On Mon, Apr 15, 2024 at 17:20, Strahil Nikolov via Users wrote: Hi All, I need your help to change the pcsd port. I set the port in /etc/sysconfig/pcsd on all nodes: PCSD_PORT=3500 Yet, the daemon is not listening on it. Best Regards, Strahil Nikolov ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
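For reference, the sequence that worked in this thread can be sketched as below. The port value comes from the thread; the node names are placeholders, and this is a hedged sketch rather than an authoritative procedure:

```shell
# Run on every cluster node: point pcsd at the new port
sed -i 's/^#\?PCSD_PORT=.*/PCSD_PORT=3500/' /etc/sysconfig/pcsd

# A full stop & start (not just a restart) is what worked in this thread
systemctl stop pcsd
systemctl start pcsd

# Re-authenticate the nodes so pcs learns the new port
# (node1/node2 are placeholder host names; you will be prompted for the password)
pcs host auth node1 addr=node1:3500 node2 addr=node2:3500 -u hacluster
```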
[ClusterLabs] Pcsd port change after cluster setup
Hi All, I need your help to change the pcsd port. I set the port in /etc/sysconfig/pcsd on all nodes: PCSD_PORT=3500 Yet, the daemon is not listening on it. Best Regards, Strahil Nikolov
Re: [ClusterLabs] Fencing doesn't work with google-cloud-cli
Hi All, I'm sorry for the previous post. Most probably it's not google-cloud-cli, as even after downgrading, fencing still doesn't work all the time. Best Regards, Strahil Nikolov On Wednesday, 27 March 2024 at 15:39:06 GMT+2, Strahil Nikolov via Users wrote: Hi All, I'm starting this thread in order to warn you that if you updated recently and the 'google-cloud-cli' rpm was deployed (it obsoletes 'google-cloud-sdk'), fencing won't work for you despite fence_gce and 'pcs stonith fence' reporting success. The VM stays in an odd status (right now I don't have a serial console) and it never comes up until you manually recycle it (using google-cloud-sdk or the web UI). I suspect that 'gcp-vpc-move-vip' also won't work. In my case it took a very long time to move and finally I gave up, moved the resource back to the previous host, and "downgraded" the 'google-cloud-cli' package. Best Regards, Strahil Nikolov
[ClusterLabs] Fencing doesn't work with google-cloud-cli
Hi All, I'm starting this thread in order to warn you that if you updated recently and the 'google-cloud-cli' rpm was deployed (it obsoletes 'google-cloud-sdk'), fencing won't work for you despite fence_gce and 'pcs stonith fence' reporting success. The VM stays in an odd status (right now I don't have a serial console) and it never comes up until you manually recycle it (using google-cloud-sdk or the web UI). I suspect that 'gcp-vpc-move-vip' also won't work. In my case it took a very long time to move and finally I gave up, moved the resource back to the previous host, and "downgraded" the 'google-cloud-cli' package. Best Regards, Strahil Nikolov
Re: [ClusterLabs] Questions about GCP VIP setup
Hi Oyvind, I found your e-mail in my spam folder. It seems 'gcloud-ra' doesn't exist and it's not needed for the fence agent or gcp-vpc-move-vip. Best Regards, Strahil Nikolov On Wed, Feb 7, 2024 at 13:26, Oyvind Albrigtsen wrote: On 07/02/24 11:15 +, Strahil Nikolov via Users wrote: >Hi All, >This is my first cluster in the cloud and I have 2 questions that I'm hoping >to get a clue about. >1. Where can I find the 'gcloud-ra' binary on an EL9 system? I have installed >resource-agents-cloud but I can't find it. You need to install gcloud from Google's repository: https://cloud.google.com/sdk/gcloud >2. Is gcp-vpc-move-vip a good approach to set up the VIP? It should be, yeah. Run "pcs resource describe gcp-vpc-move-vip" for more info. There's also gcp-vpc-move-route to move a floating IP by changing an entry in the routing table. Oyvind >Best Regards, Strahil Nikolov
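Building on Oyvind's pointer above, a minimal sketch of what creating the VIP with gcp-vpc-move-vip might look like; the resource name and alias IP below are assumptions for illustration, not values from the thread:

```shell
# Inspect the agent's parameters first (as suggested in the reply)
pcs resource describe gcp-vpc-move-vip

# Hypothetical example: a /32 alias IP that the agent moves between
# instances via the GCP API on failover
pcs resource create gcp-vip ocf:heartbeat:gcp-vpc-move-vip \
    alias_ip=10.0.0.100/32 op monitor interval=30s
```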
Re: [ClusterLabs] pcsd web interface not working on EL 9.3
Hi, I didn't see any redirect and I was puzzled. Currently, the firewall is still blocking me and curl-ing it was the only test that came to my mind. Best Regards, Strahil Nikolov On Wed, Feb 21, 2024 at 8:56, Ivan Devat wrote: Hi, the url https://fqdn:2224 redirects to https://fqdn:2224/ui/. You can use curl with --location (or -L: if the server reports that the requested page has moved to a different location (indicated with a Location: header and a 3XX response code), this option will make curl redo the request on the new place): curl --location https://fqdn:2224 Ivan On Mon, Feb 19, 2024 at 10:16 AM lejeczek via Users wrote: > On 19/02/2024 09:06, Strahil Nikolov via Users wrote: > > Hi All, > > Is there a specific setup I missed in order to set up the > > web interface? > > Usually, you just log in with the hacluster user on > > https://fqdn:2224 but when I do a curl, I get an empty > > response. > > Best Regards, > > Strahil Nikolov > Was it giving out some stuff before? I've never done curl on > it, so I won't guess why you'd do that. > Yes, it is _empty_ for me too - though I use it with a reverse > proxy. > While the _main_ URL curl returns no content, this one does: > -> $ curl https://pcs-ha.mine.priv/ui/
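To see the redirect Ivan describes, the checks below may help; -k skips certificate verification because pcsd ships a self-signed certificate by default, and fqdn is a placeholder:

```shell
# Show only the headers: expect a 3XX status and a
# "Location: https://fqdn:2224/ui/" header
curl -kI https://fqdn:2224

# Follow the redirect and fetch the actual UI page
curl -kL https://fqdn:2224
```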
[ClusterLabs] pcsd web interface not working on EL 9.3
Hi All, Is there a specific setup I missed in order to set up the web interface? Usually, you just log in with the hacluster user on https://fqdn:2224 but when I do a curl, I get an empty response. Best Regards, Strahil Nikolov
[ClusterLabs] Questions about GCP VIP setup
Hi All, This is my first cluster in the cloud and I have 2 questions that I'm hoping to get a clue about. 1. Where can I find the 'gcloud-ra' binary on an EL9 system? I have installed resource-agents-cloud but I can't find it. 2. Is gcp-vpc-move-vip a good approach to set up the VIP? Best Regards, Strahil Nikolov
[ClusterLabs] GCP and IP address question
Hello All, I will soon build my first cluster in the cloud and I was wondering if I can still use the IPaddr2 resource in GCP, or do I really have to use ocf:heartbeat:gcp-vpc-move-route & ocf:heartbeat:gcp-vpc-move-vip? I'm still trying to find a guide, so I can understand the idea behind those resources. Best Regards, Strahil Nikolov
Re: [ClusterLabs] [ClusterLabs Developers] How do I install and configure Pacemaker high-availability cluster resource manager?
Also, both the SuSE and Red Hat documentation are quite extensive and can be considered a good start. Best Regards, Strahil Nikolov On Wed, Aug 10, 2022 at 18:41, Turritopsis Dohrnii Teo En Ming wrote: On Wed, 10 Aug 2022 at 23:37, Reid Wahl wrote: > > On Wed, Aug 10, 2022 at 8:13 AM Turritopsis Dohrnii Teo En Ming > wrote: > > > > Subject: How do I install and configure Pacemaker high-availability > > cluster resource manager? > > > > Good day from Singapore, > > > > How do I install and configure Pacemaker high-availability cluster > > resource manager? > > Redirecting to users list. > > I would start with the Pacemaker documentation -- especially Clusters > from Scratch. Let us know if anything is unclear. > > Docs: > - Clusters from Scratch > (https://www.clusterlabs.org/pacemaker/doc/2.1/Clusters_from_Scratch/html/) > - Pacemaker Administration > (https://www.clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Administration/html/) > - Pacemaker Explained > (https://www.clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/) > Dear Reid Wahl, Thank you for the links. Regards, Mr. Turritopsis Dohrnii Teo En Ming Targeted Individual in Singapore
Re: [ClusterLabs] Antw: [EXT] Heads up for ldirectord in SLES12 SP5 "Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830"
My rule number 1 is to put all DNS entries in /etc/hosts or use dnsmasq for local DNS caching. Rule number 2: add cluster nodes as ntp/chrony peers (with 'prefer' for the ntp servers) to avoid node drift if the time source is down for a long time. Should the cluster take care of unstable infra -> not without you explicitly asking for that (and it being capable of it). Best Regards, Strahil Nikolov On Thu, Aug 4, 2022 at 9:38, Ulrich Windl wrote: Hi! FYI, here is a copy of what I had sent to SUSE support (stating "Because of the very same DNS resolution problem, stopping also failed"; should a temporary DNS resolving problem cause a resource stop to fail and cause node fencing in turn? I don't think so!): --- The problem is the Perl code that most likely was never tested to handle a failure of or in ld_gethostservbyname(): FIRST it should be checked whether a value was returned at all; if not, there is a failure in resolution. In turn a failure in resolution could mean two things: 1) The names in the configuration are not correct and will never resolve. 2) A temporary failure of some kind caused a failure and the configuration IS CORRECT. Clearly the bad case here was 2). Also looking at the code I wonder why it does not handle things like this: $ip_port = &ld_gethostservbyname($ip_port, $vsrv->{protocol}, $af); if ($ip_port) { if ($ip_port =~ /^(.+):([^:]+)$/) { # replacing the split ($vsrv->{server}, $vsrv->{port}) = ($1, $2); # this should also handle the case "$ip_port =~ /(\[[0-9A-Fa-f:]+\]):(\d+)/" } else { # error "unexpected return from ld_gethostservbyname" } } else { # error "failed to resolve ..." # here it's unfortunate that the original $ip_port is lost, # so it cannot be part of the error message } Despite of that, the critical part was that the "stop" operation SEEMED to have failed, causing fencing. Regardless of the success of resolving the names, ldirectord should be able to stop! --- Opinions? 
Regards, Ulrich >>> "Ulrich Windl" wrote on 03.08.2022 at 11:13 in message <62ea3c2c02a10004c...@gwsmtp.uni-regensburg.de>: > Hi! > > I wanted to inform you of an unpleasant bug in ldirectord of SLES12 SP5: > We had a short network problem while some redundancy paths reconfigured in > the infrastructure, effectively causing that some network services could not > be reached. > Unfortunately ldirectord controlled by the cluster reported a failure (the > director, not the services being directed to): > > h11 crmd[28930]: notice: h11-prm_lvs_mail_monitor_30:369 [ Use of > uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord > line 1830, line 21. Error [33159] reading file > /etc/ldirectord/mail.conf at line 10: invalid address for virtual service\n ] > h11 ldirectord[33266]: Exiting with exit_status 2: config_error: > Configuration Error > > You can guess what happened: > Pacemaker tried to recover (stop, then start), but the stop failed, too: > h11 lrmd[28927]: notice: prm_lvs_mail_stop_0:35047:stderr [ Use of > uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord > line 1830, line 21. ] > h11 lrmd[28927]: notice: prm_lvs_mail_stop_0:35047:stderr [ Error [36293] > reading file /etc/ldirectord/mail.conf at line 10: invalid address for > virtual service ] > h11 crmd[28930]: notice: Result of stop operation for prm_lvs_mail on h11: > 1 (unknown error) > > A stop failure meant that the node was fenced, interrupting all the other > services. 
> > Examining the logs I also found this interesting type of error: > h11 attrd[28928]: notice: Cannot update > fail-count-prm_lvs_rksapds5#monitor_30[monitor]=(null) because peer > UUID not known (will retry if learned) > > Eventually, here's the code that caused the error: > > sub _ld_read_config_virtual_resolve > { > my($line, $vsrv, $ip_port, $af)=(@_); > > if($ip_port){ > $ip_port=&ld_gethostservbyname($ip_port, $vsrv->{protocol}, > $af); > if ($ip_port =~ /(\[[0-9A-Fa-f:]+\]):(\d+)/) { > $vsrv->{server} = $1; > $vsrv->{port} = $2; > } elsif($ip_port){ > ($vsrv->{server}, $vsrv->{port}) = split /:/, $ip_port; > } > else { > _error($line, > "invalid address for virtual service"); > } > ... > > The value returned b
Re: [ClusterLabs] IPaddr2 resource times out and can't be killed
In clouds you can't just use VIPs. Use the azure-lb resource instead. Best Regards, Strahil Nikolov On Fri, Jul 29, 2022 at 23:21, Reid Wahl wrote: On Fri, Jul 29, 2022 at 1:02 PM Reid Wahl wrote: > > On Fri, Jul 29, 2022 at 12:52 PM Ross Sponholtz > wrote: > > > > I'm running a RHEL pacemaker cluster on Azure, and I've gotten a failure & > > fencing where I get these messages in the log file: > > > > > > warning: vip_ABC_30_monitor_1 process (PID 1779737) timed out > > crit: vip_ABC_30_monitor_1 process (PID 1779737) will not die! > > > > > > > > This resource uses the IPAddr2 resource agent. I've looked at the agent > > code, and I can't pinpoint any reason it would hang up, and since the node > > gets fenced, I can't tell why this happens – any ideas on what kinds of > > failures could cause this problem? > > > > > > > > Thanks, > > > > Ross > > > > Are you able to reproduce this? I suggest adding `trace_ra=1` to the > resource configuration in order to determine where it's hanging. > > # pcs resource update vip_ABC trace_ra=1 > > This will produce a shell trace of each operation in > /var/lib/heartbeat/trace_ra/IPaddr2. This is naturally quite a lot of > logging, so remove the option when you've gotten what you need. > > # pcs resource update vip_ABC trace_ra= > > Also discussed in this article (you should have access if you're on RHEL): > - How can I determine exactly what is happening with every operation > on a resource in Pacemaker? > (https://access.redhat.com/solutions/3182931) You may also want to set on-fail=block for the stop operation to prevent the node from getting fenced while you troubleshoot this. # pcs resource update vip_ABC op stop interval=0s timeout= on-fail=block Other than that, trace_ra=1 will generally tell us quite a lot -- I just hope that it _does_ get written, given that the child process becomes unkillable. The IPaddr2 resource agent doesn't do all that much. It runs a few `ip` commands and sends an ARP refresh. That's about it. 
Generally would not expect any of those to hang unless there's a deeper issue. > > -- > Regards, > > Reid Wahl (He/Him) > Senior Software Engineer, Red Hat > RHEL High Availability - Pacemaker -- Regards, Reid Wahl (He/Him) Senior Software Engineer, Red Hat RHEL High Availability - Pacemaker
Re: [ClusterLabs] Fencing for quorum device?
Well, you can always make a single-node cluster with the quorum device's host and set up a systemd resource to keep the service up and running. With SBD, that single-node cluster will suicide in case the machine ends up in an unresponsive state. Best Regards, Strahil Nikolov On Fri, Jul 15, 2022 at 17:17, Andrei Borzenkov wrote: On 15.07.2022 09:24, Viet Nguyen wrote: > Hi, > > I just wonder that do we need to have fencing for a quorum device? I have a 2 > node cluster with one quorum device. Both 2 nodes have fencing agents. > > But I wonder that should i define the fencing agent for the quorum device or > not? You cannot. The quorum device is not part of the pacemaker cluster, so there is no way to initiate fencing of the quorum device. > Just in case it is laggy... > > Thank you so much! > > Regards, > Viet
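A rough sketch of the single-node-cluster idea above, assuming the quorum device host runs corosync-qnetd as a systemd service; the host and cluster names are illustrative, and the SBD configuration is omitted:

```shell
# One-node cluster on the qdevice host itself
pcs host auth qdevice-host -u hacluster
pcs cluster setup qnetd-cluster qdevice-host
pcs cluster start --all

# Keep the qnetd daemon supervised by the cluster via the systemd resource class
pcs resource create qnetd systemd:corosync-qnetd op monitor interval=30s
```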
Re: [ClusterLabs] Help understanding recovery of promotable resource after a "pcs cluster stop --all"
Have you checked with drbd commands if the 2 nodes were in sync? Also consider adding the shared dir, lvm, etc. into a single group -> see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_administration/s1-resourcegroupcreatenfs-haaa Best Regards, Strahil Nikolov On Tue, May 3, 2022 at 0:25, Ken Gaillot wrote: On Mon, 2022-05-02 at 13:11 -0300, Salatiel Filho wrote: > Hi, Ken, here is the info you asked for. > > > # pcs constraint > Location Constraints: > Resource: fence-server1 > Disabled on: > Node: server1 (score:-INFINITY) > Resource: fence-server2 > Disabled on: > Node: server2 (score:-INFINITY) > Ordering Constraints: > promote DRBDData-clone then start nfs (kind:Mandatory) > Colocation Constraints: > nfs with DRBDData-clone (score:INFINITY) (rsc-role:Started) > (with-rsc-role:Master) > Ticket Constraints: > > # sudo crm_mon -1A > ... > Node Attributes: > * Node: server2: > * master-DRBDData : 1 In the scenario you described, only server1 is up. If there is no master score for server1, it cannot be master. It's up to the resource agent to set it. I'm not familiar enough with that agent to know why it might not. > > > > Atenciosamente/Kind regards, > Salatiel > > On Mon, May 2, 2022 at 12:26 PM Ken Gaillot > wrote: > > On Mon, 2022-05-02 at 09:58 -0300, Salatiel Filho wrote: > > > Hi, I am trying to understand the recovering process of a > > > promotable > > > resource after "pcs cluster stop --all" and shutdown of both > > > nodes. > > > I have a two nodes + qdevice quorum with a DRBD resource. > > > > > > This is a summary of the resources before my test. Everything is > > > working just fine and server2 is the master of DRBD. 
> > > > > > * fence-server1 (stonith:fence_vmware_rest): Started > > > server2 > > > * fence-server2 (stonith:fence_vmware_rest): Started > > > server1 > > > * Clone Set: DRBDData-clone [DRBDData] (promotable): > > > * Masters: [ server2 ] > > > * Slaves: [ server1 ] > > > * Resource Group: nfs: > > > * drbd_fs (ocf::heartbeat:Filesystem): Started server2 > > > > > > > > > > > > then I issue "pcs cluster stop --all". The cluster will be > > > stopped on > > > both nodes as expected. > > > Now I restart server1( previously the slave ) and poweroff > > > server2 ( > > > previously the master ). When server1 restarts it will fence > > > server2 > > > and I can see that server2 is starting on vcenter, but I just > > > pressed > > > any key on grub to make sure the server2 would not restart, > > > instead > > > it > > > would just be "paused" on grub screen. > > > > > > SSH'ing to server1 and running pcs status I get: > > > > > > Cluster name: cluster1 > > > Cluster Summary: > > > * Stack: corosync > > > * Current DC: server1 (version 2.1.0-8.el8-7c3f660707) - > > > partition > > > with quorum > > > * Last updated: Mon May 2 09:52:03 2022 > > > * Last change: Mon May 2 09:39:22 2022 by root via cibadmin > > > on > > > server1 > > > * 2 nodes configured > > > * 11 resource instances configured > > > > > > Node List: > > > * Online: [ server1 ] > > > * OFFLINE: [ server2 ] > > > > > > Full List of Resources: > > > * fence-server1 (stonith:fence_vmware_rest): Stopped > > > * fence-server2 (stonith:fence_vmware_rest): Started > > > server1 > > > * Clone Set: DRBDData-clone [DRBDData] (promotable): > > > * Slaves: [ server1 ] > > > * Stopped: [ server2 ] > > > * Resource Group: nfs: > > > * drbd_fs (ocf::heartbeat:Filesystem): Stopped > > > > > > > > > So I can see there is quorum, but the server1 is never promoted > > > as > > > DRBD master, so the remaining resources will be stopped until > > > server2 > > > is back. 
> > > 1) What do I need to do to force the promotion and recover > > > without > > > restarting server2? > > > 2) Why if instead of rebooting server1 and power off server2 I > > > reboot > > > server2 and poweroff server1 the cluster can recover by itself? > > > > > > > > > Thanks! > > > > > > > You shouldn't need to force promotion, that is the default behavior > > in > > that situation. There must be something else in the configuration > > that > > is preventing promotion. > > > > The DRBD resource agent should set a promotion score for the node. > > You > > can run "crm_mon -1A" to show all node attributes; there should be > > one > > like "master-DRBDData" for the active node. > > > > You can also show the constraints in the cluster to see if there is > > anything relevant to the master role. -- Ken Gaillot
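The two checks suggested in this thread (DRBD sync state, promotion scores/constraints) plus the grouping advice can be sketched as below. The group and filesystem names come from the pcs status output above; nfsserver is a hypothetical additional group member:

```shell
# Verify both nodes are UpToDate before troubleshooting promotion
drbdadm status

# Show promotion scores (e.g. master-DRBDData) and master-role constraints
crm_mon -1A
pcs constraint

# Add further dependents to the existing 'nfs' group so they start
# in order after the filesystem (nfsserver is a placeholder resource)
pcs resource group add nfs nfsserver
```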
Re: [ClusterLabs] OT: Linstor/DRBD Problem
Why do you use Linstor and not DRBD? As far as I know Linstor is more suitable for Kubernetes/Openshift. Best Regards, Strahil Nikolov On Thu, Apr 28, 2022 at 8:19, Eric Robinson wrote: This is probably off-topic but I'll try anyway. Do we have any Linstor gurus around here? I've read through the Linstor User Guide and all the help screens, but I don't see an answer to this question. We added a new physical drive to each of our cluster nodes and extended the LVM volume groups. The VGs now show the correct size as expected. However, in Linstor, the storage pools still look the same and do not reflect the additional storage space. Any ideas what I should check next? If this is too off topic, I'll understand. -Eric
Re: [ClusterLabs] How many nodes does a redhat cluster support
What is the output of 'gfs2_edit -p jindex /dev/shared_vg1/shared_lv1 | grep journal'? Source: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/configuring_gfs2_file_systems#proc_adding-gfs2-journal-creating-mounting-gfs2 Best Regards, Strahil Nikolov On Wed, Apr 27, 2022 at 22:26, Umar Draz wrote: Hi I am running a 3-node cluster on my AWS vms where I plan to use 3 nodes for my websites. Now the issue is only 2 nodes at a time can mount the lvm, not all 3 nodes. Here is the pcs status output. [root@g2fs-1 ~]# pcs status --full Cluster name: wp-cluster Cluster Summary: * Stack: corosync * Current DC: g2fs-1 (1) (version 2.1.2-4.el8-ada5c3b36e2) - partition with quorum * Last updated: Wed Apr 27 19:12:48 2022 * Last change: Tue Apr 26 01:07:34 2022 by root via cibadmin on g2fs-1 * 3 nodes configured * 13 resource instances configured Node List: * Online: [ g2fs-1 (1) g2fs-2 (2) g2fs-3 (3) ] Full List of Resources: * Clone Set: locking-clone [locking]: * Resource Group: locking:0: * dlm (ocf::pacemaker:controld): Started g2fs-1 * lvmlockd (ocf::heartbeat:lvmlockd): Started g2fs-1 * Resource Group: locking:1: * dlm (ocf::pacemaker:controld): Started g2fs-3 * lvmlockd (ocf::heartbeat:lvmlockd): Started g2fs-3 * Resource Group: locking:2: * dlm (ocf::pacemaker:controld): Started g2fs-2 * lvmlockd (ocf::heartbeat:lvmlockd): Started g2fs-2 * Clone Set: shared_vg1-clone [shared_vg1]: * Resource Group: shared_vg1:0: * sharedlv1 (ocf::heartbeat:LVM-activate): Started g2fs-3 * sharedfs1 (ocf::heartbeat:Filesystem): Started g2fs-3 * Resource Group: shared_vg1:1: * sharedlv1 (ocf::heartbeat:LVM-activate): Started g2fs-2 * sharedfs1 (ocf::heartbeat:Filesystem): Started g2fs-2 * Resource Group: shared_vg1:2: * sharedlv1 (ocf::heartbeat:LVM-activate): Stopped * sharedfs1 (ocf::heartbeat:Filesystem): Stopped * wpfence (stonith:fence_aws): Started g2fs-1 Migration Summary: * Node: g2fs-1 (1): * sharedfs1: migration-threshold=100 fail-count=100 
last-failure='Tue Apr 26 01:07:46 2022' Failed Resource Actions: * sharedfs1_start_0 on g2fs-1 'error' (1): call=158, status='complete', exitreason='Couldn't mount device [/dev/shared_vg1/shared_lv1] as /mnt/webgfs', last-rc-change='Tue Apr 26 01:07:45 2022', queued=0ms, exec=806ms Tickets: PCSD Status: g2fs-1: Online g2fs-2: Online g2fs-3: Online Daemon Status: corosync: active/enabled pacemaker: active/enabled pcsd: active/enabled [root@g2fs-1 ~]# Now if I just stop g2fs-2 or g2fs-3 then node g2fs-1 successfully mount the lvm volume, but if I again power on g2fs-3 then g2fs-3 will not mount lvm volume until I shutdown either g2fs-2 or g2fs-1. Here is resource config [root@g2fs-1 ~]# pcs resource config Clone: locking-clone Meta Attrs: interleave=true Group: locking Resource: dlm (class=ocf provider=pacemaker type=controld) Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s) start interval=0s timeout=90s (dlm-start-interval-0s) stop interval=0s timeout=100s (dlm-stop-interval-0s) Resource: lvmlockd (class=ocf provider=heartbeat type=lvmlockd) Operations: monitor interval=30s on-fail=fence (lvmlockd-monitor-interval-30s) start interval=0s timeout=90s (lvmlockd-start-interval-0s) stop interval=0s timeout=90s (lvmlockd-stop-interval-0s) Clone: shared_vg1-clone Meta Attrs: interleave=true Group: shared_vg1 Resource: sharedlv1 (class=ocf provider=heartbeat type=LVM-activate) Attributes: activation_mode=shared lvname=shared_lv1 vg_access_mode=lvmlockd vgname=shared_vg1 Operations: monitor interval=30s timeout=90s (sharedlv1-monitor-interval-30s) start interval=0s timeout=90s (sharedlv1-start-interval-0s) stop interval=0s timeout=90s (sharedlv1-stop-interval-0s) Resource: sharedfs1 (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/shared_vg1/shared_lv1 directory=/mnt/webgfs fstype=gfs2 options=noatime Operations: monitor interval=10s on-fail=fence (sharedfs1-monitor-interval-10s) start interval=0s timeout=60s 
(sharedfs1-start-interval-0s) stop interval=0s timeout=60s (sharedfs1-stop-interval-0s) [root@g2fs-1 ~]# Here is the stonith config Resource: wpfence (class=stonith type=fence_aws) Attributes: access_key=AKIA5CLSSLOEXEKUNMXI pcmk_host_map=g2fs-1:i-021b24d1343c1d5ea;g2fs-2:i-0015e5229b0139462;g2fs-3:i-0381c42de4515696f pcmk_reboot_retries=4 pcmk_reboot_timeout=480 power_timeout=240 region=us-east-1 secret_key=IWKug66AZwb/q7PM00bJb2QtGlfceumdz3eO8TIF Operations: monitor interval=60s (wpfence-monitor-interval-60s) So the question is that redhat
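The symptom above (only two of the three nodes can mount at a time) is consistent with the gfs2 filesystem having only two journals, which is what the gfs2_edit command in the reply checks; gfs2 needs one journal per node that mounts simultaneously. A hedged sketch, reusing the device and mountpoint from the thread:

```shell
# Count the journals on the shared filesystem
gfs2_edit -p jindex /dev/shared_vg1/shared_lv1 | grep journal

# If only two journals exist, add one more
# (run on a node where the filesystem is currently mounted)
gfs2_jadd -j 1 /mnt/webgfs
```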
Re: [ClusterLabs] OCF_TIMEOUT - Does it recover by itself?
You can use a meta attribute to expire failures. The attribute name is 'failure-timeout'. I have used it for my fencing devices, as during the night the network was quite busy. Best Regards, Strahil Nikolov On Tue, Apr 26, 2022 at 23:54, Hayden, Robert via Users wrote: Robert Hayden | Lead Technology Architect | Cerner Corporation | 816.201.4068 | rhay...@cerner.com | www.cerner.com > -Original Message- > From: Users On Behalf Of Ken Gaillot > Sent: Tuesday, April 26, 2022 2:25 PM > To: Cluster Labs - All topics related to open-source clustering welcomed > > Subject: Re: [ClusterLabs] OCF_TIMEOUT - Does it recover by itself? > > On Tue, 2022-04-26 at 15:20 -0300, Salatiel Filho wrote: > > I have a question about OCF_TIMEOUT. Sometimes my cluster shows me > > this on pcs status: > > Failed Resource Actions: > > * fence-server02_monitor_6 on server01 'OCF_TIMEOUT' (198): > > call=419, status='Timed Out', exitreason='', > > last-rc-change='2022-04-26 14:47:32 -03:00', queued=0ms, exec=20004ms > > > > I can see in the same pcs status output that the fence device is > > started, so does that mean it failed some moment in the past and now > > it is OK? Or do I have to do something to recover it? > > Correct, the status shows failures that have happened in the past. The > cluster tries to recover failed resources automatically according to > whatever policy has been configured (the default being to stop and > start the resource). > > Since the resource is shown as active, there's nothing you have to do. > You can investigate the timeout (for example look at the system logs > around that timestamp to see if anything else unusual was reported), > and you can clear the failure from the status display with > "crm_resource --cleanup" (or "pcs resource cleanup"). > FYI - I have had some issues with "pcs resource cleanup" and with past events where it decided to restart my already recovered and running resources, throwing me into another short outage. 
I have also seen past but already-recovered failures cause issues with future events where nodes are coming out of maintenance mode (times when the cluster is reviewing the states of resources and sees a past failure, but does not recognize that it was already recovered). This was mainly on RHEL/OL 7 clusters. Since people don't like to see failures in the "pcs status" output, I have moved to using the following to automatically clear resource failures after 1 week's time: pcs resource defaults failure-timeout=604800 This gives people a chance to investigate a past failure, but it falls off the cluster's radar. > > > > # pcs status > > Cluster name: cluster1 > > Cluster Summary: > > * Stack: corosync > > * Current DC: server02 (version 2.1.0-8.el8-7c3f660707) - partition > > with quorum > > * Last updated: Tue Apr 26 14:52:56 2022 > > * Last change: Tue Apr 26 14:37:22 2022 by hacluster via crmd on > > server01 > > * 2 nodes configured > > * 11 resource instances configured > > > > Node List: > > * Online: [ server01 server02 ] > > > > Full List of Resources: > > * fence-server01 (stonith:fence_vmware_rest): Started > > server02 > > * fence-server02 (stonith:fence_vmware_rest): Started > > server01 > > ... > > > > Is "pcs resource cleanup" the right way to remove those messages ? > > > > > > > > > > Atenciosamente/Kind regards, > > Salatiel > -- > Ken Gaillot
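Besides the cluster-wide default quoted above, failure-timeout can also be set per resource as a meta attribute, which is what the earlier reply describes doing for fencing devices; the device name and value below are illustrative:

```shell
# Expire recorded failures of one fence device after 10 minutes (600 s)
pcs stonith update fence-server02 meta failure-timeout=600

# Or cluster-wide, as in the thread (604800 s = 7 days)
pcs resource defaults failure-timeout=604800
```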
Re: [ClusterLabs] I_DC_TIMEOUT and node fenced when it joins the cluster
Set the corosync token to a higher value (in milliseconds) and adjust the consensus as per man 5 corosync.conf, and give it a try. Don't forget to sync the corosync settings among the cluster nodes.

Best Regards,
Strahil Nikolov

On Fri, Apr 15, 2022 at 15:27, vitaly wrote:
Hello Everybody. I am occasionally seeing the following behavior on a two-node cluster.
1. Abruptly rebooting both nodes of the cluster (using "reboot")
2. Both nodes start to come up. Node d18-3-left (2) comes up first:
Apr 13 23:56:09 d18-3-left corosync[11465]: [MAIN ] Corosync Cluster Engine ('2.4.4'): started and ready to provide service.
3. Second node d18-3-right (1) joins the cluster:
Apr 13 23:56:58 d18-3-left corosync[11466]: [TOTEM ] A new membership (172.16.1.1:60) was formed. Members joined: 1
Apr 13 23:56:58 d18-3-left corosync[11466]: [QUORUM] This node is within the primary component and will provide service.
Apr 13 23:56:58 d18-3-left corosync[11466]: [QUORUM] Members[2]: 1 2
Apr 13 23:56:58 d18-3-left corosync[11466]: [MAIN ] Completed service synchronization, ready to provide service.
Apr 13 23:56:58 d18-3-left pacemakerd[11717]: notice: Quorum acquired
Apr 13 23:56:58 d18-3-left crmd[11763]: notice: Quorum acquired
4. 2 seconds later node d18-3-left shows I_DC_TIMEOUT and starts fencing of the newly joined node:
Apr 13 23:57:00 d18-3-left crmd[11763]: warning: Input I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
After that we get:
Apr 13 23:57:00 d18-3-left crmd[11763]: notice: State transition S_ELECTION -> S_INTEGRATION
Apr 13 23:57:00 d18-3-left crmd[11763]: warning: Input I_ELECTION_DC received in state S_INTEGRATION from do_election_check
and fence the node:
Apr 13 23:57:01 d18-3-left pengine[11762]: warning: Scheduling Node d18-3-right.lab.archivas.com for STONITH
Apr 13 23:57:01 d18-3-left pengine[11762]: notice: * Fence (reboot) d18-3-right.lab.archivas.com 'node is unclean'
5. After this the node that was fenced comes up again and joins the cluster without any issues.
Any idea on what is going on here? Thanks,
-Vitaly
___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
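The token/consensus tuning suggested in the reply above lives in the totem section of /etc/corosync/corosync.conf. A rough sketch with illustrative values (not from this thread; per corosync.conf(5), consensus defaults to 1.2 × token and should not be set lower than that):

```
totem {
    version: 2
    # Time (ms) to wait for the token before declaring a node lost
    token: 10000
    # Time (ms) to wait for consensus before forming a new membership;
    # defaults to 1.2 * token when unset
    consensus: 12000
}
```

After editing, the file must be identical on all nodes before corosync re-reads it.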
Re: [ClusterLabs] Antw: [EXT] Re: SAP HANA monitor fails ‑ Error performing operation: No such device or address
It's not like that. Let's assume you have a resource Dummy1 with a preference for nodeA (score 10). If your stickiness is 20, the resource will not fall back to nodeA (after a failure) when nodeA returns, as nodeA = 10 while the current node = 20 (due to stickiness). If your stickiness is 1 while nodeA has a score of 10, then once the node rejoins, Dummy1 will move back to nodeA, as the scores will be: nodeA = 10, current node = 1 (due to stickiness). Keep in mind that for groups, all resources' scores are summed up before evaluation.

Best Regards,
Strahil Nikolov

On Mon, Apr 11, 2022 at 9:01, Ulrich Windl wrote:
>>> Ken Gaillot wrote on 08.04.2022 at 15:38 in message <7f3a5e59a3b66f520fa549c1151473efbf0fd980.ca...@redhat.com>:
...
> I'm not familiar enough with SAP to speak to that side of things, but
> the behavior after clean-up is normal. If you don't want resources to
> go back to their preferred node after a failure is cleaned up, set the
> resource-stickiness meta-attribute to a positive number (either on the
> resource itself, or in resource defaults if you want it to apply to
> everything).
...
Unfortunately the value to set for stickiness is somewhat black magic: It seems it does not make a real difference whether you set it to 5, 10, 100, or 1000; once greater than 0 it prevents resource migration, even when a placement strategy suggests that the placement is non-optimal (e.g. after a node restart).
Regards,
Ulrich
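The scoring described above can be sketched with pcs, using the thread's hypothetical Dummy1/nodeA names (illustrative only; exact syntax can vary slightly between pcs versions):

```shell
# Prefer nodeA with a location score of 10
pcs constraint location Dummy1 prefers nodeA=10

# Stickiness 20 beats the preference of 10, so after a failover
# Dummy1 stays on its current node when nodeA rejoins
pcs resource meta Dummy1 resource-stickiness=20

# Stickiness 1 loses to the preference of 10, so Dummy1 moves
# back to nodeA as soon as it rejoins
pcs resource meta Dummy1 resource-stickiness=1
```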
Re: [ClusterLabs] Antw: [EXT] SAP HANA monitor fails ‑ Error performing operation: No such device or address
debug-start does what is described in https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures

Best Regards,
Strahil Nikolov

On Mon, Apr 11, 2022 at 7:21, Aj Revelino wrote:
Hi Strahil,
Yes, I went through the documentation from Azure. In fact, we have 6 production clusters running on SLES 15 SP, but none of them are using the hook. SAPHanaSR-showAttr shows SFAIL for the replication status, but this output is read from the CIB, where the replication attribute is set to 'SFAIL' by pacemaker. I think, like you mentioned, the hook should be able to resolve it. I'll try this weekend. What I fail to understand is this error msg: 'hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_6:195 [ Error performing operation: No such device or address]'
Did you mean to trace the resource? If so, I'll be doing it this weekend.
Regards,
Aj

On Sat, Apr 9, 2022 at 6:49 PM Strahil Nikolov wrote:
You can use pcs resource debug-start, but you have to shut it down before that.
Have you used some documentation for the setup? Usually I refer to the vendor's documentation. Go over it and check for a step that was not implemented. RH's latest version is:
https://access.redhat.com/sites/default/files/attachments/v10_ha_solution_for_sap_hana_scale_out_system_replication_0.pdf
https://access.redhat.com/articles/3004101
SLES:
https://documentation.suse.com/sbp/all/html/SLES4SAP-hana-scaleOut-PerfOpt-15/index.html
https://documentation.suse.com/sbp/all/single-html/SLES4SAP-hana-sr-guide-PerfOpt-15/index.html
Based on my experience, the most critical component is the hook setup, so the cluster can properly identify the replication status. Also, unmanaged resources do not probe for replication status, and thus the cluster never notices that replication is restored until the resource is 'managed' again. When removing maintenance, it's always nice to run 'crm_simulate'. One very good article is https://www.suse.com/support/kb/doc/?id=19158 .
What is the output of SAPHanaSR-showAttr ?
Best Regards,
Strahil Nikolov

On Sat, Apr 9, 2022 at 0:27, Aj Revelino wrote:
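For reference, the srHook setup that the replies above call the most critical component is enabled in the HANA global.ini. An illustrative sketch for SUSE's SAPHanaSR provider follows — the path, SID placeholder, and section names are assumptions that vary by vendor, scale-out vs. scale-up, and version, so check the linked guides for the exact form:

```
# Illustrative fragment of /hana/shared/<SID>/global/hdb/custom/config/global.ini
[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /usr/share/SAPHanaSR
execution_order = 1

[trace]
ha_dr_saphanasr = info
```

With the hook active, HANA calls into the provider on replication-state changes, which updates the cluster attributes that SAPHanaSR-showAttr reads (so SFAIL can clear without manual intervention).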
Re: [ClusterLabs] Restarting parent of ordered clone resources on specific node causes restart of all resources in the ordering constraint on all nodes of the cluster
You can use 'kind' and 'symmetrical' to control order constraints. The default value for symmetrical is 'true', which means that in order to stop the first resource, the cluster first has to stop the dependent one.

Best Regards,
Strahil Nikolov

On Fri, Apr 8, 2022 at 15:29, ChittaNagaraj, Raghav wrote:
Hello Team,
Hope you are doing well. I have a 4 node pacemaker cluster where I created clone dummy resources test-1, test-2 and test-3 below:
$ sudo pcs resource create test-1 ocf:heartbeat:Dummy op monitor timeout="20" interval="10" clone
$ sudo pcs resource create test-2 ocf:heartbeat:Dummy op monitor timeout="20" interval="10" clone
$ sudo pcs resource create test-3 ocf:heartbeat:Dummy op monitor timeout="20" interval="10" clone
Then I ordered them so test-2-clone starts after test-1-clone and test-3-clone starts after test-2-clone:
$ sudo pcs constraint order test-1-clone then test-2-clone
Adding test-1-clone test-2-clone (kind: Mandatory) (Options: first-action=start then-action=start)
$ sudo pcs constraint order test-2-clone then test-3-clone
Adding test-2-clone test-3-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Here are my clone sets (snippet of "pcs status" output pasted below):
* Clone Set: test-1-clone [test-1]:
* Started: [ node2_a node2_b node1_a node1_b ]
* Clone Set: test-2-clone [test-2]:
* Started: [ node2_a node2_b node1_a node1_b ]
* Clone Set: test-3-clone [test-3]:
* Started: [ node2_a node2_b node1_a node1_b ]
Then I restart test-1 on just node1_a:
$ sudo pcs resource restart test-1 node1_a
Warning: using test-1-clone... (if a resource is a clone, master/slave or bundle you must use the clone, master/slave or bundle name)
test-1-clone successfully restarted
This causes the test-2 and test-3 clones to restart on all pacemaker nodes, when my intention is for them to restart on just node1_a.
Below is the log tracing seen on the Designated Controller NODE1-B:
Apr 07 20:25:01 NODE1-B pacemaker-schedulerd[95746]: notice: * Stop test-1:1 ( node1_a ) due to node availability
Apr 07 20:25:03 NODE1-B pacemaker-schedulerd[95746]: notice: * Restart test-2:0 ( node1_b ) due to required test-1-clone running
Apr 07 20:25:03 NODE1-B pacemaker-schedulerd[95746]: notice: * Restart test-2:1 ( node1_a ) due to required test-1-clone running
Apr 07 20:25:03 NODE1-B pacemaker-schedulerd[95746]: notice: * Restart test-2:2 ( node2_b ) due to required test-1-clone running
Apr 07 20:25:03 NODE1-B pacemaker-schedulerd[95746]: notice: * Restart test-2:3 ( node2_a ) due to required test-1-clone running
Above is a representation of the observed behavior using dummy resources. Is this the expected behavior of cloned resources? My goal is to be able to restart test-2-clone and test-3-clone on just the node that experienced the test-1 restart, rather than on all other nodes in the cluster. Please let us know if any additional information would help you provide feedback.
Thanks for your help!
- Raghav
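The "due to required test-1-clone running" restarts above are the default behavior for clones: an ordering constraint between whole clones makes every instance depend on the first clone as a whole. A hedged sketch of the usual remedies — this is not from the thread itself, and resource names assume the test-1/test-2/test-3 setup quoted above:

```shell
# interleave=true makes each instance depend only on the first
# clone's instance on the *same* node, not on all instances
pcs resource meta test-1-clone interleave=true
pcs resource meta test-2-clone interleave=true
pcs resource meta test-3-clone interleave=true

# Optionally, make the ordering advisory and one-directional, so
# stops are not forced in reverse order across the cluster
pcs constraint order start test-1-clone then start test-2-clone kind=Optional symmetrical=false
```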
Re: [ClusterLabs] Antw: [EXT] Re: Failed migration causing fencing loop
What if you disable enable-startup-probes at fencing time (a custom fencing agent that sets it to false and then fails on purpose, so the next fencing device in the topology kicks in)? When the node joins, it will skip the startup probes, and later a systemd service or some script can check whether all nodes have been up for at least 15-20 min and enable it back.

Best Regards,
Strahil Nikolov

On Thu, Mar 31, 2022 at 14:02, Ulrich Windl wrote:
>>> "Gao,Yan" wrote on 31.03.2022 at 11:18 in message <67785c2f-f875-cb16-608b-77d63d9b0...@suse.com>:
> On 2022/3/31 9:03, Ulrich Windl wrote:
>> Hi!
>>
>> I just wanted to point out one thing that hit us with SLES15 SP3:
>> Some failed live VM migration causing node fencing resulted in a fencing loop, because of two reasons:
>>
>> 1) Pacemaker thinks that even _after_ fencing there is some migration to "clean up". Pacemaker treats the situation as if the VM is running on both nodes, thus (50% chance?) trying to stop the VM on the node that just booted after fencing. That's stupid but shouldn't be fatal IF there weren't...
>>
>> 2) The stop operation of the VM (that actually isn't running) fails,
>
> AFAICT it could not connect to the hypervisor, but the logic in the RA is kind of arguable: the probe (monitor) of the VM returned "not running", but the stop right after that returned failure...
>
> OTOH, the point about pacemaker is that the stop of the resource on the fenced and rejoined node is not really necessary. There have been discussions about this here and we are trying to figure out a solution for it:
>
> https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919
>
> For now it requires administrator intervention if the situation happens:
> 1) Fix the access to the hypervisor before the fenced node rejoins.

Thanks for the explanation! Unfortunately this can be tricky if libvirtd is involved (as it is here): libvirtd uses locking (virtlockd), which in turn needs a cluster-wide filesystem for locks across the nodes.
When that filesystem is provided by the cluster, it's hard to delay node joining until the filesystem, virtlockd and libvirtd are running. (The issue had been discussed before: It does not make sense to run some probes when those probes need other resources to detect the status. With just a Boolean status return, at best all those probes could say "not running". Ideally a third status like "please try again some time later" would be needed, or probes should follow the dependencies of their resources, which may open another can of worms.)
Regards,
Ulrich

> 2) Manually clean up the resource, which tells pacemaker it can safely forget the historical migrate_to failure.
>
> Regards,
> Yan
>
>> causing a node fence. So the loop is complete.
>>
>> Some details (many unrelated messages left out):
>>
>> Mar 30 16:06:14 h16 libvirtd[13637]: internal error: libxenlight failed to restore domain 'v15'
>>
>> Mar 30 16:06:15 h19 pacemaker-schedulerd[7350]: warning: Unexpected result (error: v15: live migration to h16 failed: 1) was recorded for migrate_to of prm_xen_v15 on h18 at Mar 30 16:06:13 2022
>>
>> Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]: warning: Unexpected result (OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at Mar 30 16:13:36 2022
>> Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]: warning: Unexpected result (OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at Mar 30 16:13:36 2022
>> Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]: warning: Cluster node h18 will be fenced: prm_libvirtd:0 failed there
>>
>> Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]: warning: Unexpected result (error: v15: live migration to h18 failed: 1) was recorded for migrate_to of prm_xen_v15 on h16 at Mar 29 23:58:40 2022
>> Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]: error: Resource prm_xen_v15 is active on 2 nodes (attempting recovery)
>>
>> Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]: notice: * Restart prm_xen_v15 ( h18 )
>>
>> Mar 30 16:19:04 h18 VirtualDomain(prm_xen_v15)[8768]: INFO: Virtual domain v15 currently has no state, retrying.
>> Mar 30 16:19:05 h18 VirtualDomain(prm_xen_v15)[8787]: INFO: Virtual domain v15 currently has no state, retrying.
>> Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8822]: ERROR: Virtual domain v15 has no state during stop operation, bailing out.
>> Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8836]: INFO: Issuing forced shutdown (destroy) request for domain v15.
>> Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v
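The fall-through fencing trick suggested at the top of this thread relies on standard stonith topology levels; only the level commands below are standard, while the probe-disabling agent is hypothetical and would have to be written (it would set enable-startup-probes=false via cibadmin/crm_attribute and then deliberately report failure). Node name h18 is taken from the logs above; the device names are placeholders:

```shell
# Level 1: hypothetical agent that flips enable-startup-probes=false
# and then "fails", so fencing falls through to the next level
pcs stonith create probe-off fence_disable_probes   # hypothetical agent
pcs stonith level add 1 h18 probe-off

# Level 2: the real fencing device actually reboots the node
pcs stonith level add 2 h18 real-fence-device       # placeholder device
```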
Re: [ClusterLabs] Antw: [EXT] Re: Corosync Transport‑ Knet Vs UDPU
Corosync rings are never enough, especially when the network team has such naughty hands.

Best Regards,
Strahil Nikolov

On Mon, Mar 28, 2022 at 16:55, Ulrich Windl wrote:
>>> Strahil Nikolov via Users wrote on 28.03.2022 at 15:49 in message <1758982440.559085.1648475365...@mail.yahoo.com>:
> One huge benefit of the new stack is that you can have 8 corosync rings, which is really powerful.
Hm, I wonder: If one ring is not good enough and two rings are not good enough, then by induction n+1 rings aren't good enough either, so 8 rings don't really help ;-) Sorry for the joke!
Ulrich
> Best Regards,
> Strahil Nikolov
>
> On Mon, Mar 28, 2022 at 9:27, Christine caulfield wrote:
On 28/03/2022 03:30, Somanath Jeeva via Users wrote:
>> Hi,
>>
>> I am upgrading from corosync 2.x/pacemaker 1.x to corosync 3.x/pacemaker 2.1.x
>>
>> In our use case we are using a 2 node corosync/pacemaker cluster.
>>
>> In the corosync 2.x version I was using udpu as the transport method. In corosync 3.x, as per the man pages, the default transport mode is knet, and knet uses udp as the knet method.
>>
>> I have the below doubts on the transport method.
>>
>> 1. Does knet require any special configuration on the network level (like multicast enabling)?

No. What knet calls UDP is similar (from the user POV) to corosync's UDPU: it's a unicast transport and doesn't need any multicast configuration.
Sorry that's confusing, but it's more technically 'correct'. The main reason UDPU was called that was because it was new to corosync when the old (multicast) UDP protocol caused trouble for some people without good multicast networks.

>> 2. In corosync 2.x udp was used for multicast; in the knet transport, does udp mean multicast?

No, see above. There is no multicast transport in knet.

>> 3. Will udpu be deprecated in future?

Yes. We strongly recommend people use knet as the corosync transport as that is the one getting most development.
The old UDP/UDPU protocols will only get bugfixes. Knet provides multi-homing up to 8 links, link priorities and much more.

I wrote a paper on this when we first introduced knet into corosync which might help:
https://people.redhat.com/ccaulfie/docs/KnetCorosync.pdf

Chrissie

>> Kindly help me with these doubts.
>>
>> With Regards
>>
>> Somanath Thilak J
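Knet multi-homing is configured by giving each node one ringX_addr per link in corosync.conf; a rough illustrative fragment follows (addresses, names and priorities are assumptions, not from this thread — see corosync.conf(5) for knet_link_priority, where the highest-priority active link is used in passive mode):

```
nodelist {
    node {
        name: node1
        nodeid: 1
        ring0_addr: 192.168.1.11   # knet link 0
        ring1_addr: 10.0.0.11      # knet link 1
    }
    node {
        name: node2
        nodeid: 2
        ring0_addr: 192.168.1.12
        ring1_addr: 10.0.0.12
    }
}
totem {
    version: 2
    transport: knet
    link_mode: passive          # use one link at a time, fail over on loss
    interface {
        linknumber: 0
        knet_link_priority: 1
    }
    interface {
        linknumber: 1
        knet_link_priority: 2   # higher number = preferred link
    }
}
```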
Re: [ClusterLabs] Antw: [EXT] Re: Parsing the output of crm_mon
Also, xmllint has '--xpath' (unless you are running something as old as RHEL6) and is available on every Linux distro.

Best Regards,
Strahil Nikolov

On Mon, Mar 21, 2022 at 15:41, Ken Gaillot wrote:
On Mon, 2022-03-21 at 08:27 +0100, Ulrich Windl wrote:
> > > > Ken Gaillot wrote at 13:39 in message:
> > On Fri, 2022-03-18 at 08:46 +0100, Ulrich Windl wrote:
> > > Hi!
> > >
> > > Parsing the output of crm_mon I wonder:
> > > Is there a collection of sample outputs for pacemaker 1 and 2 formats showing all types of resources?
> >
> > Ideally, any parsing should be done on the XML output generated by --output-as=xml since 2.0.3 and --as-xml before then (the output is identical other than the outermost tag).
>
> Agreed, but it's much trickier to parse XML with awk ;-)
> Maybe it's even less efficient (unless crm_mon itself is much more efficient when outputting XML).
> With XPath support, I might be able to create the output I need using crm_mon only, but that's not implemented.
>
> Regards,
> Ulrich

xmlstarlet can search xpaths, e.g.
crm_mon -1 --output-as=xml | xmlstarlet sel -t -v "//element/@attribute"

> > The XML output is stable and only gets backward-compatible additions once in a long while, but the text output changes more frequently and significantly.
> >
> > There's an RNG schema for it, api-result.rng (where it's installed depends on your build; in the source repository, make generates it under xml/api).
> > > Also I realized that the output for clone sets is unfortunate:
> > > Consider a normal primitive like this:
> > > * primitive_name (ocf::heartbeat:agent_name): Started host-name
> > > And a clone set:
> > > * Clone Set: clone_name [primitive_name]:
> > >
> > > If you want to filter clone sets by resource agent, you're lost there.
> > > It would have been nice if the format of clone sets were:
> > > * Clone Set: clone_name [primitive_name] (ocf::heartbeat:agent_name):
> > >
> > > I see that there's the "-R" option that "expands" the clones similar to resource groups, like this:
> > > * primitive_name (ocf::heartbeat:agent): Started host-name
> > >
> > > Regards,
> > > Ulrich
--
Ken Gaillot
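Putting the two suggestions together: against a live cluster you would pipe crm_mon into xmlstarlet (or xmllint). The sketch below runs the same kind of XPath against a saved sample so it can be tried offline — the sample's structure is a simplified guess at the crm_mon XML, so check api-result.rng for the authoritative schema:

```shell
# On a live cluster (requires pacemaker + xmlstarlet):
#   crm_mon -1 --output-as=xml | xmlstarlet sel -t -v '//resource/@resource_agent' -n

# Offline demonstration with xmllint against a saved sample:
cat > /tmp/crm_mon_sample.xml <<'EOF'
<pacemaker-result api-version="2.9" request="crm_mon -1 --output-as=xml">
  <resources>
    <resource id="fence-server01" resource_agent="stonith:fence_vmware_rest" role="Started"/>
    <resource id="fence-server02" resource_agent="stonith:fence_vmware_rest" role="Started"/>
  </resources>
</pacemaker-result>
EOF

# Extract the agent of one resource by id
xmllint --xpath 'string(//resource[@id="fence-server01"]/@resource_agent)' /tmp/crm_mon_sample.xml
# prints: stonith:fence_vmware_rest
```

Because the XML output is the stable interface, a filter like this keeps working across pacemaker updates, unlike scraping the human-readable text.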
Re: [ClusterLabs] constraining multiple cloned resources to the same node
You can try creating a dummy resource and colocating all clones with it.

Best Regards,
Strahil Nikolov

On Tue, Mar 15, 2022 at 20:53, john tillman wrote:
> On 15.03.2022 19:35, john tillman wrote:
>> Hello,
>>
>> I'm trying to guarantee that all my cloned drbd resources start on the same node and I can't figure out the syntax of the constraint to do it.
>>
>> I could nominate one of the drbd resources as a "leader" and have all the others follow it. But then if something happens to that leader the others are without constraint.
>>
>
> Colocation is asymmetric. Resource B is colocated with resource A, so pacemaker decides placement of resource A first. If resource A cannot run anywhere (which is probably what you mean under "something happens to that leader"), resource B cannot run anywhere. This is true also for resources inside a resource set.
>
> I do not think pacemaker supports "always run these resources together, no matter how many resources can run".
>
Huh, no way to get all the masters to start on the same node. Interesting.
The set construct has a boolean field "require-all". I'll try that before I give up.
Could I create a resource (some systemd service) that all the masters are colocated with? Feels like a hack but would it work?
Thank you for the response.
-John
>> I tried adding them to a group but got a syntax error from pcs saying that I wasn't allowed to add cloned resources to a group.
>>
>> If anyone is interested, it started from this example:
>> https://edmondcck.medium.com/setup-a-highly-available-nfs-cluster-with-disk-encryption-using-luks-drbd-corosync-and-pacemaker-a96a5bdffcf8
>> There's a DRBD partition that gets mounted onto a local directory. The local directory is then mounted onto an exported directory (mount --bind). Then the nfs service (samba too) gets started and finally the VIP.
>> Please note that while I have 3 DRBD resources currently, that number may increase after the initial configuration is performed.
>>
>> I would just like to know a mechanism to make sure all the DRBD resources are colocated. Any suggestions welcome.
>>
>> [root@nas00 ansible]# pcs resource
>> * Clone Set: drbdShare-clone [drbdShare] (promotable):
>> * Masters: [ nas00 ]
>> * Slaves: [ nas01 ]
>> * Clone Set: drbdShareRead-clone [drbdShareRead] (promotable):
>> * Masters: [ nas00 ]
>> * Slaves: [ nas01 ]
>> * Clone Set: drbdShareWrite-clone [drbdShareWrite] (promotable):
>> * Masters: [ nas00 ]
>> * Slaves: [ nas01 ]
>> * localShare (ocf::heartbeat:Filesystem): Started nas00
>> * localShareRead (ocf::heartbeat:Filesystem): Started nas00
>> * localShareWrite (ocf::heartbeat:Filesystem): Started nas00
>> * nfsShare (ocf::heartbeat:Filesystem): Started nas00
>> * nfsShareRead (ocf::heartbeat:Filesystem): Started nas00
>> * nfsShareWrite (ocf::heartbeat:Filesystem): Started nas00
>> * nfsService (systemd:nfs-server): Started nas00
>> * smbService (systemd:smb): Started nas00
>> * vipN (ocf::heartbeat:IPaddr2): Started nas00
>>
>> [root@nas00 ansible]# pcs constraint show --all
>> Location Constraints:
>> Ordering Constraints:
>> promote drbdShare-clone then start localShare (kind:Mandatory)
>> promote drbdShareRead-clone then start localShareRead (kind:Mandatory)
>> promote drbdShareWrite-clone then start localShareWrite (kind:Mandatory)
>> start localShare then start nfsShare (kind:Mandatory)
>> start localShareRead then start nfsShareRead (kind:Mandatory)
>> start localShareWrite then start nfsShareWrite (kind:Mandatory)
>> start nfsShare then start nfsService (kind:Mandatory)
>> start nfsShareRead then start nfsService (kind:Mandatory)
>> start nfsShareWrite then start nfsService (kind:Mandatory)
>> start nfsService then start smbService (kind:Mandatory)
>> start nfsService then start vipN (kind:Mandatory)
>> Colocation Constraints:
>> localShare with drbdShare-clone (score:INFINITY) (with-rsc-role:Master)
>> localShareRead with drbdShareRead-clone (score:INFINITY) (with-rsc-role:Master)
>> localShareWrite with drbdShareWrite-clone (score:INFINITY) (with-rsc-role:Master)
>> nfsShare with localShare (score:INFINITY)
>> nfsShareRead with localShareRead (score:INFINITY)
>> nfsShareWrite with localShareWrite (score:INFINITY)
>> nfsService with nfsShare (score:INFINITY)
>> nf
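The dummy-resource suggestion from the top of this thread could be sketched roughly as below, using the drbdShare* names from the quoted configuration. This is an untested illustration, not a recommendation from the thread: note the caveat raised in the discussion still applies — if the anchor resource cannot run anywhere, everything colocated with it is blocked too:

```shell
# A lightweight Dummy resource to act as a placement anchor
pcs resource create drbd-anchor ocf:pacemaker:Dummy

# Colocate the promoted (master) role of each clone with the anchor,
# so all masters land on whichever node runs drbd-anchor
pcs constraint colocation add master drbdShare-clone with drbd-anchor INFINITY
pcs constraint colocation add master drbdShareRead-clone with drbd-anchor INFINITY
pcs constraint colocation add master drbdShareWrite-clone with drbd-anchor INFINITY
```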
Re: [ClusterLabs] Cluster timeout
You can bump the 'token' to a higher value (for example 10s) and adjust the consensus based on that value. See man 5 corosync.conf. Don't forget to sync the nodes and reload the corosync stack. Of course, proper testing on non-Prod is highly recommended.
Note: Both parameters use milliseconds (at least based on the manpage).

Best Regards,
Strahil Nikolov

On Wed, Mar 9, 2022 at 12:46, FLORAC Thierry wrote:
Hi,
I manage an active/passive PostgreSQL cluster using DRBD, LVM, Pacemaker and Corosync on a Debian GNU/Linux operating system. Everything is OK, but my platform seems to be quite "sensitive" to small network timeouts, which trigger a cluster migration from the active to the passive node. Generally, the process doesn't run through to the end: as soon as the connection is back again, the migration is cancelled and the database restarts! That should be OK, but on the application side, some database connections (on a Java WildFly server) can become "invalid"! So I would like to avoid these migrations when this kind of small timeout occurs...
So my question is: which cluster settings can I change to increase the timeout before starting a cluster migration?
Best regards,
Thierry

Thierry Florac
Resp. Pôle Architecture Applicative et Mobile
DSI - Dépt. Études et Solutions Tranverses
2, avenue de Saint-Mandé - 75570 Paris cedex 12
Tél : 01 40 19 59 64
www.onf.fr
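The sync-and-reload step mentioned above can be done with pcs (on pcs-based distributions; crm-shell setups have equivalents), which avoids a full cluster restart:

```shell
# Push the edited corosync.conf to all cluster nodes
pcs cluster sync

# Tell the running corosync daemons to re-read the configuration
pcs cluster reload corosync
```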
Re: [ClusterLabs] Q: fence_kdump and fence_kdump_send
I always used this one for triggering kdump when using sbd:https://www.suse.com/support/kb/doc/?id=19873 On Fri, Feb 25, 2022 at 21:34, Reid Wahl wrote: On Fri, Feb 25, 2022 at 3:47 AM Andrei Borzenkov wrote: > > On Fri, Feb 25, 2022 at 2:23 PM Reid Wahl wrote: > > > > On Fri, Feb 25, 2022 at 3:22 AM Reid Wahl wrote: > > > > ... > > > > > > > > So what happens most likely is that the watchdog terminates the kdump. > > > > In that case all the mess with fence_kdump won't help, right? > > > > > > You can configure extra_modules in your /etc/kdump.conf file to > > > include the watchdog module, and then restart kdump.service. For > > > example: > > > > > > # grep ^extra_modules /etc/kdump.conf > > > extra_modules i6300esb > > > > > > If you're not sure of the name of your watchdog module, wdctl can help > > > you find it. sbd needs to be stopped first, because it keeps the > > > watchdog device timer busy. > > > > > > # pcs cluster stop --all > > > # wdctl | grep Identity > > > Identity: i6300ESB timer [version 0] > > > # lsmod | grep -i i6300ESB > > > i6300esb 13566 0 > > > > > > > > > If you're also using fence_sbd (poison-pill fencing via block device), > > > then you should be able to protect yourself from that during a dump by > > > configuring fencing levels so that fence_kdump is level 1 and > > > fence_sbd is level 2. > > > > RHKB, for anyone interested: > > - sbd watchdog timeout causes node to reboot during crash kernel > > execution (https://access.redhat.com/solutions/3552201) > > What is not clear from this KB (and quotes from it above) - what > instance updates watchdog? Quoting (emphasis mine) > > --><-- > With the module loaded, the timer *CAN* be updated so that it does not > expire and force a reboot in the middle of vmcore generation. > --><-- > > Sure it can, but what program exactly updates the watchdog during > kdump execution? I am pretty sure that sbd does not run at this point. That's a valid question. 
I found this approach to work back in 2018 after a fair amount of frustration, and didn't question it too deeply at the time. The answer seems to be that the kernel does it. - https://stackoverflow.com/a/2020717 - https://stackoverflow.com/a/42589110 > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -- Regards, Reid Wahl (He/Him), RHCA Senior Software Maintenance Engineer, Red Hat CEE - Platform Support Delivery - ClusterHA ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
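The fencing-level arrangement described above (fence_kdump tried first, fence_sbd as the fallback) can be sketched with pcs. Note that "node1", "kdump-fence" and "sbd-fence" are hypothetical names for illustration, not taken from the thread:

```shell
pcs stonith level add 1 node1 kdump-fence   # level 1: wait for fence_kdump confirmation
pcs stonith level add 2 node1 sbd-fence     # level 2: poison-pill fencing via fence_sbd
pcs stonith level config                    # verify the configured levels
```

With this arrangement, fence_sbd is only attempted if fence_kdump fails, so a node busy writing a vmcore is not poison-pilled mid-dump.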
Re: [ClusterLabs] Booth ticket multi-site and quorum /Pacemaker
See man votequorum. auto_tie_breaker: 1 allows you to have quorum with 50%, yet if, for example, the A side (the node with the lowest id) dies, the B side still holds 50% but won't be able to bring back the resources, as the node with the lowest id is on the A side. If you want to avoid that, you can bring up a qdevice on a VM in a third location (even in a nearby cloud). Best Regards, Strahil Nikolov On Fri, Feb 25, 2022 at 20:10, Viet Nguyen wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
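For reference, a minimal votequorum snippet with auto_tie_breaker as described in man votequorum; the values are illustrative:

```
quorum {
    provider: corosync_votequorum
    # Keep quorum on an even split; by default the partition holding
    # the node with the lowest nodeid (the "A side" above) survives.
    auto_tie_breaker: 1
    auto_tie_breaker_node: lowest
}
```

auto_tie_breaker_node also accepts highest or an explicit list of node IDs, which lets you pick which side wins the tie instead of always favoring the lowest id.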
Re: [ClusterLabs] Antw: Re: Antw: [EXT] The 2 servers of the cluster randomly reboot almost together
Strangely, I can't see any timeouts set in the example at https://pve.proxmox.com/wiki/Fencing ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 18:54, Sebastien BASTARD wrote: Hello Strahil, I don't have the pcs software (corosync is embedded in proxmox), but I have "pvecm status":

    Cluster information
    ---
    Name:             cluster
    Config Version:   24
    Transport:        knet
    Secure auth:      on

    Quorum information
    --
    Date:             Tue Feb 22 17:52:06 2022
    Quorum provider:  corosync_votequorum
    Nodes:            2
    Node ID:          0x0003
    Ring ID:          1.5130
    Quorate:          Yes

    Votequorum information
    --
    Expected votes:   3
    Highest expected: 3
    Total votes:      3
    Quorum:           2
    Flags:            Quorate Qdevice

    Membership information
    --
        Nodeid    Votes    Qdevice    Name
        0x0001        1    A,V,NMW    serverA
        0x0003        1    A,V,NMW    serverB (local)
        0x            1               Qdevice

Hope you can find the kind of fencing. Best regards. On Tue, Feb 22, 2022 at 17:40, Strahil Nikolov wrote: fencing is the reboot mechanism; check "pcs status". Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 16:44, Sebastien BASTARD wrote: Hello Strahil, As I don't know the kind of fencing, here is the current configuration of corosync:

    logging {
        debug: off
        to_syslog: yes
    }
    nodelist {
        node {
            name: serverA
            nodeid: 1
            quorum_votes: 1
            ring0_addr: xx.xx.xx.xx
        }
        node {
            name: serverB
            nodeid: 3
            quorum_votes: 1
            ring0_addr: xx.xx.xx.xx
        }
    }
    quorum {
        device {
            model: net
            net {
                algorithm: ffsplit
                host: xx.xx.xx.xx
                tls: on
            }
            votes: 1
        }
        provider: corosync_votequorum
    }
    totem {
        cluster_name: cluster
        config_version: 24
        interface {
            linknumber: 0
        }
        ip_version: ipv4-6
        link_mode: passive
        secauth: on
        version: 2
        token_retransmits_before_loss_const: 40
        token: 3
    }

Best regards. On Tue, Feb 22, 2022 at 14:29, Strahil Nikolov wrote: What kind of fencing are you using ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:24, Sebastien BASTARD wrote: Hello Strahil Nikolov, Qdevice is not a VM. It is a physical Debian Linux server. Best regards. On Tue, Feb 22, 2022 at 14:20, Strahil Nikolov wrote: Is the qdevice on a VM ?
Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:03, Sebastien BASTARD wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ -- Sébastien BASTARD Ingénieur R | Domalys • Créateurs d’autonomie | phone : +33 5 49 83 00 08 | site : www.domalys.com | email : sebast...@domalys.com | address : 58 Rue du Vercors 86240 Fontaine-Le-Comte ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: Re: Antw: [EXT] The 2 servers of the cluster randomly reboot almost together
fencing is the reboot mechanism; check "pcs status". Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 16:44, Sebastien BASTARD wrote: Hello Strahil, As I don't know the kind of fencing, here is the current configuration of corosync:

    logging {
        debug: off
        to_syslog: yes
    }
    nodelist {
        node {
            name: serverA
            nodeid: 1
            quorum_votes: 1
            ring0_addr: xx.xx.xx.xx
        }
        node {
            name: serverB
            nodeid: 3
            quorum_votes: 1
            ring0_addr: xx.xx.xx.xx
        }
    }
    quorum {
        device {
            model: net
            net {
                algorithm: ffsplit
                host: xx.xx.xx.xx
                tls: on
            }
            votes: 1
        }
        provider: corosync_votequorum
    }
    totem {
        cluster_name: cluster
        config_version: 24
        interface {
            linknumber: 0
        }
        ip_version: ipv4-6
        link_mode: passive
        secauth: on
        version: 2
        token_retransmits_before_loss_const: 40
        token: 3
    }

Best regards. On Tue, Feb 22, 2022 at 14:29, Strahil Nikolov wrote: What kind of fencing are you using ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:24, Sebastien BASTARD wrote: Hello Strahil Nikolov, Qdevice is not a VM. It is a physical Debian Linux server. Best regards. On Tue, Feb 22, 2022 at 14:20, Strahil Nikolov wrote: Is the qdevice on a VM ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:03, Sebastien BASTARD wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ -- Sébastien BASTARD Ingénieur R | Domalys • Créateurs d’autonomie | phone : +33 5 49 83 00 08 | site : www.domalys.com | email : sebast...@domalys.com | address : 58 Rue du Vercors 86240 Fontaine-Le-Comte ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: Re: Antw: [EXT] The 2 servers of the cluster randomly reboot almost together
What kind of fencing are you using ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:24, Sebastien BASTARD wrote: Hello Strahil Nikolov, Qdevice is not a VM. It is a physical Debian Linux server. Best regards. On Tue, Feb 22, 2022 at 14:20, Strahil Nikolov wrote: Is the qdevice on a VM ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:03, Sebastien BASTARD wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: Re: Antw: [EXT] The 2 servers of the cluster randomly reboot almost together
Is the qdevice on a VM ? Best Regards, Strahil Nikolov On Tue, Feb 22, 2022 at 15:03, Sebastien BASTARD wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Help with PostgreSQL Automatic Failover demotion
Also, there is a way to tell the cluster to clean up failures -> failure-timeout. Best Regards, Strahil Nikolov On Sat, Feb 19, 2022 at 1:52, Jehan-Guillaume de Rorthais wrote: Hello, On Fri, 18 Feb 2022 21:44:58 + "Larry G. Mills" wrote: > ... This happened again recently, and the running primary DB was demoted and > then re-promoted to be the running primary. What I'm having trouble > understanding is why the running Master/primary DB was demoted. After the > monitor operation timed out, the failcount for the ha-db resource was still > less than the configured "migration-threshold", which is set to 5. Because "migration-threshold" is the limit before the resource is moved away from the node. As long as your failcount is less than "migration-threshold" and the failure is not fatal, the cluster will keep the resource on the same node and try to "recover" it by running a full restart: demote -> stop -> start -> promote. Since 2.0, the recover action can be demote -> promote. See the "on-fail" property and the detail about it below the table: https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#operation-properties Regards, ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
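The failure-timeout mentioned above is a resource meta-attribute. A pcs sketch, reusing the thread's "ha-db" resource name with an illustrative timeout value:

```shell
# Expire recorded failures after 10 minutes, so old failures stop
# counting toward migration-threshold (the 600s value is illustrative).
pcs resource update ha-db meta failure-timeout=600s

# Accumulated failures can also be cleared manually at any time:
pcs resource cleanup ha-db
```

With failure-timeout set, a single transient monitor timeout ages out on its own instead of accumulating toward the migration-threshold of 5.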
Re: [ClusterLabs] The 2 servers of the cluster randomly reboot almost together
Token timeout -> network issue ? Just run a continuous ping (with timestamp) and log it into a file (from each host to the other host + the qdevice IP). Best Regards, Strahil Nikolov On Thu, Feb 17, 2022 at 11:38, Sebastien BASTARD wrote: Hello CoroSync's team ! We currently have a proxmox cluster with 2 servers (at different providers and in different cities) and another server, in our company, with qdevice. Schematic:

    (A) Proxmox Server A (Provider One) -- (B) Proxmox Server B (Provider Two)
                  |                                     |
                  \--- (C) Qdevice on Debian server (in the company) ---/

Between each server, we have approximately 50 ms of latency. Between servers A and B, each virtual server is synchronized every 5 minutes, so if a server stops working, the second server starts the same virtual server. We don't need High Availability. We can wait 5 minutes without services. After this delay, the virtual machine must start on the other server if the first server does not work anymore. With the corosync default configuration, fencing occurred on the servers randomly (on average every 4-5 days), so we modified the configuration with this (bold text is our modification):

    logging {
        debug: off
        to_syslog: yes
    }
    nodelist {
        node {
            name: serverA
            nodeid: 1
            quorum_votes: 1
            ring0_addr: xx.xx.xx.xx
        }
        node {
            name: serverB
            nodeid: 3
            quorum_votes: 1
            ring0_addr: xx.xx.xx.xx
        }
    }
    quorum {
        device {
            model: net
            net {
                algorithm: ffsplit
                host: xx.xx.xx.xx
                tls: on
            }
            votes: 1
        }
        provider: corosync_votequorum
    }
    totem {
        cluster_name: cluster
        config_version: 24
        interface {
            linknumber: 0
        }
        ip_version: ipv4-6
        link_mode: passive
        secauth: on
        version: 2
        token_retransmits_before_loss_const: 40
        token: 3
    }

With this configuration, fencing of the servers continues, but on average every 15 days. Our current problem is that when fencing occurs on one server, the second server shows the same behaviour some minutes later... and every time. I tested the cluster with a power cut of server A, and everything worked great. Server B started the virtual machines of server A. But in real life, when a server can't talk to the other main server, it seems that the two servers each believe they are isolated from the other. So, after a lot of tests, I don't know the best way to have a cluster that works correctly. Currently, the cluster stops working more often than the servers have a real problem. Maybe my configuration is not good, or something else? So, I need your help =) Here are the kernel logs of the reboot of server A (result of the command << cat /var/log/daemon.log | grep -E 'watchdog|corosync' >>):

    ...
    Feb 16 09:55:00 serverA corosync[2762]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
    Feb 16 09:55:00 serverA corosync[2762]: [KNET ] host: host: 3 has no active links
    Feb 16 09:55:22 serverA corosync[2762]: [TOTEM ] Token has not been received in 22500 ms
    Feb 16 09:55:30 serverA corosync[2762]: [TOTEM ] A processor failed, forming new configuration: token timed out (3ms), waiting 36000ms for consensus.
    Feb 16 09:55:38 serverA corosync[2762]: [KNET ] rx: host: 3 link: 0 is up
    Feb 16 09:55:38 serverA corosync[2762]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
    Feb 16 09:55:55 serverA watchdog-mux[1890]: client watchdog expired - disable watchdog updates
    Reboot

Here are the kernel logs of the reboot of server B (result of the same command):

    Feb 16 09:48:42 serverB corosync[2728]: [KNET ] link: host: 1 link: 0 is down
    Feb 16 09:48:42 serverB corosync[2728]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
    Feb 16 09:48:42 serverB corosync[2728]: [KNET ] host: host: 1 has no active links
    Feb 16 09:48:57 serverB corosync[2728]: [KNET ] rx: host: 1 link: 0 is up
    Feb 16 09:48:57 serverB corosync[2728]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
    Feb 16 09:53:56 serverB corosync[2728]: [KNET ] link: host: 1 link: 0 is down
    Feb 16 09:53:56 serverB corosync[2728]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
    Feb 16 09:53:56 serverB corosync[2728]: [KNET ] host: host: 1 has no active links
    Feb 16 09:54:12 serverB corosync[2728]: [KNET ] rx: host: 1 link: 0 is up
    Feb 16 09:54:12 serverB corosync[2728]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
    Feb 16 09:55:22 serverB corosync[2728]: [TOTEM ] Token has not been received in 22500 ms
    Feb 16 09:55:30 serverB corosync[2728]: [TOTEM ] A processor failed, forming new configuration: token timed out (3ms), waiting 36000ms for consensus.
    Feb 16 09:55:35 serverB corosync[2728]: [KNET ] li
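The timestamped-ping suggestion above can be scripted as a small shell function; the target IP and log path are up to you:

```shell
#!/bin/sh
# Prefix every line of ping output with a timestamp; redirect to a file
# to keep a log. An optional second argument limits the number of
# probes (omit it to run until interrupted).
ping_log() {
    target=$1
    count=${2:+-c $2}
    ping $count "$target" | while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%F %T')" "$line"
    done
}

# Example: ping_log 192.0.2.10 >> /var/log/ping-peer.log &
```

Run one instance per peer (other node and qdevice) on each host; a gap or burst of missed replies in the log around 09:55 would confirm a network outage rather than a corosync problem.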
Re: [ClusterLabs] Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?
To be honest, I always check https://documentation.suse.com/sle-ha/15-SP3/html/SLE-HA-all/cha-ha-storage-protect.html#sec-ha-storage-protect-watchdog-timings for sbd and timings. Best Regards, Strahil Nikolov On Wed, Feb 16, 2022 at 19:31, Klaus Wenninger wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] VirtualDomain + GlusterFS - troubles coming with CentOS 9
I haven't heard about removal of libgfapi, so most probably it's a packaging issue. The FUSE mount point can be set up via a cloned Filesystem resource; there should be no problems with it, and live migration should work. Best Regards, Strahil Nikolov On Tue, Feb 15, 2022 at 19:16, lejeczek via Users wrote: Hi guys. With CentOS 9's packages & binaries, libgfapi is removed from libvirt/qemu, which means that if you want to use GlusterFS for VM image storage you have to expose its volumes via an FS mount point - that is how I understand these changes - which seems to cause quite a problem for HA. Maybe it's only my setup, yet I'd think it'll be the only way forward for everybody here - if on Gluster - having a simple mount point and VM resources running off it causes the HA cluster to fail to live-migrate, but only at system reboot, and makes such a reboot/shutdown take "ages". Anybody sees the same or a similar problem? many thanks, L. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
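The cloned Filesystem resource suggested above might look like this with pcs; the volume source, mount point, and resource names are placeholders, not from the thread:

```shell
# Sketch: mount a GlusterFS volume via FUSE on every node with a cloned
# Filesystem resource. "gluster1:/vmstore" and "/var/lib/vmstore" are
# hypothetical -- substitute your volume and mount point.
pcs resource create vmstore_fs ocf:heartbeat:Filesystem \
    device="gluster1:/vmstore" directory="/var/lib/vmstore" \
    fstype="glusterfs" clone interleave=true

# Start a VirtualDomain resource only where/after the mount is up:
pcs constraint order start vmstore_fs-clone then my_vm
pcs constraint colocation add my_vm with vmstore_fs-clone
```

Because the clone keeps the mount available on both the source and destination node, live migration of the VM does not require unmounting anything first.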
Re: [ClusterLabs] Antw: [EXT] Cluster Removing VIP and Not Following Order Constraint
Ah, it's HANA. The last HANA setup I did had something like this: colocation constraint -> VIP with Master HANA; order constraint -> first the HANA clone (don't specify the master role) -> then the IP. That way, when the standby HANA joins and the master is demoted (kind of challenged) and afterwards the same old primary is promoted back, the IP never disappeared, while it always started on the correct side (where the master is). Best Regards, Strahil Nikolov On Fri, Feb 11, 2022 at 10:38, Jonno wrote: Hello all, Thank you for your assistance. Below is the config from my lab environment. By the way, I just tried Strahil's suggestions, but it didn't seem to have any effect on the behaviour. Regards, Jonathan

node 1: senzhana3 \
    attributes hana_abc_op_mode=logreplay hana_abc_vhost=senzhana3 hana_abc_site=SITEA hana_abc_srmode=sync lpa_abc_lpt=10 hana_abc_remoteHost=senzhana4
node 2: senzhana4 \
    attributes lpa_abc_lpt=1644568482 hana_abc_op_mode=logreplay hana_abc_vhost=senzhana4 hana_abc_site=SITEB hana_abc_srmode=sync hana_abc_remoteHost=senzhana3
primitive rsc_SAPHanaTopology_ABC_HDB96 ocf:suse:SAPHanaTopology \
    operations $id=rsc_sap2_ABC_HDB96-operations \
    op monitor interval=10 timeout=600 \
    op start interval=0 timeout=600 \
    op stop interval=0 timeout=300 \
    params SID=ABC InstanceNumber=96
primitive rsc_SAPHana_ABC_HDB96 ocf:suse:SAPHana \
    operations $id=rsc_sap_ABC_HDB96-operations \
    op start interval=0 timeout=3600 \
    op stop interval=0 timeout=3600 \
    op promote interval=0 timeout=3600 \
    op monitor interval=60 role=Master timeout=700 \
    op monitor interval=61 role=Slave timeout=700 \
    params SID=ABC InstanceNumber=96 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
primitive rsc_hsr_quiesce lsb:hsr_quiesce
primitive rsc_hsr_resume lsb:hsr_resume
primitive rsc_ip_ABC_HDB96 IPaddr2 \
    operations $id=rsc_ip_ABC_HDB96-operations \
    op monitor interval=10s timeout=20s \
    params ip=192.168.178.166
primitive stonith-sbd stonith:external/sbd \
    params pcmk_delay_max=30 \
    meta target-role=Started
ms msl_SAPHana_ABC_HDB96 rsc_SAPHana_ABC_HDB96 \
    meta clone-max=2 clone-node-max=1 interleave=true target-role=Master
clone cln_SAPHanaTopology_ABC_HDB96 rsc_SAPHanaTopology_ABC_HDB96 \
    meta clone-node-max=1 interleave=true
location cli-prefer-rsc_hsr_quiesce rsc_hsr_quiesce role=Started inf: senzhana3
location cli-prefer-rsc_ip_ABC_HDB96 rsc_ip_ABC_HDB96 role=Started inf: senzhana4
colocation col_saphana_ip_ABC_HDB96 2000: rsc_ip_ABC_HDB96:Started msl_SAPHana_ABC_HDB96:Master rsc_hsr_quiesce rsc_hsr_resume
order ord_SAPHana_ABC_HDB96 Optional: cln_SAPHanaTopology_ABC_HDB96 msl_SAPHana_ABC_HDB96
order ord_failover_ABC_HDB96 rsc_hsr_quiesce rsc_ip_ABC_HDB96 msl_SAPHana_ABC_HDB96:promote rsc_hsr_resume
property cib-bootstrap-options: \
    have-watchdog=true \
    dc-version="2.0.4+20200616.2deceaa3a-3.12.1-2.0.4+20200616.2deceaa3a" \
    cluster-infrastructure=corosync \
    cluster-name=hacluster \
    stonith-enabled=true \
    stonith-action=reboot \
    stonith-timeout=150s \
    last-lrm-refresh=1644567089
rsc_defaults rsc-options: \
    resource-stickiness=1000 \
    migration-threshold=5000
op_defaults op-options: \
    timeout=600 \
    record-pending=true \
    no-quorum-policy=ignore

On Fri, 11 Feb 2022 at 21:29, Klaus Wenninger wrote: On Fri, Feb 11, 2022 at 9:13 AM Strahil Nikolov via Users wrote: Shouldn't you use kind 'Mandatory' and symmetrical TRUE ? If true, the reverse of the constraint applies for the opposite action (for example, if B starts after A starts, then B stops before A stops). If the script should be run before any change, then it sounds as if an asymmetric order would be desirable. So you might create at least two order constraints explicitly listing the actions. But I doubt that this explains the unexpected behavior described. As Ulrich said, a little bit more info about the config would be helpful.
Regards, Klaus Best Regards, Strahil Nikolov On Fri, Feb 11, 2022 at 9:11, Ulrich Windl wrote: >>> Jonno schrieb am 10.02.2022 um 20:43 in Nachricht : > Hello, > > I am having some trouble getting my 2 node active/passive cluster to do > what I want. More specifically, my cluster is removing the VIP from the > cluster whenever I attempt a failover with a command such as "crm resource > move rsc_cluster_vip node2". > > When running the command above, I am asking the cluster to migrate the VIP > to the standby node, but I am expecting the cluster to honour the order > constraint, by first running the script resource named "rsc_lsb_quiesce". > The order constraint looks like: > > "order order_ABC rsc_lsb_quiesce rsc_cluster_vip msl_ABC:promo
Re: [ClusterLabs] Antw: [EXT] Cluster Removing VIP and Not Following Order Constraint
Shouldn't you use kind 'Mandatory' and symmetrical TRUE ? If true, the reverse of the constraint applies for the opposite action (for example, if B starts after A starts, then B stops before A stops). Best Regards, Strahil Nikolov On Fri, Feb 11, 2022 at 9:11, Ulrich Windl wrote: >>> Jonno schrieb am 10.02.2022 um 20:43 in Nachricht : > Hello, > > I am having some trouble getting my 2 node active/passive cluster to do > what I want. More specifically, my cluster is removing the VIP from the > cluster whenever I attempt a failover with a command such as "crm resource > move rsc_cluster_vip node2". > > When running the command above, I am asking the cluster to migrate the VIP > to the standby node, but I am expecting the cluster to honour the order > constraint, by first running the script resource named "rsc_lsb_quiesce". > The order constraint looks like: > > "order order_ABC rsc_lsb_quiesce rsc_cluster_vip msl_ABC:promote > rsc_lsb_resume" > > But it doesn't seem to do what I expect. It always removes the VIP entirely > from the cluster first, then it starts to follow the order constraint. This > means my cluster is in a state where the VIP is completely gone for a > couple of minutes. I've also tried doing a "crm resource move > rsc_lsb_quiesce > node2" hoping to trigger the script resource first, but the cluster always > removes the VIP before doing anything. > > My question is: How can I make the cluster follow this order constraint? I I'm very sure you just made a configuration mistake. But nobody can help you unless you show your configuration and an example execution of events, plus the expected order of execution. Regards, Ulrich > need the cluster to run the "rsc_lsb_quiesce" script against a remote > application server before any other action is taken. I especially need the > VIP to stay where it is. Should I be doing this another way?
> > Regards, > Jonathan ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
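In crm shell syntax, the "two explicit order constraints" idea from Klaus's reply might look like the sketch below. The resource names come from the thread, but the kind and symmetrical values are illustrative, not a verified fix:

```
# Run the quiesce script before the VIP starts on the new node, without
# forcing the reverse order on stop (asymmetrical ordering):
order ord_quiesce_before_vip Mandatory: rsc_lsb_quiesce:start rsc_cluster_vip:start symmetrical=false
order ord_vip_before_promote Mandatory: rsc_cluster_vip:start msl_ABC:promote symmetrical=false
```

Listing the actions explicitly (:start, :promote) avoids the implied stop-side ordering that a single symmetrical constraint creates.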
Re: [ClusterLabs] what is the "best" way to completely shutdown a two-node cluster ?
One drawback of that approach is that adding the resource stop command will also prevent the resources from starting once the UPS gets enough power and starts the servers. Of course, a script in cron (@reboot) or in systemd can overcome that. Best Regards, Strahil Nikolov On Thu, Feb 10, 2022 at 11:15, Jehan-Guillaume de Rorthais wrote: On Wed, 9 Feb 2022 17:42:35 + (UTC) Strahil Nikolov via Users wrote: > If you gracefully shutdown a node - pacemaker will migrate all resources away > so you need to shut them down simultaneously and all resources should be > stopped by the cluster. > > Shutting down the nodes would be my choice. If you want to gracefully shutdown your cluster, then you can add one manual step to first gracefully stop your resources instead of betting the cluster will do the good things. As far as I remember, there's no way the DC/CRM can orchestrate the whole cluster shutdown gracefully in the same transition. So I prefer standing on the safe side and add one step to my procedure. I even add it with commands like `pcs cluster stop --all` which tries to shutdown all the nodes "kind of" at the same time everywhere. At least, I know where my resources were stopped and how they will start. It might be important when you deal with e.g. permanent promotion scores. ++ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
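The cron/systemd workaround mentioned above could look roughly like this on a pcs-based cluster; the unit name is made up for illustration:

```
# /etc/systemd/system/cluster-autostart.service (hypothetical unit name)
[Unit]
Description=Start the local cluster stack after boot (e.g. UPS power restored)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/pcs cluster start

[Install]
WantedBy=multi-user.target
```

A cron alternative is a root crontab entry such as: @reboot /usr/sbin/pcs cluster start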
Re: [ClusterLabs] what is the "best" way to completely shutdown a two-node cluster ?
If you gracefully shutdown a node, pacemaker will migrate all resources away, so you need to shut the nodes down simultaneously, and all resources should then be stopped by the cluster. Shutting down the nodes would be my choice. Best Regards, Strahil Nikolov On Wed, Feb 9, 2022 at 12:52, Lentes, Bernd wrote: - On Feb 9, 2022, at 11:26 AM, Jehan-Guillaume de Rorthais j...@dalibo.com wrote: > > I'm not sure how "crm resource stop " actually stops a resource. I thought > it would set "target-role=Stopped", but I might be wrong. > > If "crm resource stop" actually uses "target-role=Stopped", I believe the > resources would not start automatically after setting > "stop-all-resources=false". > ha-idg-2:~ # crm resource help stop Stop resources Stop one or more resources using the target-role attribute. If there are multiple meta attributes sets, the attribute is set in all of them. If the resource is a clone, all target-role attributes are removed from the children resources. For details on group management see options manage-children. Usage: stop [ ...] Bernd ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Removing a resource without stopping it
I know... and the editor stuff can be bypassed, if the approach works. Best Regards, Strahil Nikolov On Sat, Jan 29, 2022 at 15:43, Digimer wrote: On 2022-01-29 03:16, Strahil Nikolov wrote: I think there is pcs cluster edit --scope=resources (based on memory). Can you try to delete it from there ? Best Regards, Strahil Nikolov Thanks, but no, that doesn't seem to work. 'pcs cluster edit' wants to open an editor, and I'm trying to find a way to make this change with a program (once I sort out the manual process). So an option that requires user input won't work in my case regardless. Thank you just the same though! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Removing a resource without stopping it
I think there is pcs cluster edit --scope=resources (based on memory). Can you try to delete it from there ? Best Regards, Strahil Nikolov On Sat, Jan 29, 2022 at 7:12, Digimer wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
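Since `pcs cluster edit` insists on an interactive editor, a scripted workflow typically pulls the CIB, edits the XML programmatically, and pushes it back. A sketch, with "my_res" as a hypothetical resource ID; note that deleting a managed resource this way will normally stop it, so keeping the service running would additionally require unmanaging the resource or setting stop-orphan-resources=false:

```shell
# Pull the current CIB to a file, remove one primitive, push it back.
pcs cluster cib > /tmp/cib.xml

python3 - <<'EOF'
import xml.etree.ElementTree as ET
tree = ET.parse("/tmp/cib.xml")
# Walk the tree and detach the primitive with id "my_res" (hypothetical).
for parent in tree.iter():
    for child in list(parent):
        if child.tag == "primitive" and child.get("id") == "my_res":
            parent.remove(child)
tree.write("/tmp/cib-new.xml")
EOF

pcs cluster cib-push /tmp/cib-new.xml
```

Pushing an edited copy rather than live-editing also gives you a reviewable diff of exactly what changed.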
Re: [ClusterLabs] heads up: Possible VM data corruption upgrading to SLES15 SP3
Are you using HA-LVM or CLVM ? Best Regards, Strahil Nikolov On Thu, Jan 27, 2022 at 16:10, Ulrich Windl wrote: Hi! I know this is semi-offtopic, but I think it's important: I've upgraded one cluster node, being a Xen host, from SLES15 SP2 to SLES15 SP3 using virtual DVD boot (i.e. the upgrade environment is loaded from that DVD). Watching the syslog while Yast was searching for systems to upgrade, I noticed that it tried to mount _every_ disk read-only. (We use multipathed FC SAN disks that are attached as block devices to the VMs, so they look like "normal" disks.) On my first attempt I did not enable multipath, as it is not needed to upgrade the OS (the system VG is single-pathed), but then LVM complained about multiple disks having the same ID. On the second attempt I did activate multipathing, but then Yast mounted every disk and tried to assemble every MDRAID it found, even if that was on shared storage and thus actively in use by the other cluster nodes. To make things worse, even when mounting read-only, XFS (for example) tried to "recover" a filesystem when it thinks it is dirty. I found no way to avoid that mounting (a support case at SUSE is in progress). Fortunately, if the VMs have been running for a significant time, most blocks are cached inside the VM, and blocks are "mostly written" instead of being read. So most likely the badly recovered blocks are overwritten with good data before the machine reboots and the bad blocks would be read. The most obvious "solution" - stopping every VM on the whole cluster before upgrading a single node - is not very HA-like, unfortunately. Any better ideas, anyone? Regards, Ulrich ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Re: Feedback wanted: Native language support for Pacemaker help output
I thought we got HAWK in SUSE and pcsd's web UI in RHEL for the less-experienced admins. Anyway, it's my personal opinion... Off-topic: Automation module(s) (for example Ansible, SALT or Terraform) would bring a bigger benefit to the project. I was always hoping to find something idempotent and build all clusters from a single role/formula, absolutely perfect and without missing components, packages or settings. Best Regards, Strahil Nikolov On Thursday, 13 January 2022 at 19:35:57 GMT+2, Ken Gaillot wrote: I think the use case is where senior admins can do what you're talking about, but less-experienced admins need to run cluster commands occasionally. On Tue, 2022-01-11 at 12:17 +, Strahil Nikolov via Users wrote: > To be honest, I don't see any benefit. > Even if you have the stack translated, when a more complex setup is > needed -> you will always have to search in the source/github > issues/documentation/mailing list history and rely on English. > > Best Regards, > Strahil Nikolov > > > On Tue, Jan 11, 2022 at 9:23, Ulrich Windl > > wrote: > > >>> Ken Gaillot schrieb am 10.01.2022 um > > 22:01 in > > Nachricht > > <14b79b428e0cc1ab4c5c882f0efcae3c221d9b2d.ca...@redhat.com>: > > > Re‑raising this due to the recent holidays ... > > > > > > Is translation of Pacemaker option help and man pages something > > people > > > would like to see? > > > > Interestingly, some big companies frequently offer these languages: > > English, French, Chinese, Japanese > > > > My guess is that those are mainly there because the people refuse > > English for > > some reason or the other. > > Specifically note that German is not in that list. > > > > > > > > Would anyone be willing to contribute or proofread translations > > if the > > > tools were easy? > > > > I helped with a German translation of an Android app once, and I > > realized > > that it was much more work than I had assumed originally... 
> > > > Regards, > > Ulrich > > > > > > > > On Fri, 2021‑12‑03 at 15:02 ‑0600, Ken Gaillot wrote: > > >> Hi all, > > >> > > >> A user has graciously submitted a pull request to demonstrate > > native > > >> language support for Pacemaker help output: > > >> > > >> https://github.com/ClusterLabs/pacemaker/pull/2564 > > >> > > >> Long ago, we had a few translations of "Clusters from Scratch" > > and > > >> "Pacemaker Explained", but those proved to be too large and > > >> frequently > > >> changing to be maintainable. > > >> > > >> Log messages also change frequently, making translations > > difficult to > > >> maintain. Not to mention, commercial support personnel will not > > >> always > > >> be able to read the native languages of their users. > > >> > > >> However, help output (cluster property descriptions, man pages, > > etc.) > > >> is seen only by the end user, and doesn't change as often. The > > pull > > >> request implements a Chinese translation of some cluster option > > help > > >> as > > >> a proof‑of‑concept using GNU gettext. > > >> > > >> Since this would be a substantial change and require additional > > >> maintenance, I'd like to get as much feedback as possible. Reply > > to > > >> this email, or comment on the PR if you're interested in more > > >> technical > > >> details. > > >> > > >> If we implement this, we'd need volunteers for making and > > >> proofreading > > >> translations. Ideally, we'd also sign up with an online service > > that > > >> provides a friendly web interface for translations, but with > > this > > >> initial proof‑of‑concept, it involves github pull requests and > > >> reviews. > > >> > > >> Thoughts? 
> > > ‑‑ > > > Ken Gaillot > > > > > > ___ > > > Manage your subscription: > > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > > > ClusterLabs home: https://www.clusterlabs.org/ > > > > > > > > > > ___ > > Manage your subscription: > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > ClusterLabs home: https://www.clusterlabs.org/ > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -- Ken Gaillot ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Re: Feedback wanted: Native language support for Pacemaker help output
To be honest, I don't see any benefit.Even if you have the stack translated, when a more complex setup is needed -> you will always have to search in the source/github issues/documentation/mailing list history and rely on English. Best Regards,Strahil Nikolov On Tue, Jan 11, 2022 at 9:23, Ulrich Windl wrote: >>> Ken Gaillot schrieb am 10.01.2022 um 22:01 in Nachricht <14b79b428e0cc1ab4c5c882f0efcae3c221d9b2d.ca...@redhat.com>: > Re‑raising this due to the recent holidays ... > > Is translation of Pacemaker option help and man pages something people > would like to see? Interestingly some big companies frequently offer these languages: Englich, French, Chinese, Japanese My guess is that those are mainly there because the people refuse English for some reason or the other. Specifically note that German is not in that list. > > Would anyone be willing to contribute or proofread translations if the > tools were easy? I helped with a German translation in some Android app once, and I realized that it was much more work than I had assumed originally... Regards, Ulrich > > On Fri, 2021‑12‑03 at 15:02 ‑0600, Ken Gaillot wrote: >> Hi all, >> >> A user has graciously submitted a pull request to demonstrate native >> language support for Pacemaker help output: >> >> https://github.com/ClusterLabs/pacemaker/pull/2564 >> >> Long ago, we had a few translations of "Clusters from Scratch" and >> "Pacemaker Explained", but those proved to be too large and >> frequently >> changing to be maintainable. >> >> Log messages also change frequently, making translations difficult to >> maintain. Not to mention, commercial support personnel will not >> always >> be able to read the native languages of their users. >> >> However, help output (cluster property descriptions, man pages, etc.) >> is seen only by the end user, and doesn't change as often. The pull >> request implements a Chinese translation of some cluster option help >> as >> a proof‑of‑concept using GNU gettext. 
>> >> Since this would be a substantial change and require additional >> maintenance, I'd like to get as much feedback as possible. Reply to >> this email, or comment on the PR if you're interested in more >> technical >> details. >> >> If we implement this, we'd need volunteers for making and >> proofreading >> translations. Ideally, we'd also sign up with an online service that >> provides a friendly web interface for translations, but with this >> initial proof‑of‑concept, it involves github pull requests and >> reviews. >> >> Thoughts? > ‑‑ > Ken Gaillot > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
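A minimal sketch of the gettext mechanism discussed in this thread — this is not Pacemaker's actual code, and the text domain and help string below are hypothetical; it only illustrates the key property mentioned above: with no compiled catalog installed, marked help strings fall back to the original English.

```python
import gettext

# Bind a (hypothetical) text domain. fallback=True means that when no
# compiled .mo catalog exists for the user's locale, a NullTranslations
# object is used and the original English msgid is returned unchanged --
# exactly what end users without a translation would see.
t = gettext.translation("pacemaker-options",
                        localedir="/usr/share/locale",
                        fallback=True)
_ = t.gettext

# A help string marked for translation; with no catalog installed,
# this prints the English original.
print(_("What to do when the cluster does not have quorum"))
```

The same `_()`-marking convention is what GNU gettext tooling (xgettext, msgfmt) extracts and compiles in C projects as well.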
Re: [ClusterLabs] Which version of pacemaker/corosync provides crm_feature_set 3.0.10?
Have you tried with a Fedora package from the archives? I found https://dl.fedoraproject.org/pub/archive/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pacemaker-1.1.13-3.fc23.x86_64.rpm & https://dl.fedoraproject.org/pub/archive/fedora/linux/releases/23/Everything/x86_64/os/Packages/c/corosync-2.3.5-1.fc23.x86_64.rpm which theoretically should be close enough. P.S.: I couldn't find those versions for Fedora 22, but they seem available for F23. Best Regards, Strahil Nikolov В вторник, 23 ноември 2021 г., 21:11:58 Гринуич+2, vitaly написа: Hello, I am working on the upgrade from older version of pacemaker/corosync to the current one. In the interim we need to sync newly installed node with the node running old software. Our old node uses pacemaker 1.1.13-3.fc22 and corosync 2.3.5-1.fc22 and has crm_feature_set 3.0.10. For interim sync I used pacemaker 1.1.18-2.fc28 and corosync 2.4.4-1.fc28. This version is using crm_feature_set 3.0.14. This version is working fine, but it has issues in some edge cases, like when the new node starts alone and then the old one tries to join. So I need to rebuild rpms for crm_feature_set 3.0.10. This will be used just once and then it will be upgraded to the latest versions of pacemaker and corosync. Now, couple of questions: 1. Which rpm defines crm_feature_set? 2. Which version of this rpm has crm_feature_set 3.0.10? 3. Where could I get source rpms to rebuild this rpm on CentOs 8? Thanks a lot! _Vitaly Zolotusky ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] resource start after network reconnected
You are right, but usually when the SBD disk has failed, I always focus on recovering it as soon as possible. Once the disk is recovered and the watcher detects it back - shutting down is possible. And of course disk-based sbd is better than nothing. Best Regards,Strahil Nikolov On Sun, Nov 21, 2021 at 8:47, Andrei Borzenkov wrote: On 21.11.2021 00:39, Strahil Nikolov via Users wrote: > Nope, as long as you use SBD's integration with pacemaker. As the 2 nodes can > communicate between each other sbd won't act. I thinkt it was an entry like > this in the /etc/sysconfig/sbd: 'SBD_PACEMAKER=yes' > That's correct except it is impossible to stop pacemaker on one node under this condition because the remaining node will immediately commit suicide. It is not even possible to perform normal cluster shutdown. I wish SBD supported "deactivate" message to stop pretending that it knows better than administrator or - even better - understood that pacemaker is stopping intentionally. Currently there is no way around it (short of pkill -9 sbd) because systemd unit refuses manual SBD stop. > > On Sat, Nov 20, 2021 at 23:24, Valentin Vidić via >Users wrote: On Sat, Nov 20, 2021 at 08:33:26PM +, >Strahil Nikolov via Users wrote: >> You can also use this 3rd node to provide iSCSI and then the SBD will >> be disk-full :D . The good thing about this type of setup is that you >> do won't need to put location constraints for the 3rd node. > > Wouldn't that make the iSCSI node a SPOF? If the iSCSI goes down, SBD > resets both cluster nodes. > > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
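For reference, a sketch of the /etc/sysconfig/sbd entries being discussed — the device and watchdog paths are placeholders, not recommendations:

```
# /etc/sysconfig/sbd (sketch; paths are placeholders)
SBD_DEVICE="/dev/disk/by-id/my-shared-lun"  # shared disk(s); omit for diskless SBD
SBD_PACEMAKER=yes                           # let sbd consult Pacemaker/CIB quorum state
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_STARTMODE=always
```

With SBD_PACEMAKER=yes, as noted above, sbd will not self-fence on loss of the disk alone while the node still has cluster quorum.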
Re: [ClusterLabs] resource start after network reconnected
Nope, as long as you use SBD's integration with pacemaker. As the 2 nodes can communicate between each other, sbd won't act. I think it was an entry like this in /etc/sysconfig/sbd: 'SBD_PACEMAKER=yes' On Sat, Nov 20, 2021 at 23:24, Valentin Vidić via Users wrote: On Sat, Nov 20, 2021 at 08:33:26PM +, Strahil Nikolov via Users wrote: > You can also use this 3rd node to provide iSCSI and then the SBD will > be disk-full :D . The good thing about this type of setup is that you > won't need to put location constraints for the 3rd node. Wouldn't that make the iSCSI node a SPOF? If the iSCSI goes down, SBD resets both cluster nodes. -- Valentin ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] resource start after network reconnected
You can also use this 3rd node to provide iSCSI and then the SBD will be disk-full :D . The good thing about this type of setup is that you won't need to put location constraints for the 3rd node. Also, check the ping resource -> you can set it up to "kick out" all resources on failure of ping to a specific IP (for example the gateway). Once the network is restored, the node automatically becomes eligible to host the resources. Also consider more advanced resource agents like ocf:heartbeat:mysql to control your mysql/mariadb database, along with replication between a primary and a secondary (a.k.a. master-slave replication). Best Regards, Strahil Nikolov On Friday, 19 November 2021, 21:46:22 GMT+2, john tillman wrote: > On Fri, Nov 19, 2021 at 11:26:01AM -0500, john tillman wrote: >> Anyone have any other ideas for a configuration setting that will >> effectively do whatever 'pcs resource refresh' is doing when quorum is >> restored? > > Since you have three nodes you may want to use the third node as QDevice > instead: > > https://documentation.suse.com/sle-ha/15-SP1/html/SLE-HA-all/cha-ha-qdevice.html > > After that SBD can be configured in diskless mode to reset the node that > loses quorum: > > https://documentation.suse.com/sle-ha/15-SP1/html/SLE-HA-all/cha-ha-storage-protect.html#sec-ha-storage-protect-diskless-sbd > Thank you. I'll look into using the Qdevice in the next release. For now, I just have the three nodes with "vanilla" cluster packages. > -- > Valentin > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
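As a sketch in crm shell syntax of the ping setup described above (the gateway IP and resource/group names are hypothetical):

```
primitive p-ping ocf:pacemaker:ping \
    params host_list="192.168.1.1" multiplier="1000" dampen="5s" \
    op monitor interval="10s" timeout="60s"
clone cl-ping p-ping
# Evict resources from nodes that cannot reach the gateway; "pingd" is
# the default node attribute maintained by the ping agent.
location l-connected g-database \
    rule -inf: not_defined pingd or pingd lte 0
```

When connectivity returns, the attribute is updated again and the node becomes eligible to host the resources, as described above.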
Re: [ClusterLabs] Fence node when network interface goes down
Have you tried with ping and a location constraint for avoiding hosts that cannot ping an external system? Best Regards, Strahil Nikolov On Mon, Nov 15, 2021 at 0:07, S Rogers wrote: Using on-fail=fence is what I initially tried, but it doesn't work unfortunately. It looks like this is because the ethmonitor monitor operation won't actually fail when it detects a downed interface. It'll only fail if it is unable to update the CIB, as per this comment: https://github.com/ClusterLabs/resource-agents/blob/4824a7a83765a0596b7d9856d00102f53c8ce123/heartbeat/ethmonitor#L518 ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
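Since ethmonitor reports link state through a node attribute rather than by failing its monitor, the usual pattern is a location constraint on that attribute. A sketch in crm shell syntax (resource and group names are hypothetical):

```
primitive p-ethmon ocf:heartbeat:ethmonitor \
    params interface="eth0" \
    op monitor interval="10s" timeout="60s"
clone cl-ethmon p-ethmon
# ethmonitor maintains the node attribute "ethmonitor-<interface>":
# 1 while the link is up, 0 when it is down.
location l-eth0-up g-services \
    rule -inf: ethmonitor-eth0 eq 0
```

This moves resources away from a node with a dead interface without relying on on-fail=fence.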
Re: [ClusterLabs] drbd nfs slave not working
Also, check what 'drbdadm' has to tell you. Both nodes should be in sync, otherwise pacemaker will prevent the failover. Best Regards,Strahil Nikolov On Sun, Nov 14, 2021 at 20:09, Andrei Borzenkov wrote: On 14.11.2021 19:47, Neil McFadyen wrote: > I have a Ubuntu 20.04 drbd nfs pacemaker/corosync setup for 2 nodes, it > was working fine before but now I can't get the 2nd node to show as a slave > under the Clone Set. So if I do a failover both nodes show as stopped. > > root@testnfs30:/etc/drbd.d# crm status > Cluster Summary: > * Stack: corosync > * Current DC: testnfs32 (version 2.0.3-4b1f869f0f) - partition with quorum > * Last updated: Sun Nov 14 11:35:09 2021 > * Last change: Sun Nov 14 10:31:41 2021 by root via cibadmin on testnfs30 > * 2 nodes configured > * 5 resource instances configured > > Node List: > * Node testnfs32: standby This means - no resource will be started on this node. If this is not intentional, return node to onlilne (crm node online testnfs32). > * Online: [ testnfs30 ] > > Full List of Resources: > * Resource Group: HA: > * vip (ocf::heartbeat:IPaddr2): Started testnfs30 > * fs_nfs (ocf::heartbeat:Filesystem): Started testnfs30 > * nfs (ocf::heartbeat:nfsserver): Started testnfs30 > * Clone Set: ms_drbd_nfs [drbd_nfs] (promotable): > * Masters: [ testnfs30 ] > * Stopped: [ testnfs32 ] > > This used to show as > * Slaves: [ testnfs32 ] > > testnfs30# cat /proc/drbd > version: 8.4.11 (api:1/proto:86-101) > srcversion: FC3433D849E3B88C1E7B55C > 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r- > ns:352 nr:368 dw:720 dr:4221 al:6 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f > oos:0 > > > testnfs30:/etc/drbd.d# drbdadm status > nfs1 role:Primary > volume:1 disk:UpToDate > peer role:Secondary > volume:1 replication:Established peer-disk:UpToDate > > root@testnfs30:/etc/drbd.d# crm config show > node 1: testnfs30 \ > attributes standby=off > node 2: testnfs32 \ > attributes standby=on > primitive drbd_nfs ocf:linbit:drbd \ > params 
drbd_resource=nfs1 \ > op monitor interval=31s timeout=20s role=Slave \ > op monitor interval=30s timeout=20s role=Master > primitive fs_nfs Filesystem \ > params device="/dev/drbd0" directory="/nfs1srv" fstype=ext4 > options="noatime,nodiratime" \ > op start interval=0 timeout=60 \ > op stop interval=0 timeout=120 \ > op monitor interval=15s timeout=60s > primitive nfs nfsserver \ > params nfs_init_script="/etc/init.d/nfs-kernel-server" > nfs_shared_infodir="/nfs1srv/nfs_shared" nfs_ip=172.17.1.35 \ > op monitor interval=5s > primitive vip IPaddr2 \ > params ip=172.17.1.35 cidr_netmask=16 nic=bond0 \ > op monitor interval=20s timeout=20s \ > op start interval=0s timeout=20s \ > op stop interval=0s timeout=20s > group HA vip fs_nfs nfs \ > meta target-role=Started > ms ms_drbd_nfs drbd_nfs \ > meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 > notify=true > order fs-nfs-before-nfs inf: fs_nfs:start nfs:start > order ip-before-ms-drbd-nfs Mandatory: vip:start ms_drbd_nfs:promote > location loc ms_drbd_nfs 100: testnfs30 > order ms-drbd-nfs-before-fs-nfs Mandatory: ms_drbd_nfs:promote fs_nfs:start > colocation ms-drbd-nfs-with-ha inf: ms_drbd_nfs:Master HA > property cib-bootstrap-options: \ > have-watchdog=false \ > dc-version=2.0.3-4b1f869f0f \ > cluster-infrastructure=corosync \ > cluster-name=debian \ > no-quorum-policy=ignore \ > stonith-enabled=false > > I noticed that this line was added since last time I checked so I removed > it but that didn't help' > > location drbd-fence-by-handler-nfs1-ms_drbd_nfs ms_drbd_nfs \ > rule $role=Master -inf: #uname ne testnfs32 > > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: 
https://www.clusterlabs.org/
Re: [ClusterLabs] How to globally enable trace log level in pacemaker?
At least it's worth trying (/etc/sysconfig/pacemaker): PCMK_trace_files=* Best Regards, Strahil Nikolov On Sun, Oct 31, 2021 at 18:10, Vladislav Bogdanov wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] How to globally enable trace log level in pacemaker?
Have you checked the options in /etc/sysconfig/pacemaker as recommended in https://documentation.suse.com/sle-ha/15-SP3/html/SLE-HA-all/app-ha-troubleshooting.html#sec-ha-troubleshooting-log ? Best Regards, Strahil Nikolov В неделя, 31 октомври 2021 г., 13:33:43 ч. Гринуич+2, Andrei Borzenkov написа: On 31.10.2021 13:30, Jehan-Guillaume de Rorthais wrote: > Hi, > > Under EL and Debian, there's a PCMK_debug variable (iirc) in > "/etc/sysconfig/pacemaker" or "/etc/default/pacemaker". > > Comments in there explain how to set debug mode for part or all of the > pacemaker processes. > > This might be the environment variable you are looking for ? > It sets log level to debug, while I need trace. > Regards, > > Le 31 octobre 2021 09:20:00 GMT+01:00, Andrei Borzenkov > a écrit : >> I think it worked in the past by passing a lot of -VVV when starting >> pacemaker. It does not seem to work now. I can call /usr/sbin/pacemakerd >> -..., but it does pass options further to children it >> starts. So every other daemon is started without any option and with >> default log level. >> >> This pacemaker 2.1.0 from openSUSE Tumbleweed. >> >> P.S. environment variable to directly set log level would certainly be >> helpful. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
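A sketch of the /etc/sysconfig/pacemaker knobs in question (variable names as documented in that file; the values shown are examples):

```
# /etc/sysconfig/pacemaker (example values)
PCMK_debug=yes               # debug-level logging; can also be a daemon list
PCMK_trace_files=*           # trace-level logging for all source files
#PCMK_trace_functions=fn1,fn2   # or restrict tracing to specific functions
PCMK_logfile=/var/log/pacemaker/pacemaker.log
```

Trace output goes to the detail log file, not to syslog, so check PCMK_logfile after enabling these.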
Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres
Ah... That's the first thing I change. In SLES, that is defaulted to 10s and so far I have never seen an environment that is stable enough for the default 1s timeout. Best Regards, Strahil Nikolov On Sat, Oct 9, 2021 at 9:59, Jehan-Guillaume de Rorthais wrote: On 9 October 2021 00:11:27 GMT+02:00, Strahil Nikolov wrote: >What do you mean by 1s default timeout ? I suppose Damiano is talking about the corosync totem token timeout. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
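The timeout under discussion is the totem token timeout in corosync.conf; raising it from the upstream 1 s default to something like the 10 s mentioned above looks like this (sketch, values in milliseconds):

```
totem {
    version: 2
    # Time (ms) to wait for the token before declaring a membership
    # change. Upstream corosync defaults to 1000; 10000 here matches
    # the SLES default mentioned above.
    token: 10000
}
```

After changing it, the new value has to be applied on all nodes (e.g. via `corosync-cfgtool -R` on recent corosync, or a rolling restart).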
Re: [ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres
What do you mean by 1s default timeout ? Best Regards,Strahil Nikolov On Fri, Oct 8, 2021 at 16:02, damiano giuliani wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Re: Problem with high load (IO)
These 'dirty' sysctl settings are configureable. For large sequential I/O it's desirable 'dirty' ratio/bytes to be bigger, while for small files/random I/O it's better to be kept low. Best Regards, Strahil Nikolov В вторник, 5 октомври 2021 г., 08:52:20 ч. Гринуич+3, Ulrich Windl написа: >>> Gang He via Users schrieb am 30.09.2021 um 03:55 in Nachricht : > > On 2021/9/29 16:20, Lentes, Bernd wrote: >> >> >> ‑ On Sep 29, 2021, at 4:37 AM, Gang He g...@suse.com wrote: >> >>> Hi Lentes, >>> >>> Thank for your feedback. >>> I have some questions as below, >>> 1) how to clone these VM images from each ocfs2 nodes via reflink? >>> do you encounter any problems during this step? >>> I want to say, this is a shared file system, you do not clone all VM >>> images from each node, duplicated. >>> 2) after the cloned VM images are created, how do you copy these VM >>> images? copy to another backup file system, right? >>> The problem usually happened during this step? >>> >>> Thanks >>> Gang >> >> 1) No problems during this step, the procedure just needs a few seconds. >> reflink is a binary. See reflink ‑‑help >> Yes, it is a cluster filesystem. I do the procedure just on one node, >> so i don't have duplicates. >> >> 2) just with "cp source destination" to a NAS. >> Yes, the problems appear during this step. > Ok, when you cp the cloned file to the NAS directory, > the NAS directory should be another file system, right? > During the copying process, the original VM running will be affected, > right? One issue, especially with large RAM systems is this: If you copy from a fast to a slow device, the RAM wil fil with dirty buffers, probably causing a read starvation (no discardable buffer available). So this can affect any unlelated process. (Actually in the past it affected the IPaddr monitor for us) However I think recent kernels (maybe it's SUSE specific) prevent the whole free RAM to be filled with dirty buffers. 
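The settings referred to are the kernel writeback sysctls. A sketch with illustrative values (not recommendations) that cap dirty memory in absolute bytes, which avoids the very large amounts a percentage implies on large-RAM hosts:

```
# /etc/sysctl.d/90-writeback.conf (illustrative values)
vm.dirty_background_bytes = 268435456   # 256 MiB: start background writeback early
vm.dirty_bytes = 1073741824             # 1 GiB: hard cap on dirty pages
# Note: setting the *_bytes variants overrides vm.dirty_background_ratio
# and vm.dirty_ratio, and vice versa.
```

Lower caps reduce the read-starvation effect described above when copying from a fast to a slow device.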
Regards, Ulrich > > Thanks > Gang > >> >> Bernd >> > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Problem with high load (IO)
Did you try the 'ionice -c 2 -n 7 nice cp ' ? Best Regards, Strahil Nikolov On Thu, Sep 30, 2021 at 14:58, Lentes, Bernd wrote: - On Sep 30, 2021, at 3:55 AM, Gang He g...@suse.com wrote: >> >> 1) No problems during this step, the procedure just needs a few seconds. >> reflink is a binary. See reflink --help >> Yes, it is a cluster filesystem. I do the procedure just on one node, >> so i don't have duplicates. >> >> 2) just with "cp source destination" to a NAS. >> Yes, the problems appear during this step. > Ok, when you cp the cloned file to the NAS directory, > the NAS directory should be another file system, right? > During the copying process, the original VM running will be affected, > right? > Yes, it's another fs. Yes, the running machine is affected. It's getting slower and sometimes does not react, following our monitor software. Bernd ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] corosync/pacemaker resources start after reboot - incorrect node ID calculated
Yeah, it seems I missed the nodeid, so can you try setting the "name: hostname" in the corosync.conf ? Best Regards, Strahil Nikolov В вторник, 28 септември 2021 г., 10:34:41 ч. Гринуич+3, Strahil Nikolov via Users написа: Erm, in my corosync.conf I got also 'name: the-name-of-the-host' and 'nodeid: ' . I don't see these 2 in your config . Best Regards, Strahil Nikolov В вторник, 28 септември 2021 г., 02:39:20 ч. Гринуич+3, Neitzert, Greg A написа: Hello, We have an issue with a 2 node cluster where both nodes were put into standby (but the resources were not stopped first – so were still in target-role=Started). When the 2 nodes were rebooted, the corosync and pacemaker service started on the first node that came up, but the resources all tried to start, which should not have happened (standby persists through reboots by default). Upon closer inspection, it was found that the system calculated a different node ID than it usually has, and entered the cluster with the same hostname, but not the saved information from the previous cluster ID, so it didn’t remember it was in standby, and tried to come up. I believe the issue is a consequence of two factors. First, the network interface ring0 will use was in the state ‘setup-in-progress’ for some reason when the corosync and pacemaker started. Why exactly that was is still unknown. The corosync systemctl unit should wait until after network-online.target is reached, but that can mean various things, and doesn’t guarantee a particular interface is up. In our case, we use a dedicated network interface with a 169.x.x.x address to connect to the other node. Other interfaces were up, which probably explains why the target was reached. In normal cases, the nodeid calculated by corosync is something like 704514049, which converts to 169.254.8.1 which is the IP address of the ring0 interface. In this particular failing case, that didn’t happen, and it got a nodeid of 2130706433 which converts to 127.0.0.1. 
On start, the following logs of note were logged: corosync[3965]: [TOTEM ] The network interface is down. [TOTEM ] A new membership (127.0.0.1:4) was formed. Members joined: 2130706433 …. crmd[3978]: notice: Deleting unknown node 704514049/cbsta-mq1 which has conflicting uname with 2130706433 It was the above notice where I believe I lost my saved configuration from the correct node configuration. Here it indicates it is deleting the node that maps to the 169 address and is replacing it with the node id that maps to 127.0.0.1. Then all the various resources try to start on this node, which should not have happened (they should have been in standby). The pengine files verify that they were in standby, but after the new node id was joined, it did not have that setting, and the resources started because the target role was started for the resources before this all happened. It was shortly after that the interface we use for ring0 came up (eth-ha0): eth-ha0: link becomes ready After that the corosync service starts going down: 2021-09-16T00:43:20.022106+01:00 cbsta-mq1 attrd[3976]: notice: crm_update_peer_proc: Node cbsta-mq1[2130706433] - state is now lost (was member) 2021-09-16T00:43:20.022255+01:00 cbsta-mq1 cib[3973]: notice: crm_update_peer_proc: Node cbsta-mq1[2130706433] - state is now lost (was member) 2021-09-16T00:43:20.022373+01:00 cbsta-mq1 attrd[3976]: notice: Removing all cbsta-mq1 attributes for attrd_peer_change_cb 2021-09-16T00:43:20.022524+01:00 cbsta-mq1 cib[3973]: notice: Removing cbsta-mq1/2130706433 from the membership list 2021-09-16T00:43:20.022639+01:00 cbsta-mq1 attrd[3976]: notice: Lost attribute writer cbsta-mq1 2021-09-16T00:43:20.022743+01:00 cbsta-mq1 cib[3973]: notice: Purged 1 peers with id=2130706433 and/or uname=cbsta-mq1 from the membership cache 2021-09-16T00:43:20.022857+01:00 cbsta-mq1 attrd[3976]: notice: Removing cbsta-mq1/2130706433 from the membership list The service then restarts, but now it gets the correct node ID 
(mapping to 169). 2021-09-16T00:43:20.369715+01:00 cbsta-mq1 corosync[12434]: [TOTEM ] A new membership (169.254.8.1:12) was formed. Members joined: 704514049 2021-09-16T00:43:20.369830+01:00 cbsta-mq1 corosync[12434]: [QUORUM] Members[1]: 704514049 It then tries starting resources again, because it has lost previous information apparently from the delete above. The root issue appears to be: 1. The eth-ha0 (ring0 interface) interface was not completely up when corosync started. I may be able to do something to try to ensure the interface is completely up… 2. I believe our corosync.conf may need to be tuned (see below). 3. I believe we may need to adjust our /etc/hosts – as the hostname from uname -n maps back to 127.0.0.1 which I think is not what probably works best with corosync. The following is our corosync.conf: totem {
Re: [ClusterLabs] corosync/pacemaker resources start after reboot - incorrect node ID calculated
Erm, in my corosync.conf I got also 'name: the-name-of-the-host' and 'nodeid: ' . I don't see these 2 in your config . Best Regards, Strahil Nikolov В вторник, 28 септември 2021 г., 02:39:20 ч. Гринуич+3, Neitzert, Greg A написа: Hello, We have an issue with a 2 node cluster where both nodes were put into standby (but the resources were not stopped first – so were still in target-role=Started). When the 2 nodes were rebooted, the corosync and pacemaker service started on the first node that came up, but the resources all tried to start, which should not have happened (standby persists through reboots by default). Upon closer inspection, it was found that the system calculated a different node ID than it usually has, and entered the cluster with the same hostname, but not the saved information from the previous cluster ID, so it didn’t remember it was in standby, and tried to come up. I believe the issue is a consequence of two factors. First, the network interface ring0 will use was in the state ‘setup-in-progress’ for some reason when the corosync and pacemaker started. Why exactly that was is still unknown. The corosync systemctl unit should wait until after network-online.target is reached, but that can mean various things, and doesn’t guarantee a particular interface is up. In our case, we use a dedicated network interface with a 169.x.x.x address to connect to the other node. Other interfaces were up, which probably explains why the target was reached. In normal cases, the nodeid calculated by corosync is something like 704514049, which converts to 169.254.8.1 which is the IP address of the ring0 interface. In this particular failing case, that didn’t happen, and it got a nodeid of 2130706433 which converts to 127.0.0.1. On start, the following logs of note were logged: corosync[3965]: [TOTEM ] The network interface is down. [TOTEM ] A new membership (127.0.0.1:4) was formed. Members joined: 2130706433 …. 
crmd[3978]: notice: Deleting unknown node 704514049/cbsta-mq1 which has conflicting uname with 2130706433 It was the above notice where I believe I lost my saved configuration from the correct node configuration. Here it indicates it is deleting the node that maps to the 169 address and is replacing it with the node id that maps to 127.0.0.1. Then all the various resources try to start on this node, which should not have happened (they should have been in standby). The pengine files verify that they were in standby, but after the new node id was joined, it did not have that setting, and the resources started because the target role was started for the resources before this all happened. It was shortly after that the interface we use for ring0 came up (eth-ha0): eth-ha0: link becomes ready After that the corosync service starts going down: 2021-09-16T00:43:20.022106+01:00 cbsta-mq1 attrd[3976]: notice: crm_update_peer_proc: Node cbsta-mq1[2130706433] - state is now lost (was member) 2021-09-16T00:43:20.022255+01:00 cbsta-mq1 cib[3973]: notice: crm_update_peer_proc: Node cbsta-mq1[2130706433] - state is now lost (was member) 2021-09-16T00:43:20.022373+01:00 cbsta-mq1 attrd[3976]: notice: Removing all cbsta-mq1 attributes for attrd_peer_change_cb 2021-09-16T00:43:20.022524+01:00 cbsta-mq1 cib[3973]: notice: Removing cbsta-mq1/2130706433 from the membership list 2021-09-16T00:43:20.022639+01:00 cbsta-mq1 attrd[3976]: notice: Lost attribute writer cbsta-mq1 2021-09-16T00:43:20.022743+01:00 cbsta-mq1 cib[3973]: notice: Purged 1 peers with id=2130706433 and/or uname=cbsta-mq1 from the membership cache 2021-09-16T00:43:20.022857+01:00 cbsta-mq1 attrd[3976]: notice: Removing cbsta-mq1/2130706433 from the membership list The service then restarts, but now it gets the correct node ID (mapping to 169). 2021-09-16T00:43:20.369715+01:00 cbsta-mq1 corosync[12434]: [TOTEM ] A new membership (169.254.8.1:12) was formed. 
Members joined: 704514049 2021-09-16T00:43:20.369830+01:00 cbsta-mq1 corosync[12434]: [QUORUM] Members[1]: 704514049 It then tries starting resources again, because it has lost previous information apparently from the delete above. The root issue appears to be: 1. The eth-ha0 (ring0 interface) interface was not completely up when corosync started. I may be able to do something to try to ensure the interface is completely up… 2. I believe our corosync.conf may need to be tuned (see below). 3. I believe we may need to adjust our /etc/hosts – as the hostname from uname -n maps back to 127.0.0.1 which I think is not what probably works best with corosync. The following is our corosync.conf: totem { version: 2 cluster_name: cluster2 clear_node_high_bit: yes crypto_hash: sha1 crypto_cipher: aes256 rrp_mode: active wait_time: 150 # transport: udp interface
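The two node IDs in the logs above are consistent with corosync auto-generating the nodeid from the ring0 IPv4 address, with the top bit zeroed because this configuration sets clear_node_high_bit: yes. A small sketch reproducing the arithmetic:

```python
def corosync_autogen_nodeid(ip: str, clear_high_bit: bool = True) -> int:
    """Reproduce corosync's auto-generated nodeid: the ring0 IPv4
    address as a 32-bit big-endian integer, with the most significant
    bit cleared when clear_node_high_bit is set in corosync.conf."""
    a, b, c, d = (int(octet) for octet in ip.split("."))
    nodeid = (a << 24) | (b << 16) | (c << 8) | d
    if clear_high_bit:
        nodeid &= 0x7FFFFFFF
    return nodeid

# The healthy case: ring0 on 169.254.8.1
print(corosync_autogen_nodeid("169.254.8.1"))  # 704514049
# The failure case: interface not ready, corosync fell back to loopback
print(corosync_autogen_nodeid("127.0.0.1"))    # 2130706433
```

This is why pinning an explicit `nodeid:` (and `name:`) per node in corosync.conf, as suggested above, makes the identity independent of interface timing at boot.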
Re: [ClusterLabs] Problem with high load (IO)
Hey Ken, how should someone set the maintenance via pcs? Best Regards, Strahil Nikolov On Mon, Sep 27, 2021 at 19:56, Ken Gaillot wrote: On Mon, 2021-09-27 at 12:37 +0200, Lentes, Bernd wrote: > Hi, > > i have a two-node cluster running on SLES 12SP5 with two HP servers > and a common FC SAN. > Most of my resources are virtual domains offering databases and web > pages. > The disks from the domains reside on a OCFS2 Volume on a FC SAN. > Each night a 9pm all domains are snapshotted with the OCFS2 tool > reflink. > After the snapshot is created the disks of the domains are copied to > a NAS, domains are still running. > The copy procedure occupies the CPU and IO intensively. IO is > occupied by copy about 90%, the CPU has sometimes a wait about 50%. > Because of that the domains aren't responsive, so that the monitor > operation from the RA fails sometimes. > In worst cases one domain is fenced. > What would you do in such a situation ? > I'm thinking of making the cp procedure nicer, with nice. Maybe about > 10. > > More ideas ? > > > Bernd This is a classic use case for rules: https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#using-rules-to-control-cluster-options You can put the cluster into maintenance mode for the window, or disable the monitor for the window. Of course that also disables any cluster response. You could instead lengthen operation timeouts during the window. -- Ken Gaillot ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
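Following the rules approach Ken links to, a time-based cluster_property_set can flip maintenance-mode only during the backup window. A sketch in CIB XML (ids and hours are examples, not a tested configuration):

```xml
<crm_config>
  <!-- normal operation -->
  <cluster_property_set id="cib-bootstrap-options">
    <nvpair id="opt-maint-off" name="maintenance-mode" value="false"/>
  </cluster_property_set>
  <!-- the higher-score set wins while its rule is in effect: 21:00-23:59 -->
  <cluster_property_set id="backup-window" score="2">
    <rule id="backup-window-rule" score="INFINITY">
      <date_expression id="backup-window-expr" operation="date_spec">
        <date_spec id="backup-window-spec" hours="21-23"/>
      </date_expression>
    </rule>
    <nvpair id="opt-maint-on" name="maintenance-mode" value="true"/>
  </cluster_property_set>
</crm_config>
```

For a one-off manual toggle with pcs, `pcs property set maintenance-mode=true` (and `=false` afterwards) does the same thing without a rule.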
Re: [ClusterLabs] Problem with high load (IO)
I would use something like this: ionice -c 2 -n 7 nice cp XXX YYY Best Regards, Strahil Nikolov On Monday, 27 September 2021 at 13:37:31 GMT+3, Lentes, Bernd wrote: Hi, i have a two-node cluster running on SLES 12SP5 with two HP servers and a common FC SAN. Most of my resources are virtual domains offering databases and web pages. The disks from the domains reside on a OCFS2 Volume on a FC SAN. Each night at 9pm all domains are snapshotted with the OCFS2 tool reflink. After the snapshot is created the disks of the domains are copied to a NAS, domains are still running. The copy procedure occupies the CPU and IO intensively. IO is occupied by copy about 90%, the CPU has sometimes a wait about 50%. Because of that the domains aren't responsive, so that the monitor operation from the RA fails sometimes. In worst cases one domain is fenced. What would you do in such a situation ? I'm thinking of making the cp procedure nicer, with nice. Maybe about 10. More ideas ? Bernd -- Bernd Lentes System Administrator Institute for Metabolism and Cell Death (MCD) Building 25 - office 122 HelmholtzZentrum München bernd.len...@helmholtz-muenchen.de phone: +49 89 3187 1241 phone: +49 89 3187 3827 fax: +49 89 3187 2294 http://www.helmholtz-muenchen.de/mcd ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
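The ionice/nice suggestion above could be wrapped into the nightly copy job; a sketch with placeholder paths (the real snapshot and NAS paths are site-specific):

```
#!/bin/bash
# Throttled nightly copy of reflink snapshots to the NAS.
# Best-effort I/O class 2 at lowest priority (-n 7) plus CPU
# niceness 10, so guest I/O and the RA monitor keep priority.
ionice -c 2 -n 7 nice -n 10 \
    cp -a /ocfs2/snapshots/. /mnt/nas/backup/
```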
Re: [ClusterLabs] 8 node cluster
I would go with a VM hosting all resources and set up a 3-node virtualization cluster. The concept that the cluster should keep your resources up even if another 7 nodes died is not good -> there could be a network issue or other cases where this approach won't (and should not) work. As Antony mentioned -> you need quorum (majority that will agree what is going on) and stonith (a way to prevent the rest of the cluster from taking the resources). In your case, you can setup the cluster with last_man_standing and last_man_standing_window and it should work. Are you sure you didn't drop more than 50% of the nodes simultaneously ? Best Regards, Strahil Nikolov On Tue, Sep 7, 2021 at 21:08, Antony Stone wrote: On Tuesday 07 September 2021 at 19:37:33, M N S H SNGHL wrote: > I am looking for some suggestions here. I have created an 8 node HA cluster > on my SuSE hosts. An even number of nodes is never a good idea. > 1) The resources should work fine even if 7 nodes go down, which means > surviving node should still be running the resources. > I did set "last_man_standing (and last_man_standing_window) option, with > ATB .. but it didn't really work or didn't dynamically reduce the expected > votes. What do the log files (especially on that "last man") tell you happened as you gradually reduced the number of nodes online? > 2) Another requirement is - If all nodes in the cluster go down, and just > one (anyone) comes back up, it should pick up the resources and should run > them. So, how should this one node realise that it is the only node awake and should be running the resources, and that there aren't {1..7} other nodes somewhere else on the network, all in the same situation, thinking "I can't connect to anyone else, but I'm alive, so I'll take on the resources"? > I tried setting ignore-quorum-policy to ignore, and which worked most of > the time... (yet to find the case where it doesn't work).. but I am > suspecting, wouldn't this setting cause split-brain in some cases? 
I think you're taking the wrong approach to HA. Some number of nodes (plural) need to be in communication with each other in order for them to decide whether they have quorum or not, and can decide to be in charge of the resources. Two basic rules of HA: 1. One node on its own has no clue whatever else is going on with the rest of the cluster, and therefore cannot decide to take charge 2. Quorum (unless you override it and really know what you're doing) requires >50% of nodes to be in agreement, and an even number of nodes can split into 50:50, where neither half (literally) is >50%, so everything stops. This is "split brain". I have two questions: - why do you feel you need as many as 8 nodes when the resources will only be running on one node? - why do you specifically want 8 nodes instead of 7 or 9? Antony. -- The Royal Society for the Prevention of Cruelty to Animals was formed in 1824. The National Society for the Prevention of Cruelty to Children was not formed until 1884. That says something about the British. Please reply to the list; please *don't* CC me. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
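The last_man_standing options discussed above live in the quorum section of corosync.conf; an untested sketch (the window value is illustrative):

```
quorum {
    provider: corosync_votequorum
    auto_tie_breaker: 1
    last_man_standing: 1
    # window in ms; nodes must fail gradually, each loss separated
    # by at least this interval, for expected votes to be recalculated
    last_man_standing_window: 20000
}
```

This is one reason the "7 of 8 nodes die simultaneously" requirement cannot work: last_man_standing only recalculates votes when failures happen one window at a time.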
Re: [ClusterLabs] (no subject)
In order to test properly, use firewall rules to drop the corosync traffic. I remember that this test (ifdown on the NIC) was inefficient in previous versions of corosync. If you wish to be more safe, try to set up ocf:pacemaker:ping. Best Regards, Strahil Nikolov On Fri, Sep 3, 2021 at 5:09, 重力加速度 via Users wrote: HELLO! I built a two node corosync + pacemaker cluster, and the main end runs on node0. There are two network ports with the same network segment IP on node0. I created a VIP resource on one of the network ports and specified the NIC attribute. When I down the network port where the VIP resource is located, theoretically, the cluster will automatically switch to another node, but the cluster did not successfully switch to another node. I don't know why, and there is no log information about cluster errors. Is this a bug or do I need to configure some additional parameters? ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
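Dropping corosync traffic with firewall rules, as suggested above, could look like the following (5405/udp is the common default; check the actual port in your totem configuration):

```
# simulate a network failure without taking the link down
iptables -A INPUT  -p udp --dport 5405 -j DROP
iptables -A OUTPUT -p udp --dport 5405 -j DROP

# restore connectivity afterwards
iptables -D INPUT  -p udp --dport 5405 -j DROP
iptables -D OUTPUT -p udp --dport 5405 -j DROP
```

Unlike `ifdown`, this leaves the interface and its IP address up, which is much closer to a real network partition from corosync's point of view.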
Re: [ClusterLabs] Qemu VM resources - cannot acquire state change lock
Are you using sharding for glusterfs ? I would put the libvirt service and glusterfs service in a systemd dependency, as your libvirt relies on gluster being available. Also, check if you got the 'backup-volfile-servers' mount option if using FUSE. With libgfapi, I have no clue how to configure that. Your setup looks quite close to the oVirt project ... (just mentioning). Best Regards, Strahil Nikolov On Sat, Aug 28, 2021 at 13:33, lejeczek via Users wrote: On 26/08/2021 10:35, Klaus Wenninger wrote: > > > On Thu, Aug 26, 2021 at 11:13 AM lejeczek via Users > mailto:users@clusterlabs.org>> wrote: > > Hi guys. > > I sometimes - I think I know when in terms of any > pattern - > get resources stuck on one node (two-node cluster) with > these in libvirtd's logs: > ... > Cannot start job (query, none, none) for domain > c8kubermaster1; current job is (modify, none, none) > owned by > (192261 qemuProcessReconnect, 0 , 0 > (flags=0x0)) for (1093s, 0s, 0s) > Cannot start job (query, none, none) for domain > ubuntu-tor; > current job is (modify, none, none) owned by (192263 > qemuProcessReconnect, 0 , 0 (flags=0x0)) for > (1093s, 0s, 0s) > Timed out during operation: cannot acquire state > change lock > (held by monitor=qemuProcessReconnect) > Timed out during operation: cannot acquire state > change lock > (held by monitor=qemuProcessReconnect) > ... > > when this happens, and if the resource is meant to be on the > other node, I have to disable the resource first, then > the node on which resources are stuck will shutdown > the VM > and then I have to re-enable that resource so it > would, only > then, start on that other, the second node. > > I think this problem occurs if I restart 'libvirtd' > via systemd. > > Any thoughts on this guys? > > > What are the logs on the pacemaker-side saying? > An issue with migration? 
> > Klaus I'll have to try to tidy up the "protocol" with my stuff so I could call it all reproducible; at the moment it only feels that way, as reproducible. I'm on CentOS Stream and have a 2-node cluster, with KVM resources, with the same 2-node glusterfs cluster. (all physically is two machines) 1) I power down one node in an orderly manner and the other node is last-man-standing. 2) after a while (not sure if the time period is also a key here) I brought up that first node. 3) the last-man-standing node's libvirtd becomes unresponsive (don't know yet, if that is only after the first node came back up) to virt cmd and to probably everything else, pacemaker log says: ... pacemaker-controld[2730]: error: Result of probe operation for c8kubernode2 on dzien: Timed Out ... and libvirtd log does not say anything really (with default debug levels) 4) if glusterfs might play any role? Healing of the volume(s) is finished at this time, completed successfully. This is the moment where I would manually 'systemctl restart libvirtd' on that unresponsive node (was last-man-standing) and got the original error messages. There is plenty of room for anybody to make guesses, obviously. Is it 'libvirtd' going haywire because the glusterfs volume is in an unhealthy state and needs healing? Is it pacemaker last-man-standing which makes 'libvirtd' go haywire? etc... I can't add much concrete stuff at this moment but will appreciate any thoughts you want to share. thanks, L > many thanks, L. > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
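The systemd ordering between libvirtd and GlusterFS suggested above could be a drop-in like the following; unit names vary by distro and by how the volume is mounted (glusterd.service assumes the gluster daemon runs on the same host), so treat this as a sketch:

```
# /etc/systemd/system/libvirtd.service.d/gluster.conf
[Unit]
Requires=glusterd.service
After=glusterd.service remote-fs.target
```

Run `systemctl daemon-reload` afterwards so systemd picks up the drop-in.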
Re: [ClusterLabs] Question about automating cluster unfencing.
You can set up the system in such a case that on fabric fence, the node is rebooted, which will allow it to 'unfence' itself afterwards. For details check https://access.redhat.com/solutions/3367151 or https://access.redhat.com/node/65187 (You may use a RH developer subscription in order to access it). It seems that fence_mpath has watchdog integration after a certain version, while you can still use /usr/share/cluster/fence_mpath_check (via watchdog service and supported watchdog device). Even if you don't have a proper watchdog device, you can use the 'softdog' module, as the system is fenced via SAN and even if not rebooted, there is no risk. Best Regards, Strahil Nikolov On Sat, Aug 28, 2021 at 10:14, Andrei Borzenkov wrote: On Fri, Aug 27, 2021 at 8:11 PM Gerry R Sommerville wrote: > > Hey all, > > From what I see in the documentation for fabric fencing, Pacemaker requires > an administrator to login to the node to manually start and unfence the node > after some failure. > >https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-unfencing.html > This is about fabric (or resource) fencing. In this case node is cut off from some vital resources but remains up and running. In this case someone indeed needs to intervene manually. > The concern I have is if there is an intermittent network issues, a node may > get fenced and we have to wait for someone to log into the cluster and bring > the node back online. Meanwhile the network issue may have resolved itself > shortly after the node was fenced. > > I wonder if there are any configurations or popular solutions that people use > to automatically unfence nodes and have them rejoin the cluster? > Most people use stonith (or node fencing) and affected node is rebooted. As long as pacemaker is configured to start automatically and network connectivity is restored after reboot node will join the cluster automatically. 
I think that in case of fabric fencing the node is unfenced automatically when it reboots and attempts to join the cluster (hopefully someone may chime in here). I am not sure what happens if the node is not rebooted but connectivity is restored. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
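If no hardware watchdog exists, the softdog module mentioned above can be loaded immediately and persisted across reboots:

```
# load the software watchdog now
modprobe softdog
# and on every boot
echo softdog > /etc/modules-load.d/softdog.conf
```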
Re: [ClusterLabs] Cloned ressource is restarted on all nodes if one node fails
> name="statusurl" value="http://localhost/server-status"/> Can you show the apache config for the status page ? It must be accessible only from localhost (127.0.0.1) and should not be reachable from the other nodes. Best Regards, Strahil Nikolov ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
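A matching Apache 2.4 configuration that restricts the status page to loopback could look like this (a sketch, not the poster's actual config):

```
<Location "/server-status">
    SetHandler server-status
    Require local
</Location>
```

With `Require local`, the RA's check of http://localhost/server-status succeeds on each node while the page stays unreachable from the other nodes.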
Re: [ClusterLabs] Cloned ressource is restarted on all nodes if one node fails
I've set up something similar with a VIP that is everywhere, using globally-unique=true (where the cluster controls which node is passive and which active). This allows the VIP to be everywhere while only 1 node answers the requests, and the web server was running everywhere with config and data on a shared FS. Sadly, I can't find my notes right now. Best Regards, Strahil Nikolov On Mon, Aug 9, 2021 at 13:43, Andreas Janning wrote: Hi all, we recently experienced an outage in our pacemaker cluster and I would like to understand how we can configure the cluster to avoid this problem in the future. First our basic setup: - CentOS7 - Pacemaker 1.1.23 - Corosync 2.4.5 - Resource-Agents 4.1.1 Our cluster is composed of multiple active/passive nodes. Each software component runs on two nodes simultaneously and all traffic is routed to the active node via Virtual IP. If the active node fails, the passive node grabs the Virtual IP and immediately takes over all work of the failed node. Since the software is already up and running on the passive node, there should be virtually no downtime. We have tried to achieve this in pacemaker by configuring clone-sets for each software component. Now the problem: When a software component fails on the active node, the Virtual-IP is correctly grabbed by the passive node. BUT the software component is also immediately restarted on the passive node. That unfortunately defeats the purpose of the whole setup, since we now have a downtime until the software component is restarted on the passive node and the restart might even fail and lead to a complete outage. After some investigating I now understand that the cloned resource is restarted on all nodes after a monitoring failure because the default "on-fail" of "monitor" is restart. But that is not what I want. I have created a minimal setup that reproduces the problem: http://localhost/server-status"/> When this configuration is started, httpd will be running on active-node and passive-node. 
The VIP runs only on active-node. When crashing the httpd on active-node (with killall httpd), passive-node immediately grabs the VIP and restarts its own httpd. How can I change this configuration so that when the resource fails on active-node: - passive-node immediately grabs the VIP (as it does now). - active-node tries to restart the failed resource, giving up after x attempts. - passive-node does NOT restart the resource. Regards Andreas Janning -- Andreas Janning Expert Software Engineer QAware GmbH Aschauer Straße 32 81549 München, Germany Mobil +49 160 1492426 andreas.jann...@qaware.de www.qaware.de Geschäftsführer: Christian Kamm, Johannes Weigend, Dr. Josef Adersberger Registergericht: München Handelsregisternummer: HRB 163761 ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
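The globally-unique VIP clone described in the reply above could be created roughly like this (names and addresses are placeholders, and the exact syntax depends on the pcs version):

```
pcs resource create vip ocf:heartbeat:IPaddr2 \
    ip=192.168.1.100 cidr_netmask=24 \
    clone globally-unique=true clone-max=2 clone-node-max=2
```

With globally-unique=true the IPaddr2 agent uses a CLUSTERIP-style setup: the address exists on both nodes, but the cluster decides which instance actually answers, so a failure on one node does not force a restart of the other.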
Re: [ClusterLabs] Sub‑clusters / super‑clusters - working :)
>Because Asterisk at cityA is bound to a floating IP address, which is held on >one of the three machines in cityA. I can't run Asterisk on all >three machines there because only one of them has the IP address. That's not true. You can use a cloned IP resource with 'globally-unique=true' which runs the IP everywhere, but the cluster determines which node responds (controlled via IPTABLES) and the others never reply. It's quite useful for reducing the time for failover. Best Regards, Strahil Nikolov ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
I still can't understand why the whole cluster will fail when only 3 nodes are down and a qdisk is used. CityA -> 3 nodes to run packageA -> 3 votes; CityB -> 3 nodes to run packageB -> 3 votes; CityC -> 1 node which cannot run any package (qdisk) -> 1 vote. Max votes: 7, Quorum: 4. As long as one city is up + qdisk -> your cluster will be working. Then you just configure that packageA cannot run in CityB, and packageB cannot run in CityA. If all nodes in a city die, the relevant package will be down. Last, you configure your last resource without any location constraint. PS: by package consider either a resource group or a single resource. Best Regards, Strahil Nikolov ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
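The per-city constraints could be expressed with rules on a node attribute (here a hypothetical attribute `city` set on every node; resource names are placeholders):

```
# keep packageA out of CityB and packageB out of CityA
pcs constraint location packageA rule score=-INFINITY city ne cityA
pcs constraint location packageB rule score=-INFINITY city ne cityB
# the floating resource gets no location constraint at all,
# so it can run wherever the quorate partition is
```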
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
That's why you need a qdisk at a 3rd location, so you will have 7 votes in total. When 3 nodes in cityA die, all resources will be started on the remaining 3 nodes. Best Regards, Strahil Nikolov On Wed, Aug 4, 2021 at 17:23, Antony Stone wrote: On Wednesday 04 August 2021 at 16:07:39, Andrei Borzenkov wrote: > On Wed, Aug 4, 2021 at 5:03 PM Antony Stone wrote: > > On Wednesday 04 August 2021 at 13:31:12, Andrei Borzenkov wrote: > > > On Wed, Aug 4, 2021 at 1:48 PM Antony Stone wrote: > > > > On Tuesday 03 August 2021 at 12:12:03, Strahil Nikolov via Users > > > > wrote: > > > > > Won't something like this work ? Each node in LA will have same > > > > > score of 5000, while other cities will be -5000. > > > > > > > > > > pcs constraint location DummyRes1 rule score=5000 city eq LA > > > > > pcs constraint location DummyRes1 rule score=-5000 city ne LA > > > > > stickiness -> 1 > > > > > > > > Thanks for the idea, but no difference. > > > > > > > > Basically, as soon as zero nodes in one city are available, all > > > > resources, including those running perfectly at the other city, stop. > > > > > > That is not what you originally said. > > > > > > You said you have 6 node cluster (3 + 3) and 2 nodes are not available. > > > > No, I don't think I said that? > > "With the new setup, if two machines in city A fail, then _both_ > clusters stop working" Ah, apologies - that was a typo. "With the new setup, if the machines in city A fail, then _both_ clusters stop working". So, basically what I'm saying is that with two separate clusters, if one fails, the other keeps going (as one would expect). Joining the two clusters together so that I can have a single floating resource which can run anywhere (as well as the exact same location-specific resources as before) results in one cluster failure taking the other cluster down too. I need one fully-working 3-node cluster to keep going, no matter what the other cluster does. Antony. 
-- It is also possible that putting the birds in a laboratory setting inadvertently renders them relatively incompetent. - Daniel C Dennett Please reply to the list; please *don't* CC me. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Moving resource only one way
When you move/migrate resources without the --lifetime option, the cluster stack will set a +/-INFINITY constraint on the host (+ when migrating to a host, - when migrating away without specifying a destination host). Take a look at: https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_move_resources_manually.html Best Regards, Strahil Nikolov On Tue, Aug 3, 2021 at 22:16, Ervin Hegedüs wrote: Hi, On Tue, Aug 03, 2021 at 05:46:51PM +, Strahil Nikolov via Users wrote: > Yes. INFINITY = 1000000 (one million), -INFINITY = -1000000 (negative one million). > Set stickiness > 100 . hmm... it's interesting. I've found the documentation what I made for these systems, but there isn't any line for "location" settings. How did I get it there? I reviewed the configured systems (there are three pairs), and one pair still does not have this line, but two of them have. Thanks, a. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
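A move with the --lifetime option referenced above takes an ISO 8601 duration; a sketch with placeholder names:

```
# temporary constraint that expires after 10 minutes
pcs resource move myresource node2 --lifetime=PT10M

# or remove the implicit +/-INFINITY constraint after a plain move
# (newer pcs releases; older ones need the generated cli-prefer-*/
# cli-ban-* constraint removed with "pcs constraint remove")
pcs resource clear myresource
```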
Re: [ClusterLabs] Antw: [EXT] Moving resource only one way
Yes. INFINITY = 1000000 (one million), -INFINITY = -1000000 (negative one million). Set stickiness > 100. Best Regards, Strahil Nikolov > The `location` section overwrites the stickiness? ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
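Setting a default stickiness above the location-constraint scores, as suggested, could look like this (the value is illustrative):

```
pcs resource defaults resource-stickiness=200
```

Newer pcs releases prefer the form `pcs resource defaults update resource-stickiness=200`.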
Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?
Won't something like this work ? Each node in LA will have same score of 5000, while other cities will be -5000. pcs constraint location DummyRes1 rule score=5000 city eq LA pcs constraint location DummyRes1 rule score=-5000 city ne LA stickiness -> 1 Best Regards, Strahil Nikolov Out of curiosity: Could one write a rule that demands that a resource migration should (not) happen within the same city? ("should" means "preferably when there are alternatives") ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] pcs add node command is success but node is not configured to existing cluster
Firewall issue ? Did you check on the corosync level if all nodes reach each other ? Best Regards, Strahil Nikolov On Wednesday, 28 July 2021 at 16:32:51 GMT+3, S Sathish S via Users wrote: Hi Team, we are trying to add node03 to the existing cluster; after adding we could see only 2 nodes are configured, and the corosync log also shows "Waiting for all cluster members. Current votes: 2 expected_votes: 3", but in node3's pcs cluster status output it shows 3 nodes configured and no resources listed, while on node02 we have 40 resources configured which are not reflected on node03. This issue occurs only on a few problematic hardware nodes, not on all hardware; we don't know why this node is not joining the cluster. [root@node02 ~]# pcs cluster status Cluster Status: Stack: corosync Current DC: node01 (version 2.0.2-744a30d655) - partition WITHOUT quorum Last updated: Wed Jul 28 14:58:13 2021 Last change: Wed Jul 28 14:41:41 2021 by root via cibadmin on node01 2 nodes configured 40 resources configured PCSD Status: node02: Online node01: Online node03: Online [root@node02 ~]# Corosync log on node added execution : Jul 28 11:15:05 [17598] node01 corosync notice [TOTEM ] A new membership (10.216.x.x:42660) was formed. Members Jul 28 11:15:05 [17598] node01 corosync notice [QUORUM] Members[2]: 1 2 Jul 28 11:15:05 [17598] node01 corosync notice [MAIN ] Completed service synchronization, ready to provide service. Jul 28 11:15:05 [17598] node01 corosync notice [CFG ] Config reload requested by node 1 Jul 28 11:15:05 [17598] node01 corosync notice [TOTEM ] adding new UDPU member {10.216.x.x} Jul 28 11:15:07 [17599] node01 corosync notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3 Jul 28 11:15:07 [17599] node01 corosync notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3 Jul 28 11:15:07 [17599] node01 corosync notice [TOTEM ] A new membership (10.216.x.x:42664) was formed. 
Members Jul 28 11:15:07 [17599] node01 corosync notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3 Jul 28 11:15:07 [17599] node01 corosync notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3 Jul 28 11:15:07 [17599] node01 corosync notice [QUORUM] This node is within the non-primary component and will NOT provide any services. Jul 28 11:15:07 [17599] node01 corosync notice [QUORUM] Members[2]: 1 2 Jul 28 11:15:07 [17599] node01 corosync notice [MAIN ] Completed service synchronization, ready to provide service. Jul 28 11:15:07 [17599] node01 corosync notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3 Jul 28 11:15:07 [17599] node01 corosync notice [VOTEQ ] Waiting for all cluster members. Current votes: 2 expected_votes: 3 Jul 28 11:15:11 [17599] node01 corosync notice [TOTEM ] A new membership (10.216.x.x:42668) was formed. Members [root@node03 ~]# pcs cluster status Cluster Status: Stack: corosync Current DC: node03 (version 2.0.2-744a30d655) - partition WITHOUT quorum Last updated: Wed Jul 28 15:04:31 2021 Last change: Wed Jul 28 15:04:00 2021 by root via cibadmin on node03 3 nodes configured 0 resources configured PCSD Status: node03: Online node01: Online node02: Online [root@node03 ~]# Thanks and Regards, S Sathish S ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
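To check on the corosync level whether all nodes reach each other, as suggested above, each node can be inspected with:

```
corosync-cfgtool -s     # ring/link status as seen from this node
corosync-quorumtool -s  # votes and membership as corosync sees them
pcs status corosync     # membership summary via pcs
```

Comparing the membership lists across nodes quickly shows whether node03 is in a separate partition (e.g. blocked by a firewall) rather than genuinely joined.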
Re: [ClusterLabs] Antw: Re: [EXT] Re: Two node cluster without fencing and no split brain?
So far, I have never had a cluster with nodes directly connected to the same switches. Usually it's nodeA -> switchA -> switchB -> nodeB, and sometimes connectivity between the switches goes down (for example due to a firewall rule). Best Regards, Strahil Nikolov On Wednesday, 28 July 2021 at 15:51:36 GMT+3, john tillman wrote: > Technically you could give one vote to one node and zero to the other. > If they lose contact only the server with one vote would make quorum. > The downside is that if the server with 1 vote goes down the entire > cluster comes to a halt. > > > That said, if both nodes can reach the same switch that they are > connected to each other through, why can't they reach each other? > "... why can't they reach each other?" My question as well. It feels like a very low probability thing to me. Some blockage/filtering/delay of the cluster's "quorum packets" while ping packets were allowed through, perhaps caused by network congestion. But I'm not a network engineer. Any network engineers reading this care to comment? Thanks for echoing my thoughts and that interesting quorum-weight idea. > > On 7/26/21 12:21 PM, john tillman wrote: >> They would continue running their resources and we would have split >> brain. >> >> So there is no safe way to support a two node cluster 100% of the time. >> But when all you have are two nodes and a switch ... well, when life >> gives >> you lemons ... > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Two node cluster without fencing and no split brain?
Hi, consider using a 3rd system as a Q disk. Also, you can use iscsi from that node as a SBD device, so you will have proper fencing. If you don't have a hardware watchdog device, you can use the softdog kernel module for that. Best Regards, Strahil Nikolov On Wed, Jul 21, 2021 at 1:45, Digimer wrote: On 2021-07-20 6:04 p.m., john tillman wrote: > Greetings, > > Is it possible to configure a two node cluster (pacemaker 2.0) without > fencing and avoid split brain? No. > I was hoping there was a way to use a 3rd node's ip address, like from a > network switch, as a tie breaker to provide quorum. A simple successful > ping would do it. Quorum is a different concept and doesn't remove the need for fencing. > I realize that this 'ping' approach is not the bullet proof solution that > fencing would provide. However, it may be an improvement over two nodes > alone. It would be, at best, a false sense of security. > Is there a configuration like that already? Any other ideas? > > Pointers to useful documents/discussions on avoiding split brain with two > node clusters would be welcome. https://www.alteeve.com/w/The_2-Node_Myth (note: currently throwing a cert error related to the let's encrypt issue, should be cleared up soon). -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
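An SBD setup over iSCSI, as suggested above, might be initialized like this (the device path is a placeholder; the remaining wiring is distro-specific):

```
# load a software watchdog if no hardware one exists
modprobe softdog
echo softdog > /etc/modules-load.d/softdog.conf

# initialize and inspect the shared disk for SBD
sbd -d /dev/disk/by-id/scsi-EXAMPLE create
sbd -d /dev/disk/by-id/scsi-EXAMPLE list
```

Afterwards, point SBD_DEVICE in /etc/sysconfig/sbd at the device on every node, enable the sbd service, and configure stonith in the cluster so fencing actually uses it.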
Re: [ClusterLabs] Antw: Re: Antw: [EXT] VIP monitor Timed Out
I think Ulrich meant the "dirty" buffers like the ones described at https://www.suse.com/support/kb/doc/?id=17857 Based on my experience, you should lower the background dirty tunable as low as possible (let's say 500-600MB) and increase the other tunable to at least double that. Keep in mind that you can use either dirty_ratio or dirty_bytes and either dirty_background_ratio or dirty_background_bytes, but never both. Best Regards, Strahil Nikolov On Tuesday, 20 July 2021 at 18:04:36 GMT+3, PASERO Florent wrote: Thanks Ulrich ! Could you explain to me what to do about the tuning of the kernel to limit the amount of dirty buffers ? Br, Florent Classification : Internal -----Original Message----- De : Users De la part de Ulrich Windl Envoyé : mardi 20 juillet 2021 12:02 À : users@clusterlabs.org Objet : [ClusterLabs] Antw: Re: Antw: [EXT] VIP monitor Timed Out Hi! In the commands traced, no command (that is the monitor) took more than 3 seconds, so that either is *not* the timeout, or pacemaker got significantly delayed. One reason I could imagine is a "read stall". For example you could trigger such if you rapidly fill your block cache with dirty blocks (to be written) and some read request would have to wait for buffers (to become written, thus available). However if you are writing like mad, available read buffers might be rare. Fortunately you can tune the kernel to limit the amount of dirty buffers. I'm not saying that is your problem, but the trace looks OK. Regards, Ulrich >>> PASERO Florent schrieb am 20.07.2021 um 11:51 in Nachricht : > Hi, > > Once or twice a week, we have a 'Timed out' on our VIP. 
> > The last : > Cluster Summary: > * Stack: corosync > * Current DC: server07 (version 2.0.5-9.el8_4.1-ba59be7122) - > partition with quorum > * Last updated: Tue Jul 20 11:39:22 2021 > * Last change: Mon Jul 5 09:42:14 2021 by hacluster via cibadmin > on > server06 > * 2 nodes configured > * 2 resource instances configured > > Node List: > * Online: [ server06 server07 ] > > Active Resources: > * Resource Group: zbx_prod_Web_Core: > * VIP (ocf::heartbeat:IPaddr2): Started server07 > * ZabbixServer (systemd:zabbix-server): Started server07 > > Failed Resource Actions: > * VIP_monitor_1 on server07 'error' (1): call=123, status='Timed > Out', > exitreason='', last-rc-change='2021-07-19 15:02:27 +02:00', > queued=0ms, exec=0ms > > Any idea ? because nothing very revealing in the following log files. > > Here are the monitoring files just before and just after the time out. > > VIP.monitor.2021-07-19.15:01:27 : > +++ 15:01:27: ocf_start_trace:999: echo > +++ 15:01:27: ocf_start_trace:999: printenv > +++ 15:01:27: ocf_start_trace:999: sort > ++ 15:01:27: ocf_start_trace:999: env=' > HA_LOGFACILITY=daemon > HA_LOGFILE=/var/log/pacemaker/pacemaker.log > HA_cluster_type=corosync > HA_debug=0 > HA_logfacility=daemon > HA_logfile=/var/log/pacemaker/pacemaker.log > HA_mcp=true > HA_quorum_type=corosync > INVOCATION_ID=5cd03e610fbf4a9bb3ffe2b30e1fb5d4 > JOURNAL_STREAM=9:4433035 > LC_ALL=C > OCF_EXIT_REASON_PREFIX=ocf-exit-reason: > OCF_RA_VERSION_MAJOR=1 > OCF_RA_VERSION_MINOR=0 > OCF_RESKEY_CRM_meta_interval=1 > OCF_RESKEY_CRM_meta_name=monitor > OCF_RESKEY_CRM_meta_on_node=server07 > OCF_RESKEY_CRM_meta_on_node_uuid=2 > OCF_RESKEY_CRM_meta_timeout=2 > OCF_RESKEY_crm_feature_set=3.7.1 > OCF_RESKEY_ip=10.0.0.67 > OCF_RESKEY_monitor_retries=10 > OCF_RESKEY_trace_file=/apps/Zabbix_Log/Core > OCF_RESKEY_trace_ra=1 > OCF_RESOURCE_INSTANCE=VIP > OCF_RESOURCE_PROVIDER=heartbeat > OCF_RESOURCE_TYPE=IPaddr2 > OCF_ROOT=/usr/lib/ocf > 
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/sbin: > /usr/bin:/usr/ucb > PCMK_cluster_type=corosync > PCMK_debug=0 > PCMK_logfacility=daemon > PCMK_logfile=/var/log/pacemaker/pacemaker.log > PCMK_mcp=true > PCMK_quorum_type=corosync > PCMK_service=pacemaker-execd > PCMK_watchdog=false > PWD=/var/lib/pacemaker/cores > SHLVL=1 > VALGRIND_OPTS=--leak-check=full --trace-children=no --vgdb=no > --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p > --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions > --gen-suppressions=all > _=/usr/bin/printenv > __OCF_TRC_DEST=/var/lib/heartbeat/trace_ra/IPaddr2/VIP.monitor.2021-07-19.15 > :01:27 > __OCF_TRC_MANAGE=1' > ++ 15:01:27: source:1053: ocf_is_true '' > ++ 15:01:27: ocf_is_true:103: case "$1" in > ++ 15:01:27: ocf_is_true:103: case "$1" in > ++ 15:01:27: ocf_is_true:105: false > + 15:01:27: main:69: . /usr/lib/ocf/lib/heartbeat/findif.sh > + 15:01:27: main:72: OCF_RE
Re: [ClusterLabs] Moving resource only one way
Yep, just set the stickiness to something bigger than '0' (max is INFINITY -> 1000000). Best Regards, Strahil Nikolov On Thu, Jul 15, 2021 at 15:02, Ervin Hegedüs wrote: Hi there, I have to build a very simple cluster with only one resource: a virtual IP. The "challenge": * there are two nodes in the cluster: primary and secondary * if the primary node fails, the secondary brings up the interface with the virtual IP * BUT: if the primary comes back again, the resource must stay on the secondary Is there any way to solve this? # crm --version crm 4.0.0 Thanks, a. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
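A sketch of how that stickiness could be set. The crmsh line matches the poster's crm 4.0.0; the pcs line is the equivalent on pcs-based distributions, and the resource name "vip" is hypothetical:

```shell
# cluster-wide default: once started, any resource prefers to stay put
crm configure rsc_defaults resource-stickiness=100

# or per resource, with pcs (resource name "vip" is an example)
pcs resource meta vip resource-stickiness=100
```

With a positive stickiness, a failback after the primary node returns would cost more than staying on the secondary, so the VIP stays where it is.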
Re: [ClusterLabs] unexpected fenced node and promotion of the new master PAF - postgres
If you experience multiple outages, you should consider enabling the kdump feature of sbd. It will increase the takeover time, but might provide valuable info. Best Regards, Strahil Nikolov On Wed, Jul 14, 2021 at 15:12, Klaus Wenninger wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] QDevice vs 3rd host for majority node quorum
In some cases the third location has a single IP and it makes sense to use it as a QDevice. If it has multiple network connections to that location - use a full-blown node. Best Regards, Strahil Nikolov On Tue, Jul 13, 2021 at 20:44, Andrei Borzenkov wrote: On 13.07.2021 19:52, Gerry R Sommerville wrote: > Hello everyone, > I am currently comparing using QDevice vs adding a 3rd host to my > even-number-node cluster and I am wondering about the details concerning > network > communication. > For example, say my cluster is utilizing multiple heartbeat rings. Would the > QDevice take into account and use the IPs specified in the different rings? > Or No. > does it only use the one specified under the quorum directive for QDevice? Yes. Remote device is unrelated to corosync rings. Qdevice receives information of current cluster membership from all nodes (point of view), computes partitions and selects partition that will remain quorate. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] @ maillist Admins - DMARC (yahoo)
Actually, I don't mind but it will be nice if I don't get kicked from time to time due to too many bounces. :) Best Regards,Strahil Nikolov On Sat, 2021-07-10 at 12:34 +0100, lejeczek wrote: > Hi Admins(of this mailing list) > > Could you please fix in DMARC(s) so those of us who are on > Yahoo would be able to receive own emails/thread. > > many thanks, L. I suppose we should do something, since this is likely to be more of an issue as time goes on. Unfortunately, it's not as simple as flipping a switch. These are the two reasonable choices: (1) Change the "From" on list messages so that they appear to be from the list, rather than the poster. For example, your posts would show up as "From: lejeczek via ClusterLabs Users " rather than "From: lejeczek ". This is less intrusive but makes it more difficult to reply directly to the sender, add the sender to an address book, etc. (2) Stop adding [ClusterLabs] to subject lines, setting ReplyTo: to the list instead of original author, and adding the list signature. This is more standards-compliant, since the List-* headers can still be used for filtering, unsubscribing, and replying to the list, but not all mail clients make those easy to use. Anyone have preferences for one over the other? (Less reasonable options include wrapping every post in MIME, and disallowing users from DMARC domains to post to the list.) -- Ken Gaillot ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] ZFS Opinions?
DRBD is a good option but if you use it with iSCSI - don't forget to add the iSCSI block devices into the lvm filter (the client will most probably use it as an LVM PV). Another option that I like is GlusterFS. It's easy to deploy and performance scales out with the number of servers. Of course, performance tuning on both is not a trivial task - as any other performance tuning. Best Regards, Strahil Nikolov On Sat, Jul 10, 2021 at 0:12, Eric Robinson wrote: ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
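As a sketch, the LVM filter on the iSCSI target (server) side could look like the excerpt below. The device paths are hypothetical and must be adjusted to your actual local disks and exported devices:

```shell
# Excerpt from /etc/lvm/lvm.conf on the storage/DRBD nodes.
# Reject the devices exported over iSCSI so the server's LVM never
# scans or activates PVs that belong to the clients.
devices {
    # accept the local system disk, reject DRBD-backed exports, reject the rest
    global_filter = [ "a|^/dev/sda|", "r|^/dev/drbd|", "r|.*|" ]
}
```

After changing the filter, rebuilding the LVM device cache (e.g. `vgscan`) and regenerating the initramfs is usually needed for it to take full effect at boot.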
Re: [ClusterLabs] VIP monitor Timed Out
I would try to add 'trace_ra=1' or 'trace_ra=1 trace_file=' to debug it further. In the first option (without trace_file), the file will be at /var/lib/heartbeat/trace_ra//*timestamp Are you sure that the system is not overloaded and can't respond in time? Best Regards, Strahil Nikolov On Friday, July 2, 2021 at 17:53:06 GMT+3, PASERO Florent wrote: Hi, Once or twice a week, we have a 'Timed out' on our VIP: ~$ pcs status Cluster name: zbx_pprod_Web_Core Cluster Summary: * Stack: corosync * Current DC: #(version 2.0.5-9.el8_4.1-ba59be7122) - partition with quorum * Last updated: Mon Jun 28 16:32:09 2021 * Last change: Mon Jun 14 12:42:57 2021 by root via cibadmin on ## * 2 nodes configured * 2 resource instances configured Node List: * Online: [ # #] Full List of Resources: * Resource Group: zbx_pprod_Web_Core: * VIP (ocf::heartbeat:IPaddr2): Started # * ZabbixServer (systemd:zabbix-server): Started ## Failed Resource Actions: * VIP_monitor_5000 on # 'error' (1): call=69, status='Timed Out', exitreason='', last-rc-change='2021-06-24 14:41:57 +02:00', queued=0ms, exec=0ms * VIP_monitor_5000 on # 'error' (1): call=11, status='Timed Out', exitreason='', last-rc-change='2021-06-17 14:18:20 +02:00', queued=0ms, exec=0ms We have the same issue on two completely different clusters. We can see in the log : Jun 24 14:41:29 # pacemaker-execd [1442069] (child_timeout_callback) warning: VIP_monitor_5000 process (PID 2752333) timed out Jun 24 14:41:34 #pacemaker-execd [1442069] (child_timeout_callback) crit: VIP_monitor_5000 process (PID 2752333) will not die! 
Jun 24 14:41:57 # pacemaker-execd [1442069] (operation_finished) warning: VIP_monitor_5000[2752333] timed out after 2ms Jun 24 14:41:57 # pacemaker-controld [1442072] (process_lrm_event) error: Result of monitor operation for VIP on #: Timed Out | call=69 key=VIP_monitor_5000 timeout=2ms Jun 24 14:41:57 # pacemaker-based [1442067] (cib_process_request) info: Forwarding cib_modify operation for section status to all (origin=local/crmd/722) Jun 24 14:41:57 # pacemaker-based [1442067] (cib_perform_op) info: Diff: --- 0.54.443 2 Jun 24 14:41:57 # pacemaker-based [1442067] (cib_perform_op) info: Diff: +++ 0.54.444 (null) Jun 24 14:41:57 # pacemaker-based [1442067] (cib_perform_op) info: + /cib: @num_updates=444 Thanks for help Classification : Internal This message and any attachments (the "message") is intended solely for the intended addressees and is confidential. If you receive this message in error, or are not the intended recipient(s), please delete it and any copies from your systems and immediately notify the sender. Any unauthorized view, use that does not comply with its purpose, dissemination or disclosure, either whole or partial, is prohibited. Since the internet cannot guarantee the integrity of this message which may not be reliable, BNP PARIBAS (and its subsidiaries) shall not be liable for the message if modified, changed or falsified. Do not print this message unless it is necessary, consider the environment. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
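A sketch of how the tracing advice above could be applied with pcs. The resource name "VIP" is taken from the thread; the trace directory layout is the usual resource-agents default, so verify it on your distribution:

```shell
# enable shell tracing inside the IPaddr2 resource agent
pcs resource update VIP trace_ra=1

# by default one timestamped trace file per operation lands under:
ls -lt /var/lib/heartbeat/trace_ra/IPaddr2/

# optionally redirect the traces to a directory of your choice
pcs resource update VIP trace_ra=1 trace_file=/var/log/cluster/vip-trace

# disable tracing again once the next timeout has been captured
pcs resource update VIP trace_ra=0
```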
Re: [ClusterLabs] Updating quorum configuration without restarting cluster
Also, it's worth mentioning that you can still make changes without downtime. For example you can edit corosync conf and push it to all nodes, then set global maintenance, stop the cluster and then start it again. Best Regards,Strahil Nikolov On Mon, Jun 21, 2021 at 9:37, Jan Friesse wrote: Gerry, > Dear community, > > I would like to ask few questions regarding Corosync/Pacemaker quorum > configuration. > > When updating the Corosync's quorum configuration I added last_man_standing, > and > auto_tie_breaker in corosync.conf on all hosts and refreshed with > 'corosync-cfgtool -R'. > Note that that man page included with the rpm says that the -R option with > "Tell > all instances of corosync in this cluster to reload corosync.conf." > > Next I run 'corosync-quorumtool -s', but it did not show the new quorum flags > for auto tiebreaker and last man standing. > > Once I restarted the corosync cluster, the auto tiebreaker flags and last man > standing flags appeared in the corosync-quorumtool output as I expected. > > So my questions are: > 1. Does corosync-quorumtool actually shows the active quorum configuration? If > not how can I query the active quorum config? Yes, corosync-quorumtool shows quorum configuration which is really used (it's actually only source of truth, cmap is not). > > 2. Is it possible to update the quorum configuration without restarting the > cluster? Partly. Basically only quorum.two_node and quorum.expected_votes are changeable during runtime. Other options like: - quorum.allow_downscale - quorum.wait_for_all - quorum.last_man_standing - quorum.auto_tie_breaker - quorum.auto_tie_breaker_node are not (wait_for_all is a little bit more complicated - when not explicitly set/unset it follows two_node so it is possible, but only in this special case, to change it via changing two_node). Regards, Honza btw. I've already replied to Janghyuk Boo so mostly copying same answer also here. 
> > Thank you, > Gerry Sommerville > E-mail: ge...@ca.ibm.com <mailto:ge...@ca.ibm.com> > > > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Updating quorum configuration without restarting cluster
You can reload corosync via 'pcs' and I think that both are supported. The main question is if you did reload corosync on all nodes in the cluster? Best Regards, Strahil Nikolov On Sat, Jun 19, 2021 at 1:22, Gerry R Sommerville wrote: Dear community, I would like to ask a few questions regarding Corosync/Pacemaker quorum configuration. When updating the Corosync's quorum configuration I added last_man_standing, and auto_tie_breaker in corosync.conf on all hosts and refreshed with 'corosync-cfgtool -R'. Note that the man page included with the rpm says that the -R option will "Tell all instances of corosync in this cluster to reload corosync.conf." Next I ran 'corosync-quorumtool -s', but it did not show the new quorum flags for auto tiebreaker and last man standing. Once I restarted the corosync cluster, the auto tiebreaker flags and last man standing flags appeared in the corosync-quorumtool output as I expected. So my questions are: 1. Does corosync-quorumtool actually show the active quorum configuration? If not how can I query the active quorum config? 2. Is it possible to update the quorum configuration without restarting the cluster? Thank you, Gerry Sommerville E-mail: ge...@ca.ibm.com ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
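As the thread establishes, only a few quorum options (quorum.two_node, quorum.expected_votes) are changeable at runtime; for those, the distribute-reload-verify cycle could look like this sketch (node names are placeholders):

```shell
# copy the edited config to every node, then ask all corosync
# instances in the cluster to reload it
for node in node1 node2 node3; do
    scp /etc/corosync/corosync.conf "$node":/etc/corosync/corosync.conf
done
corosync-cfgtool -R

# corosync-quorumtool is the source of truth for the *active* quorum
# configuration - check its "Flags:" line after the reload
corosync-quorumtool -s
```

Options such as last_man_standing and auto_tie_breaker will only appear in the Flags: line after a full cluster restart, which matches the behaviour Gerry observed.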
Re: [ClusterLabs] Correctly stop pacemaker on 2-node cluster with SBD and failed devices?
Maybe you can try: while true ; do echo '0' > /proc/sys/kernel/nmi_watchdog ; sleep 1 ; done and in another shell stop pacemaker and sbd. I guess the only way to easily reproduce is with sbd over iscsi. Best Regards, Strahil Nikolov On Tue, Jun 15, 2021 at 21:30, Andrei Borzenkov wrote: On 15.06.2021 20:48, Strahil Nikolov wrote: > I'm using 'pcs cluster stop' (or its crm alternative), yet I'm not sure if it > will help in this case. > No it won't. It will still stop pacemaker. > Most probably the safest way is to wait for the storage to be recovered, as > without the pacemaker<->SBD communication, sbd will stop and the watchdog > will be triggered. > What makes you think I am not aware of it? can you suggest the steps to avoid it? ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Correctly stop pacemaker on 2-node cluster with SBD and failed devices?
I'm using 'pcs cluster stop' (or its crm alternative), yet I'm not sure if it will help in this case. Most probably the safest way is to wait for the storage to be recovered, as without the pacemaker<->SBD communication, sbd will stop and the watchdog will be triggered. Best Regards, Strahil Nikolov On Tuesday, June 15, 2021 at 18:47:06 GMT+3, Andrei Borzenkov wrote: On Tue, Jun 15, 2021 at 6:43 PM Strahil Nikolov wrote: > > How did you stop pacemaker ? systemctl stop pacemaker surprise :) > Usually I use 'pcs cluster stop' or its crm alternative. > > Best Regards, > Strahil Nikolov > > On Tue, Jun 15, 2021 at 18:21, Andrei Borzenkov > wrote: > We had the following situation > > 2-node cluster with single device (just single external storage > available). Storage failed. So SBD lost access to the device. Cluster > was still up, both nodes were running. > > We thought that access to storage was restored, but one step was > missing so devices appeared empty. > > At this point I tried to restart the pacemaker. But as soon as I > stopped pacemaker SBD rebooted nodes - which is logical, as quorum was > now lost. > > How to cleanly stop pacemaker in this case and keep nodes up? > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Correctly stop pacemaker on 2-node cluster with SBD and failed devices?
How did you stop pacemaker? Usually I use 'pcs cluster stop' or its crm alternative. Best Regards, Strahil Nikolov On Tue, Jun 15, 2021 at 18:21, Andrei Borzenkov wrote: We had the following situation 2-node cluster with single device (just single external storage available). Storage failed. So SBD lost access to the device. Cluster was still up, both nodes were running. We thought that access to storage was restored, but one step was missing so devices appeared empty. At this point I tried to restart the pacemaker. But as soon as I stopped pacemaker SBD rebooted nodes - which is logical, as quorum was now lost. How to cleanly stop pacemaker in this case and keep nodes up? ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Re: Antw: Hanging OCFS2 Filesystem any one else?
Thanks for the update. Could it be something local to your environment? Have you checked mounting the OCFS2 on a vanilla system? Best Regards, Strahil Nikolov On Tue, Jun 15, 2021 at 12:01, Ulrich Windl wrote: Hi Guys! Just to keep you informed on the issue: I was informed that I'm not the only one seeing this problem, and there seems to be some "negative interference" between BtrFS reorganizing its extents periodically and OCFS2 making reflink snapshots (a local cron job here) in current SUSE SLES kernels. It seems that happens almost exactly at 0:00 o'clock. The only thing that BtrFS and OCFS2 have in common here is that BtrFS provides the mount point for OCFS2. Regards, Ulrich >>> Ulrich Windl wrote on 02.06.2021 at 11:00 in message <60B748A4.E0C : 161 : 60728>: >>>> Gang He wrote on 02.06.2021 at 08:34 in message > om> > > > Hi Ulrich, > > > > The hang problem looks like a fix (90bd070aae6c4fb5d302f9c4b9c88be60c8197ec > > ocfs2: fix deadlock between setattr and dio_end_io_write), but it is not > 100% > > sure. > > If possible, could you help to report a bug to SUSE, then we can work on > > that further. > > Hi! > > Actually a service request for the issue is open at SUSE. However I don't > know which L3 engineer is working on it. > I have some "funny" effects, like these: > On one node "ls" hangs, but can be interrupted with ^C; on another node "ls" > also hangs, but cannot be stopped with ^C or ^Z > (Most processes cannot even be killed with "kill -9") > "ls" on the directory also hangs, just as an "rm" for a non-existent file > > What I really wonder is what triggered the effect, and more importantly how > to recover from it. > Initially I had suspected a rather full (95%) filesystem, but that means > there are still 24GB available. > The other suspect was concurrent creation of reflink snapshots while the > file being snapshot did change (e.g. 
allocate a hole in a sparse file) > > Regards, > Ulrich > > > > > Thanks > > Gang > > > > > > From: Users on behalf of Ulrich Windl > > > > Sent: Tuesday, June 1, 2021 15:14 > > To: users@clusterlabs.org > > Subject: [ClusterLabs] Antw: Hanging OCFS2 Filesystem any one else? > > > >>>> Ulrich Windl schrieb am 31.05.2021 um 12:11 in Nachricht <60B4B65A.A8F : 161 > > > : > > 60728>: > >> Hi! > >> > >> We have an OCFS2 filesystem shared between three cluster nodes (SLES 15 SP2, > >> Kernel 5.3.18‑24.64‑default). The filesystem is filled up to about 95%, and > >> we have an odd effect: > >> A stat() systemcall to some of the files hangs indefinitely (state "D"). > >> ("ls ‑l" and "rm" also hang, but I suspect those are calling state() > >> internally, too). > >> My only suspect is that the effect might be related to the 95% being used. > >> The other suspect is that concurrent reflink calls may trigger the effect. > >> > >> Did anyone else experience something similar? > > > > Hi! > > > > I have some details: > > It seems there is a reader/writer deadlock trying to allocate additional > > blocks for a file. > > The stacktrace looks like this: > > Jun 01 07:56:31 h16 kernel: rwsem_down_write_slowpath+0x251/0x620 > > Jun 01 07:56:31 h16 kernel: ? __ocfs2_change_file_space+0xb3/0x620 [ocfs2] > > Jun 01 07:56:31 h16 kernel: __ocfs2_change_file_space+0xb3/0x620 [ocfs2] > > Jun 01 07:56:31 h16 kernel: ocfs2_fallocate+0x82/0xa0 [ocfs2] > > Jun 01 07:56:31 h16 kernel: vfs_fallocate+0x13f/0x2a0 > > Jun 01 07:56:31 h16 kernel: ksys_fallocate+0x3c/0x70 > > Jun 01 07:56:31 h16 kernel: __x64_sys_fallocate+0x1a/0x20 > > Jun 01 07:56:31 h16 kernel: do_syscall_64+0x5b/0x1e0 > > > > That is the only writer (on that host), bit there are multiple readers like > > this: > > Jun 01 07:56:31 h16 kernel: rwsem_down_read_slowpath+0x172/0x300 > > Jun 01 07:56:31 h16 kernel: ? dput+0x2c/0x2f0 > > Jun 01 07:56:31 h16 kernel: ? 
lookup_slow+0x27/0x50 > > Jun 01 07:56:31 h16 kernel: lookup_slow+0x27/0x50 > > Jun 01 07:56:31 h16 kernel: walk_component+0x1c4/0x300 > > Jun 01 07:56:31 h16 kernel: ? path_init+0x192/0x320 > > Jun 01 07:56:31 h16 kernel: path_lookupat+0x6e/0x210 > > Jun 01 07:56:31 h16 kernel: ? __put_lkb+0x45/0xd0 [dlm] > > Jun 01 07:56:31 h16 kernel: filename_lookup+0xb6/0x190 > > Jun 01 07:56:31 h16 kernel: ? kmem_cache_alloc+0x3d/0x250 > > Jun 01 0
Re: [ClusterLabs] A systemd resource monitor is still in progress: re-scheduling
Did you notice any delay in 'systemctl status openstack-cinder-scheduler'? As far as I know the cluster will use systemd (or maybe even dbus) to get the info of the service. Also, a 10s monitor interval seems quite aggressive - have you considered increasing that? Best Regards, Strahil Nikolov On Sun, Jun 13, 2021 at 17:45, Acewind wrote: Dear guys, I'm using pacemaker-1.1.20 to construct an openstack HA system. After I stop the cluster, the pcs monitor operation always stays in progress for the cinder-volume & cinder-scheduler services. But the systemd service is active and openstack is working well. How does pacemaker monitor a normal systemd resource? > pcs resource show r_systemd_openstack-cinder-scheduler Resource: r_systemd_openstack-cinder-scheduler (class=systemd type=openstack-cinder-scheduler) Operations: monitor interval=10s timeout=100s (r_systemd_openstack-cinder-scheduler-monitor-interval-10s) stop interval=0s timeout=100s (r_systemd_openstack-cinder-scheduler-stop-interval-0s) 2021-06-13 20:50:42 pcs cluster stop --all 2021-06-13 20:50:56 pcs cluster start --all Jun 13 20:53:16 [4057851] host001 lrmd: info: action_complete: r_systemd_openstack-cinder-scheduler monitor is still in progress: re-scheduling (elapsed=54372ms, remaining=45628ms, start_delay=2000ms) Jun 13 20:53:18 [4057851] host001 lrmd: info: action_complete: r_systemd_openstack-cinder-scheduler monitor is still in progress: re-scheduling (elapsed=56374ms, remaining=43626ms, start_delay=2000ms) Jun 13 20:53:20 [4057851] host001 lrmd: info: action_complete: r_systemd_openstack-cinder-scheduler monitor is still in progress: re-scheduling (elapsed=58375ms, remaining=41625ms, start_delay=2000ms) Jun 13 20:53:22 [4057854] host001 crmd: notice: process_lrm_event: Result of stop operation for r_systemd_openstack-cinder-scheduler on host001: 0 (ok) | call=71 key=r_systemd_openstack-cinder-scheduler_stop_0 confirmed=true cib-update=59 The whole log file is included in attachment. Thanks! 
___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
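A sketch of how the monitor interval could be relaxed with pcs, using the resource and service names from the thread:

```shell
# replace the aggressive 10s monitor with a 60s one on the existing resource
pcs resource update r_systemd_openstack-cinder-scheduler \
    op monitor interval=60s timeout=100s

# cross-check what systemd itself reports about the unit,
# since that is where pacemaker gets the status from
systemctl status openstack-cinder-scheduler
```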
Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
Based on the constraint rules you have mentioned, failure of mysql should not cause a failover to another node. For better insight, you have to be able to reproduce the issue and share the logs with the community. Best Regards, Strahil Nikolov On Sat, Jun 5, 2021 at 23:33, Eric Robinson wrote: > -Original Message- > From: Users On Behalf Of > kgail...@redhat.com > Sent: Friday, June 4, 2021 4:49 PM > To: Cluster Labs - All topics related to open-source clustering welcomed > > Subject: Re: [ClusterLabs] One Failed Resource = Failover the Cluster? > > On Fri, 2021-06-04 at 19:10 +, Eric Robinson wrote: > > Sometimes it seems like Pacemaker fails over an entire cluster when > > only one resource has failed, even though no other resources are > > dependent on it. Is that expected behavior? > > > > For example, suppose I have the following colocation constraints… > > > > filesystem with drbd master > > vip with filesystem > > mysql_01 with filesystem > > mysql_02 with filesystem > > mysql_03 with filesystem > > By default, a resource that is colocated with another resource will influence > that resource's location. This ensures that as many resources are active as > possible. > > So, if any one of the above resources fails and meets its migration- > threshold, > all of the resources will move to another node so a recovery attempt can be > made for the failed resource. > > No resource will be *stopped* due to the failed resource unless it depends > on it. > Thanks, but I'm confused by your previous two paragraphs. On one hand, "if any one of the above resources fails and meets its migration- threshold, all of the resources will move to another node." Obviously moving resources requires stopping them. But then, "No resource will be *stopped* due to the failed resource unless it depends on it." Those two statements seem contradictory to me. Not trying to be argumentative. Just trying to understand. 
> As of the forthcoming 2.1.0 release, the new "influence" option for > colocation constraints (and "critical" resource meta-attribute) controls > whether this effect occurs. If influence is turned off (or the resource made > non-critical), then the failed resource will just stop, and the other > resources > won't move to try to save it. > That sounds like the feature I'm waiting for. In the example configuration I provided, I would not want the failure of any mysql instance to cause cluster failover. I would only want the cluster to failover if the filesystem or drbd resources failed. Basically, if a resource breaks or fails to stop, I don't want the whole cluster to failover if nothing depends on that resource. Just let it stay down until someone can manually intervene. But if an underlying resource fails that everything else is dependent on (drbd or filesystem) then go ahead and failover the cluster. > > > > …and the following order constraints… > > > > promote drbd, then start filesystem > > start filesystem, then start vip > > start filesystem, then start mysql_01 > > start filesystem, then start mysql_02 > > start filesystem, then start mysql_03 > > > > Now, if something goes wrong with mysql_02, will Pacemaker try to fail > > over the whole cluster? And if mysql_02 can’t be run on either > > cluster, then does Pacemaker refuse to run any resources? > > > > I’m asking because I’ve seen some odd behavior like that over the > > years. Could be my own configuration mistakes, of course. > > > > -Eric > -- > Ken Gaillot > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ Disclaimer : This email and any files transmitted with it are confidential and intended solely for intended recipients. If you are not the named addressee you should not disseminate, distribute, copy or alter this email. 
Any views or opinions presented in this email are solely those of the author and might not represent those of Physician Select Management. Warning: Although Physician Select Management has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments. ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
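Per Ken's note about the forthcoming 2.1.0 release, the desired behaviour ("just let a failed mysql stay down, don't fail over the group") could be sketched as below. This is a hedged example only: it requires Pacemaker >= 2.1.0, and the resource names are taken from the thread.

```shell
# Pacemaker >= 2.1.0 only: mark the mysql instances as non-critical so
# their failure no longer pulls the colocation set to the other node;
# the failed instance simply stops until someone intervenes.
pcs resource meta mysql_01 critical=false
pcs resource meta mysql_02 critical=false
pcs resource meta mysql_03 critical=false
```

The drbd and filesystem resources keep the default critical=true, so their failure still triggers a full failover, which matches Eric's requirement.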
Re: [ClusterLabs] One Failed Resource = Failover the Cluster?
It shouldn't relocate or affect any other resource, as long as the stop succeeds. If the stop operation times out or fails -> fencing kicks in. Best Regards, Strahil Nikolov ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Cluster Stopped, No Messages?
Did you configure the pacemaker blackbox? If not, it could be valuable in such cases. Also consider updating as soon as possible. Most probably nobody can count the bug fixes that were introduced between 7.5 and 7.9, nor will anyone be able to help as you are running a pretty outdated version (even by RH standards). Best Regards, Strahil Nikolov ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Cluster Stopped, No Messages?
I agree -> fencing is mandatory. You can enable the debug logs by editing corosync.conf or /etc/sysconfig/pacemaker. In case a simple reload doesn't work, you can set the cluster in global maintenance, stop and then start the stack. Best Regards, Strahil Nikolov On Fri, May 28, 2021 at 22:13, Digimer wrote: On 2021-05-28 3:08 p.m., Eric Robinson wrote: > >> -Original Message- >> From: Digimer >> Sent: Friday, May 28, 2021 12:43 PM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson ; Strahil >> Nikolov >> Subject: Re: [ClusterLabs] Cluster Stopped, No Messages? >> >> Shared storage is not what triggers the need for fencing. Coordinating >> actions >> is what triggers the need. Specifically; If you can run resource on both/all >> nodes at the same time, you don't need HA. If you can't, you need fencing. >> >> Digimer > > Thanks. That said, there is no fencing, so any thoughts on why the node > behaved the way it did? Without fencing, when a communication or membership issue arises, it's hard to predict what will happen. I don't see anything in the short log snippet to indicate what happened. What's on the other node during the event? When did the node disappear and when was it rejoined, to help find relevant log entries? Going forward, if you want predictable and reliable operation, implement fencing asap. Fencing is required. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
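A sketch of the debug-logging knobs mentioned above. Paths and variable names follow the usual RHEL/CentOS defaults; verify them against your installed version:

```shell
# Excerpt from /etc/sysconfig/pacemaker - raise pacemaker's own log detail
PCMK_debug=yes
PCMK_logfile=/var/log/pacemaker/pacemaker.log

# Equivalent corosync.conf fragment (logging section):
#   logging {
#       to_logfile: yes
#       logfile: /var/log/cluster/corosync.log
#       debug: on
#   }
```

Debug logging is verbose, so it is usually enabled only while reproducing the problem and turned off again afterwards.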
Re: [ClusterLabs] Pacemaker not issuing start command intermittently
Most RA scripts are written in bash. Usually you can change the shebang to '#!/usr/bin/bash -x', or you can set trace_ra=1 via 'pcs resource update RESOURCE trace_ra=1 trace_file=/somepath'. If you don't define trace_file, the traces should be created in /var/lib/heartbeat/trace_ra (based on memory -> so use find/locate).

Best Regards,
Strahil Nikolov

On Fri, May 28, 2021 at 22:10, Abithan Kumarasamy wrote:
Hello Team,

We have been recently running some tests on our Pacemaker clusters that involve two Pacemaker resources on two nodes respectively. The test case in which we are experiencing intermittent problems is one in which we bring down the Pacemaker resources on both nodes simultaneously. Our expected behaviour is that the monitor function in our resource agent script detects the downtime and then issues a start command. This happens on most successful iterations of our test case. However, on some iterations (approximately 1 out of 30 simulations) we notice that Pacemaker issues the start command on only one of the hosts. On the troubled host the monitor function logs that the resource is down, as expected, and exits with the OCF_ERR_GENERIC return code (1). According to the documentation, this should perform a soft recovery, but when scanning the Pacemaker logs there is no indication of the start command being issued or invoked. However, it works as expected on the other host.

To summarize the issue:
- The resource’s monitor is running and returning OCF_ERR_GENERIC.
- The constraints we have for the resources are satisfied.
- There are no visible differences in the Pacemaker logs between the failed test iteration and the multiple successful ones, other than that Pacemaker does not start the resource after the monitor returns OCF_ERR_GENERIC.

Could you provide some more insight into why this may be happening and how we can further debug this issue?
We are currently relying on Pacemaker logs, but are there additional diagnostics to further debug?

Thanks,
Abithan
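The tracing advice above can be sketched as follows. `RESOURCE` is a placeholder resource name, and the trace directory differs between distributions, hence the find fallback.

```shell
# Sketch: tracing a resource agent (RESOURCE is a placeholder name).
pcs resource update RESOURCE trace_ra=1 trace_file=/var/tmp/ra-trace

# Without trace_file, many ocf:heartbeat agents write traces under
# /var/lib/heartbeat/trace_ra; locate the directory if unsure:
# find / -type d -name trace_ra 2>/dev/null
```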
Re: [ClusterLabs] Cluster Stopped, No Messages?
What is your fencing agent?

Best Regards,
Strahil Nikolov

On Thu, May 27, 2021 at 20:52, Eric Robinson wrote:
We found one of our cluster nodes down this morning. The server was up but cluster services were not running. Upon examination of the logs, we found that the cluster just stopped around 9:40:31 and then I started it up manually (pcs cluster start) at 11:49:48. I can’t imagine that Pacemaker just randomly terminates. Any thoughts why it would behave this way?

May 27 09:25:31 [92170] 001store01a pengine: notice: process_pe_message: Calculated transition 91482, saving inputs in /var/lib/pacemaker/pengine/pe-input-756.bz2
May 27 09:25:31 [92171] 001store01a crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
May 27 09:25:31 [92171] 001store01a crmd: info: do_te_invoke: Processing graph 91482 (ref=pe_calc-dc-1622121931-124396) derived from /var/lib/pacemaker/pengine/pe-input-756.bz2
May 27 09:25:31 [92171] 001store01a crmd: notice: run_graph: Transition 91482 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-756.bz2): Complete
May 27 09:25:31 [92171] 001store01a crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
May 27 09:25:31 [92171] 001store01a crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
May 27 09:40:31 [92171] 001store01a crmd: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (90ms)
May 27 09:40:31 [92171] 001store01a crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped
May 27 09:40:31 [92171] 001store01a crmd: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
May 27 09:40:31 [92170] 001store01a pengine: info: process_pe_message: Input has not changed since last time, not saving to disk
May 27 09:40:31 [92170] 001store01a pengine: info: determine_online_status: Node 001store01a is online
May 27 09:40:31 [92170] 001store01a pengine: info: determine_op_status: Operation monitor found resource p_pure-ftpd-itls active on 001store01a
May 27 09:40:31 [92170] 001store01a pengine: warning: unpack_rsc_op_failure: Processing failed op monitor for p_vip_ftpclust01 on 001store01a: unknown error (1)
May 27 09:40:31 [92170] 001store01a pengine: info: determine_op_status: Operation monitor found resource p_pure-ftpd-etls active on 001store01a
May 27 09:40:31 [92170] 001store01a pengine: info: unpack_node_loop: Node 1 is already processed
May 27 09:40:31 [92170] 001store01a pengine: info: unpack_node_loop: Node 1 is already processed
May 27 09:40:31 [92170] 001store01a pengine: info: common_print: p_vip_ftpclust01 (ocf::heartbeat:IPaddr2): Started 001store01a
May 27 09:40:31 [92170] 001store01a pengine: info: common_print: p_replicator (systemd:pure-replicator): Started 001store01a
May 27 09:40:31 [92170] 001store01a pengine: info: common_print: p_pure-ftpd-etls (systemd:pure-ftpd-etls): Started 001store01a
May 27 09:40:31 [92170] 001store01a pengine: info: common_print: p_pure-ftpd-itls (systemd:pure-ftpd-itls): Started 001store01a
May 27 09:40:31 [92170] 001store01a pengine: info: LogActions: Leave p_vip_ftpclust01 (Started 001store01a)
May 27 09:40:31 [92170] 001store01a pengine: info: LogActions: Leave p_replicator (Started 001store01a)
May 27 09:40:31 [92170] 001store01a pengine: info: LogActions: Leave p_pure-ftpd-etls (Started 001store01a)
May 27 09:40:31 [92170] 001store01a pengine: info: LogActions: Leave p_pure-ftpd-itls (Started 001store01a)
May 27 09:40:31 [92170] 001store01a pengine: notice: process_pe_message: Calculated transition 91483, saving inputs in /var/lib/pacemaker/pengine/pe-input-756.bz2
May 27 09:40:31 [92171] 001store01a crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
May 27 09:40:31 [92171] 001store01a crmd: info: do_te_invoke: Processing graph 91483 (ref=pe_calc-dc-1622122831-124397) derived from /var/lib/pacemaker/pengine/pe-input-756.bz2
May 27 09:40:31 [92171] 001store01a crmd: notice: run_graph: Transition 91483 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-756.bz2): Complete
May 27 09:40:31 [92171] 001store01
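To answer the fencing-agent question above, the configured stonith devices can be inspected with standard pcs commands. This is a sketch: on older pcs releases the config command is `pcs stonith show --full` instead.

```shell
# Sketch: inspecting the fencing setup with pcs.
pcs stonith status      # which stonith devices exist and where they run
pcs stonith config      # full device configuration (older pcs: pcs stonith show --full)

# stonith-enabled must be true for fencing to take any action:
pcs property list --all | grep stonith-enabled
```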
Re: [ClusterLabs] OCFS2 fragmentation with snapshots
Are you using KVM? Maybe you can create a snapshot at the VM level, run defragfs.ocfs2 on the read-only part of the VM disk file, and after the defrag merge it back by deleting the snapshot? Still, the whole idea seems wrong to me. I would freeze the FS and the application in the VM and then make a snapshot via your virtualization tech stack.

Best Regards,
Strahil Nikolov

On Tue, May 18, 2021 at 13:52, Ulrich Windl wrote:
Hi!

I thought using the reflink feature of OCFS2 would be just a nice way to make crash-consistent VM snapshots while they are running. As it is a bit tricky to find out how much data is shared between snapshots, I started to write a utility to examine the blocks allocated to the VM backing files and snapshots. Unfortunately (as it seems) OCFS2 fragments terribly under reflink snapshots.

Here is an example of a rather "good" file: it has 85 extents that are rather large (note that the extents are sorted by first block; in reality it's a bit worse):

DEBUG(5): update_stats: blk_list[0]: 3551627-3551632 (6, 0x2000)
DEBUG(5): update_stats: blk_list[1]: 3553626-3556978 (3353, 0x2000)
DEBUG(5): update_stats: blk_list[2]: 16777217-16780688 (3472, 0x2000)
DEBUG(5): update_stats: blk_list[3]: 16780689-16792832 (12144, 0x2000)
DEBUG(5): update_stats: blk_list[4]: 17301147-17304618 (3472, 0x2000)
DEBUG(5): update_stats: blk_list[5]: 17304619-17316762 (12144, 0x2000)
...
DEBUG(5): update_stats: blk_list[81]: 31178385-31190528 (12144, 0x2000)
DEBUG(5): update_stats: blk_list[82]: 31191553-31195024 (3472, 0x2000)
DEBUG(5): update_stats: blk_list[83]: 31195025-31207168 (12144, 0x2000)
DEBUG(5): update_stats: blk_list[84]: 31210641-31222385 (11745, 0x2001)
filesystem: 655360 blocks of size 16384
655360 (100%) blocks type 0x2000 (shared)

And here's a terrible example (33837 extents):

DEBUG(4): finalize_blockstats: blk_list[0]: 257778-257841 (64, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[1]: 257842-257905 (64, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[2]: 263503-263513 (11, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[3]: 263558-263558 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[4]: 263559-263569 (11, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[5]: 263587-263587 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[6]: 263597-263610 (14, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[7]: 270414-270415 (2, 0x2000)
...
DEBUG(4): finalize_blockstats: blk_list[90]: 382214-382406 (193, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[91]: 382791-382918 (128, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[92]: 382983-382990 (8, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[93]: 383520-383522 (3, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[94]: 384672-384692 (21, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[95]: 384860-384918 (59, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[96]: 385088-385089 (2, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[97]: 385090-385091 (2, 0x2000)
...
DEBUG(4): finalize_blockstats: blk_list[805]: 2769213-2769213 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[806]: 2769214-2769214 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[807]: 2769259-2769259 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[808]: 2769261-2769261 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[809]: 2769314-2769314 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[810]: 2772041-2772042 (2, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[811]: 2772076-2772076 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[812]: 2772078-2772078 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[813]: 2772079-2772080 (2, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[814]: 2772096-2772096 (1, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[815]: 2772099-2772099 (1, 0x2000)
...
DEBUG(4): finalize_blockstats: blk_list[33829]: 39317682-39317704 (23, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33830]: 39317770-39317775 (6, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33831]: 39318022-39318045 (24, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33832]: 39318274-39318284 (11, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33833]: 39318327-39318344 (18, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33834]: 39319157-39319166 (10, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33835]: 39319172-39319184 (13, 0x2000)
DEBUG(4): finalize_blockstats: blk_list[33836]: 39319896-39319936 (41, 0x2000)
filesystem: 1966076 blocks of size 16384
mapped=1121733 (57%)
1007658 (51%) blocks type 0x2000 (shared)
114075 (6%) blocks type 0x2800 (unwritten|shared)

So I wonder (while understanding the principle of copy-on-write for reflink snapshots): Is there a way to avoid or undo the fragmentation?

Regards,
Ulrich
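The freeze-then-snapshot approach suggested above can be sketched as follows. This is a sketch under assumptions: the guest mount point `/data` and VM name `GUESTNAME` are placeholders, and libvirt is assumed as the virtualization stack.

```shell
# Sketch: crash-consistent VM snapshot with a frozen guest filesystem.

# Inside the guest: quiesce the application, then freeze the filesystem
# (fsfreeze is part of util-linux; /data is a placeholder mount point):
fsfreeze --freeze /data

# On the host: take the snapshot via the virtualization stack, e.g. libvirt:
# virsh snapshot-create-as GUESTNAME snap1 --disk-only

# Inside the guest: thaw once the snapshot exists:
fsfreeze --unfreeze /data
```

With the qemu guest agent installed, `virsh snapshot-create-as ... --quiesce` can perform the freeze/thaw automatically instead of running fsfreeze by hand.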
Re: [ClusterLabs] DRBD + VDO HowTo?
Also, pacemaker has very fine-grained control over when and where to run your resources (and even which resources to colocate them with).

Best Regards,
Strahil Nikolov

On Tue, May 18, 2021 at 12:43, Strahil Nikolov wrote:
> That was the first thing I tried. The systemd service does not work because it wants to stop and start all vdo devices, but mine are on different nodes.

That's why I mentioned creating your own version of the systemd service.

Best Regards,
Strahil Nikolov
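The fine-grained placement control mentioned above looks roughly like this in pcs. The resource and node names (`vdo-dev`, `drbd-fs`, `node1`) are placeholders for illustration, not the poster's actual configuration.

```shell
# Sketch: placement control in pacemaker via pcs (placeholder names).

# Prefer running vdo-dev on node1 (score 100, not mandatory):
pcs constraint location vdo-dev prefers node1=100

# Keep drbd-fs on the same node as vdo-dev (mandatory colocation):
pcs constraint colocation add drbd-fs with vdo-dev INFINITY

# Start vdo-dev before drbd-fs:
pcs constraint order vdo-dev then drbd-fs
```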