Re: [ClusterLabs] Recommendation Fencing

2024-09-02 Thread Klaus Wenninger
On Sat, Aug 31, 2024 at 5:13 PM Angelo M Ruggiero via Users < users@clusterlabs.org> wrote: > Hi, > > Thanks for the previous replies. I am currently reading them. > > Can I be cheeky? I have been researching and having some other > "organisation issues" around fencing. > > Maybe it is possible to

Re: [ClusterLabs] Batch start (approx > 5) of systemd resources fail (exitreason='inactive')

2024-08-28 Thread Klaus Wenninger
On Wed, Aug 28, 2024 at 8:04 PM Ken Gaillot wrote: > Thanks for the report, and sorry for the slow response. > > There is a longstanding goal to improve systemd resource monitoring by > using DBus signals instead of polling the status: > > https://projects.clusterlabs.org/T25 > > There's a good

Re: [ClusterLabs] Resource stop sequence with massive CIB update

2024-08-16 Thread Klaus Wenninger
On Tue, Aug 13, 2024 at 12:16 AM wrote: > Hi Ken, > > thank you very much for your prompt help! > > > The second one: --sync-call only waits for the change to be committed to > the CIB, > > not for the cluster to respond. For that, call crm_resource --wait > afterward. > > It seems like crm_resource

Re: [ClusterLabs] Need advice: deep pacemaker integration, best approach?

2024-06-10 Thread Klaus Wenninger
On Mon, Jun 10, 2024 at 6:12 PM Ken Gaillot wrote: > On Sun, 2024-06-09 at 23:13 +0300, ale...@pavlyuts.ru wrote: > > Hi All, > > > > We intend to integrate Pacemaker as failover engine into a very > > specific product. The handmade prototype works pretty well. It > > includes a couple of dozens

Re: [ClusterLabs] Strange behavior of Resource stickiness

2024-05-28 Thread Klaus Wenninger
On Tue, May 28, 2024 at 12:34 PM Александр Руденко wrote: > Andrei, thank you! > > I tried to find node's scores and have found location constraints for > these 3 resources: > > pcs constraint > Location Constraints: > Resource: fsmt-28085F00 > Enabled on: > Node: vdc16 (score:INFINIT

Re: [ClusterLabs] Strange behavior of Resource stickiness

2024-05-28 Thread Klaus Wenninger
On Tue, May 28, 2024 at 10:40 AM Александр Руденко wrote: > Hi! > > I can't understand this strange behavior, help me please. > > I have 3 nodes in my cluster, 4 vCPU/8GB RAM each. And about 70 groups, 2 > resources in each group. First one resource is our custom resource which > configures Linux

Re: [ClusterLabs] crm services not getting started after upgrading to snmp40

2024-05-23 Thread Klaus Wenninger
On Wed, May 22, 2024 at 5:16 PM ., Anoop wrote: > What is the "certain filesystem"? If cluster services require it, that > would explain why they can't start. > - Here we have btrfs and xfs filesystems. Yes cluster services require > these filesystem to be mounted. > What do the systemd journal

Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice connenction disrupted.

2024-05-06 Thread Klaus Wenninger
On Fri, May 3, 2024 at 8:59 PM wrote: > Hi, > > > > Also, I've done wireshark capture and found great mess in TCP, it > > > seems like connection between qdevice and qnetd really stops for some > > > time and packets won't deliver. > > > > Could you check UDP? I guess there is a lot of UDP packet

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-23 Thread Klaus Wenninger
On Tue, Apr 23, 2024 at 10:34 AM Klaus Wenninger wrote: > > > On Tue, Apr 23, 2024 at 9:53 AM NOLIBOS Christophe < > christophe.noli...@thalesgroup.com> wrote: > >> Classified as: {OPEN} >> >> >> >> Other strange thing. >> >> On RHE

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-23 Thread Klaus Wenninger
it would be restarted. Klaus > > > *From:* Klaus Wenninger > *Sent:* Monday, 22 April 2024 12:41 > *To:* NOLIBOS Christophe > *Cc:* Cluster Labs - All topics related to open-source clustering > welcomed > *Subject:* Re: [ClusterLabs] "pacemakerd: recover properly from

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-22 Thread Klaus Wenninger
Maybe pacemaker changed behavior here without syncing enough with corosync behavior. We'll look into that to see which approach is better - restart corosync on failure - or have pacemaker be restarted by systemd which should in turn restart corosync as well. Klaus > > > Thanks a lot.

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-22 Thread Klaus Wenninger
d by the process - so that the exit-code could be set to 0 - should be fine. Klaus > > *From:* Klaus Wenninger > *Sent:* Thursday, 18 April 2024 20:17 > *To:* NOLIBOS Christophe > *Cc:* Cluster Labs - All topics related to open-source clustering > welcomed > *Subject:* Re: [Clu

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-18 Thread Klaus Wenninger
*On behalf of* NOLIBOS > Christophe via Users > *Sent:* Thursday, 18 April 2024 18:34 > *To:* Klaus Wenninger ; Cluster Labs - All topics > related to open-source clustering welcomed > *Cc:* NOLIBOS Christophe > *Subject:* Re: [ClusterLabs] "pacemakerd: recover properly

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-18 Thread Klaus Wenninger
On Thu, Apr 18, 2024 at 6:09 PM Klaus Wenninger wrote: > > > On Thu, Apr 18, 2024 at 6:06 PM NOLIBOS Christophe < > christophe.noli...@thalesgroup.com> wrote: > >> Classified as: {OPEN} >> >> >> >> Well… why do you say that « Well if coros

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-18 Thread Klaus Wenninger
On Thu, Apr 18, 2024 at 5:07 PM NOLIBOS Christophe via Users < users@clusterlabs.org> wrote: > Classified as: {OPEN} > > I'm using RedHat 8.8 (4.18.0-477.21.1.el8_8.x86_64). > When I kill Corosync, no new corosync process is created and pacemaker is > in failure. > The only solution is to restart

Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-30 Thread Klaus Wenninger
On Tue, Jan 30, 2024 at 2:21 PM Walker, Chris wrote: > >>> However, now it seems to wait that amount of time before it elects a > >>> DC, even when quorum is acquired earlier. In my log snippet below, > >>> with dc-deadtime 300s, > >> > >> The dc-deadtime is not waiting for quorum, but for anoth

Re: [ClusterLabs] trigger something at ?

2024-01-29 Thread Klaus Wenninger
On Mon, Jan 29, 2024 at 5:22 PM Ken Gaillot wrote: > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote: > > Hi guys. > > > > Is it possible to trigger some... action - I'm thinking specifically > > at shutdown/start. > > If not within the cluster then - if you do that - perhaps outside.

Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-19 Thread Klaus Wenninger
On Tue, Dec 19, 2023 at 10:00 AM Andrei Borzenkov wrote: > On Tue, Dec 19, 2023 at 10:41 AM Artem wrote: > ... > > Dec 19 09:48:13 lustre-mds2.ntslab.ru pacemaker-schedulerd[785107] > (update_resource_action_runnable)warning: OST4_stop_0 on lustre4 is > unrunnable (node is offline) > > Dec 1

[ClusterLabs] Pacemaker 2.1.7-rc2 now available

2023-11-24 Thread Klaus Wenninger
Hi all, Source code for the 2nd release candidate for Pacemaker version 2.1.7 is available at: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7-rc2 This is primarily a bug fix release. See the ChangeLog or the link above for details. Everyone is encouraged to download, buil

Re: [ClusterLabs] PCS ACL for the "pcs cluster stop" command

2023-10-16 Thread Klaus Wenninger
On Fri, Oct 13, 2023 at 9:21 PM Reid Wahl wrote: > On Fri, Oct 13, 2023 at 12:19 PM Reid Wahl wrote: > > > > On Fri, Oct 13, 2023 at 9:56 AM Roberto Rodrigos > wrote: > > > > > > good day! > > > I use the configuration to create an ACL, it is shown below. How can I > restrict access to the "pcs

Re: [ClusterLabs] Syncronous primary doesn't switch to async mode on replica power off

2023-10-06 Thread Klaus Wenninger
On Fri, Oct 6, 2023 at 8:46 AM Sergey Cherukhin wrote: > Hello! > > I used Microsoft Outlook to send this message and it was sent in the wrong > format. I'm sorry. I won't do it again. > > I use Postgresql+Pacemaker+Corosync cluster with 2 Postgresql instances in > synchronous replication mode. P

Re: [ClusterLabs] Users Digest, Vol 104, Issue 5

2023-09-04 Thread Klaus Wenninger via Users
ge with subject or body 'help' to >> users-requ...@clusterlabs.org >> >> You can reach the person managing the list at >> users-ow...@clusterlabs.org >> >> When replying, please edit your Subject line so it is more specific >> than

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 1:50 PM Andrei Borzenkov wrote: > On Mon, Sep 4, 2023 at 2:18 PM Klaus Wenninger > wrote: > > > > > > > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan > wrote: > >> > >> Hi Klaus, > >> > >> With defau

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 1:44 PM Andrei Borzenkov wrote: > On Mon, Sep 4, 2023 at 2:25 PM Klaus Wenninger > wrote: > > > > > > Or go for qdevice with LMS where I would expect it to be able to really > go down to > > a single node left - any of the 2 last ones - as

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
On Mon, Sep 4, 2023 at 1:18 PM Klaus Wenninger wrote: > > > On Mon, Sep 4, 2023 at 12:45 PM David Dolan wrote: > >> Hi Klaus, >> >> With default quorum options I've performed the following on my 3 node >> cluster >> >> Bring down cluster s

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-09-04 Thread Klaus Wenninger
de is fenced if I bring >> down services on two nodes. >> Thanks >> David >> >> On Thu, 31 Aug 2023 at 11:44, Klaus Wenninger >> wrote: >> >>> >>> >>> On Thu, Aug 31, 2023 at 12:28 PM David Dolan >>> wrote: >>>

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-31 Thread Klaus Wenninger
On Thu, Aug 31, 2023 at 12:28 PM David Dolan wrote: > > > On Wed, 30 Aug 2023 at 17:35, David Dolan wrote: > >> >> >> > Hi All, >>> > >>> > I'm running Pacemaker on Centos7 >>> > Name: pcs >>> > Version : 0.9.169 >>> > Release : 3.el7.centos.3 >>> > Architecture: x86_64 >>> > >>>

Re: [ClusterLabs] issue during Pacemaker failover testing

2023-08-30 Thread Klaus Wenninger
On Wed, Aug 30, 2023 at 2:34 PM David Dolan wrote: > Hi All, > > I'm running Pacemaker on Centos7 > Name: pcs > Version : 0.9.169 > Release : 3.el7.centos.3 > Architecture: x86_64 > > Besides the pcs-version versions of the other cluster-stack-components could be interesting. (pac

Re: [ClusterLabs] Redis Resource error

2023-08-23 Thread Klaus Wenninger
On Tue, Aug 22, 2023 at 11:31 PM Social Boh wrote: > Hello List, > > I know this is not really a Pacemaker/Corosync-related question, but I don't > know how to solve this error: > > redis_start_0 on kam1.kamailio.xyz 'error' (1): call=148, status='Timed > Out', exitreason='Resource agent did not complete w

Re: [ClusterLabs] no-quorum-policy=ignore is (Deprecated ) and replaced with other options but not an effective solution

2023-06-27 Thread Klaus Wenninger
On Wed, Jun 28, 2023 at 7:38 AM Klaus Wenninger wrote: > > > On Wed, Jun 28, 2023 at 3:30 AM Priyanka Balotra < > priyanka.14balo...@gmail.com> wrote: > >> I am using SLES 15 SP4. Is the no-quorum-policy still supported? >> >> > Thanks >> Pri

Re: [ClusterLabs] no-quorum-policy=ignore is (Deprecated ) and replaced with other options but not an effective solution

2023-06-27 Thread Klaus Wenninger
o-quorum-policy=ignore is actually what >> you want. >> > Still dangerous without something like wait-for-all - right? With LMS I guess you should have the same effect without having explicitly specified though. Klaus > >> > >> > Thanks >> > Priyanka >

Re: [ClusterLabs] no-quorum-policy=ignore is (Deprecated ) and replaced with other options but not an effective solution

2023-06-27 Thread Klaus Wenninger
On Tue, Jun 27, 2023 at 5:24 PM Andrei Borzenkov wrote: > On 27.06.2023 07:21, Priyanka Balotra wrote: > > Hi Andrei, > > After this state the system went through some more fencings and we saw > the > > following state: > > > > :~ # crm status > > Cluster Summary: > >* Stack: corosync > >

Re: [ClusterLabs] Pacemaker logs written on message which is not expected as per configuration

2023-06-25 Thread Klaus Wenninger
On Fri, Jun 23, 2023 at 3:57 PM S Sathish S via Users wrote: > Hi Team, > > > > The pacemaker logs are written to both '/var/log/messages' and > '/var/log/pacemaker/pacemaker.log'. > > Could you please help us stop pacemaker processes from writing to > /var/log/messages? Even corosync configuration we

Re: [ClusterLabs] How to block/stop a resource from running twice?

2023-04-24 Thread Klaus Wenninger
On Fri, Apr 21, 2023 at 12:24 PM fs3000 via Users wrote: > Hello all, > > I'm configuring a two node cluster. Pacemaker 0.9.169 on Centos 7. > > guess this is rather the pcs-version ... > How can I configure a specific service to run just on one node and avoid > having it running on more than o

Re: [ClusterLabs] Offtopic - role migration

2023-04-19 Thread Klaus Wenninger
On Tue, Apr 18, 2023 at 9:09 PM Ken Gaillot wrote: > On Tue, 2023-04-18 at 19:50 +0200, Vladislav Bogdanov wrote: > > Btw, an interesting question. How much efforts would it take to > > support a migration of a Master role over the nodes? An use-case is > > drbd, configured for a multi-master mod

Re: [ClusterLabs] VirtualDomain - node map - ?

2023-04-17 Thread Klaus Wenninger
On Mon, Apr 17, 2023 at 6:17 AM Andrei Borzenkov wrote: > On 16.04.2023 16:29, lejeczek via Users wrote: > > > > > > On 16/04/2023 12:54, Andrei Borzenkov wrote: > >> On 16.04.2023 13:40, lejeczek via Users wrote: > >>> Hi guys > >>> > >>> Some agents do employ that concept of node/host map which

Re: [ClusterLabs] resource going to blocked status while we restart service via systemctl twice

2023-04-17 Thread Klaus Wenninger
On Mon, Apr 17, 2023 at 9:25 AM S Sathish S via Users wrote: > Hi Team, > > > > TEST_node1 resource going to blocked status while we restart service via > systemctl twice in less time/before completion of 1st systemctl command. > > In older pacemaker version 2.0.2 we don’t see this issue, only ob

Re: [ClusterLabs] Location not working [FIXED]

2023-04-12 Thread Klaus Wenninger
On Wed, Apr 12, 2023 at 9:27 AM Andrei Borzenkov wrote: > On Tue, Apr 11, 2023 at 6:27 PM Ken Gaillot wrote: > > > > On Tue, 2023-04-11 at 17:31 +0300, Miro Igov wrote: > > > I fixed the issue by changing location definition from: > > > > > > location intranet-ip_on_any_nginx intranet-ip \ > > >

Re: [ClusterLabs] pacemaker-remoted /dev/shm errors

2023-03-06 Thread Klaus Wenninger
On Mon, Mar 6, 2023 at 3:32 PM Christine caulfield wrote: > Hi, > > The error is coming from libqb - which is what manages the local IPC > connections between local clients and the server. > > I'm the libqb maintainer but I've never seen that error before! Is there > anything unusual about the se

Re: [ClusterLabs] resource cloned group colocations

2023-03-02 Thread Klaus Wenninger
On Thu, Mar 2, 2023 at 8:41 AM Gerald Vogt wrote: > Hi, > > I am setting up a mail relay cluster which main purpose is to maintain > the service ips via IPaddr2 and move them between cluster nodes when > necessary. > > The service ips should only be active on nodes which are running all > necessa

Re: [ClusterLabs] cluster with redundant links - PCSD offline

2023-02-28 Thread Klaus Wenninger
On Mon, Feb 27, 2023 at 6:25 PM Ken Gaillot wrote: > On Sun, 2023-02-26 at 18:15 +0100, lejeczek via Users wrote: > > Hi guys. > > > > I have a simple 2-node cluster with redundant links and I wonder why > > status reports like this: > > ... > > Node List: > > * Node swir (1): online, feature s

[ClusterLabs] sbd v1.5.2

2023-01-09 Thread Klaus Wenninger
Hi sbd - developers & users! Thanks to everybody for contributing to tests and further development. Only functional change is the first topic in the list below. And even that is 'just' refusing startup in a case where the config anyway wouldn't have led to a successful cluster startup. Improved

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Stonith

2022-12-21 Thread Klaus Wenninger
On Wed, Dec 21, 2022 at 4:51 PM Ken Gaillot wrote: > On Wed, 2022-12-21 at 10:45 +0100, Ulrich Windl wrote: > > > > > Ken Gaillot wrote on 20.12.2022 at > > > > > 16:21 in > > message > > <3a5960c2331f97496119720f6b5a760b3fe3bbcf.ca...@redhat.com>: > > > On Tue, 2022‑12‑20 at 11:33 +0300, An

Re: [ClusterLabs] Antw: [EXT] Re: Bug pacemaker with multiple IP

2022-12-21 Thread Klaus Wenninger
On Wed, Dec 21, 2022 at 11:26 AM Reid Wahl wrote: > On Wed, Dec 21, 2022 at 2:15 AM Ulrich Windl > wrote: > > > > Hi! > > > > I wonder: Could the error message be triggered by adding an exclusive > mandatory > > lock in the ip binary? > > If that triggers the bug, I'm rather sure that the error m

Re: [ClusterLabs] Samba failover and Windows access

2022-12-12 Thread Klaus Wenninger
On Sat, Dec 10, 2022 at 6:39 PM Dave Withheld wrote: > On Thu, Dec 8, 2022 at 8:03 AM Dave Withheld > wrote: > > In our production factory, we run a 2-node cluster on CentOS 8 with > pacemaker, a virtual IP, and drbd for shared storage with samba (among > other services) running as a resource on

Re: [ClusterLabs] Samba failover and Windows access

2022-12-07 Thread Klaus Wenninger
On Thu, Dec 8, 2022 at 8:03 AM Dave Withheld wrote: > In our production factory, we run a 2-node cluster on CentOS 8 with > pacemaker, a virtual IP, and drbd for shared storage with samba (among > other services) running as a resource on the active node. Everything works > great except when we f

Re: [ClusterLabs] Unable to build rpm using make rpm command for pacemaker-2.1.4.

2022-11-22 Thread Klaus Wenninger
On Tue, Nov 22, 2022 at 1:16 PM S Sathish S via Users wrote: > Hi Ken/Team, > > We have tried on pacemaker 2.1.1 also faced same issue , later we have > perform below steps as workaround to build pacemaker rpm as you said it run > from a git checkout and build rpm. > > #./autogen.sh > #./configur

Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-15 Thread Klaus Wenninger
On Sat, Nov 5, 2022 at 9:45 PM Jehan-Guillaume de Rorthais via Users < users@clusterlabs.org> wrote: > On Sat, 5 Nov 2022 20:53:09 +0100 > Valentin Vidić via Users wrote: > > > On Sat, Nov 05, 2022 at 06:47:59PM +, Robert Hayden wrote: > > > That was my impression as well...so I may have some

Re: [ClusterLabs] [External] : Re: Fence Agent tests

2022-11-15 Thread Klaus Wenninger
On Wed, Nov 9, 2022 at 2:58 PM Robert Hayden wrote: > > > -Original Message- > > From: Users On Behalf Of Andrei > > Borzenkov > > Sent: Wednesday, November 9, 2022 2:59 AM > > To: Cluster Labs - All topics related to open-source clustering welcomed > > > > Subject: Re: [ClusterLabs] [E

Re: [ClusterLabs] crm resource trace

2022-10-24 Thread Klaus Wenninger
o:* Pacemaker ML > *Subject:* Re: [ClusterLabs] crm resource trace > > > - On 24 Oct, 2022, at 10:08, Klaus Wenninger kwenn...@redhat.com > wrote: > > > On Mon, Oct 24, 2022 at 9:50 AM Xin Liang via Users < [ > > mailto:users@clusterlabs.org | > users@cluster

Re: [ClusterLabs] crm resource trace

2022-10-24 Thread Klaus Wenninger
On Mon, Oct 24, 2022 at 10:46 AM Lentes, Bernd < bernd.len...@helmholtz-muenchen.de> wrote: > > - On 24 Oct, 2022, at 10:08, Klaus Wenninger kwenn...@redhat.com > wrote: > > > On Mon, Oct 24, 2022 at 9:50 AM Xin Liang via Users < [ > > mailto:users@cluste

Re: [ClusterLabs] crm resource trace

2022-10-24 Thread Klaus Wenninger
On Mon, Oct 24, 2022 at 9:50 AM Xin Liang via Users wrote: > Hi Bernd, > > I got it, you are on SLE12SP5, and the crmsh version > is crmsh-4.1.1+git.1647830282.d380378a-2.74.2.noarch, right? > > I try to reproduce this inconsistent behavior, add an IPaddr2 agent vip, > run `crm resource trace vip

Re: [ClusterLabs] crm resource trace

2022-10-18 Thread Klaus Wenninger
On Mon, Oct 17, 2022 at 9:42 PM Ken Gaillot wrote: > This turned out to be interesting. > > In the first case, the resource history contains a start action and a > recurring monitor. The parameters to both change, so the resource > requires a restart. > > In the second case, the resource's histor

Re: [ClusterLabs] RFE: sdb clone

2022-09-27 Thread Klaus Wenninger
On Tue, Sep 20, 2022 at 3:59 PM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > Hi! > > I have a proposal (request) for enhancing sbd: > (I'm not suggesting a complete rewrite with reasonable options, as I had > done that before already ;-)) > When configuring an additional disk device,

Re: [ClusterLabs] (no subject)

2022-09-07 Thread Klaus Wenninger
On Wed, Sep 7, 2022 at 12:28 PM Jehan-Guillaume de Rorthais via Users wrote: > > Hey, > > On Wed, 7 Sep 2022 19:12:53 +0900 > 권오성 wrote: > > > Hello. > > I am a student who wants to implement a redundancy system with raspberry pi. > > Last time, I posted about how to proceed with installation on

Re: [ClusterLabs] Cluster does not start resources

2022-08-25 Thread Klaus Wenninger
On Wed, Aug 24, 2022 at 6:29 PM Lentes, Bernd wrote: > > > - On 24 Aug, 2022, at 16:26, kwenning kwenn...@redhat.com wrote: > > >> > >> if I get Ulrich right - and my fading memory of when I really used crmsh > >> the > >> last time is telling me the same thing ... > >> > > I get the impressi

Re: [ClusterLabs] Cluster does not start resources

2022-08-24 Thread Klaus Wenninger
On Wed, Aug 24, 2022 at 4:24 PM Klaus Wenninger wrote: > > On Wed, Aug 24, 2022 at 2:40 PM Lentes, Bernd > wrote: > > > > > > - On 24 Aug, 2022, at 07:21, Reid Wahl nw...@redhat.com wrote: > > > > > > > As a result, your command might star

Re: [ClusterLabs] Cluster does not start resources

2022-08-24 Thread Klaus Wenninger
On Wed, Aug 24, 2022 at 2:40 PM Lentes, Bernd wrote: > > > - On 24 Aug, 2022, at 07:21, Reid Wahl nw...@redhat.com wrote: > > > > As a result, your command might start the virtual machines, but > > Pacemaker will still show that the resources are "Stopped (disabled)". > > To fix that, you'll n

Re: [ClusterLabs] Start resource only if another resource is stopped

2022-08-19 Thread Klaus Wenninger
On Thu, Aug 18, 2022 at 8:26 PM Andrei Borzenkov wrote: > > On 17.08.2022 16:58, Miro Igov wrote: > > As you guessed i am using crm res stop nfs_export_1. > > I tried the solution with attribute and it does not work correct. > > > > It does what you asked for originally, but you are shifting the >

Re: [ClusterLabs] node1 and node2 communication time question

2022-08-09 Thread Klaus Wenninger
On Wed, Aug 10, 2022 at 3:49 AM 권오성 wrote: > > Thank you for your reply. > Then, can I think of it as being able to adjust the time by changing the > token in /etc/corosync/corosync.conf? That would basically be the time after which a non responsive node in a cluster would be declared dead and

Re: [ClusterLabs] Q: About a false negative of storage_mon

2022-08-05 Thread Klaus Wenninger
On Fri, Aug 5, 2022 at 9:30 AM Kazunori INOUE wrote: > > On Tue, Aug 2, 2022 at 11:09 PM Ken Gaillot wrote: > > > > On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote: > > > Hi, > > > > > > Since O_DIRECT is not specified in open() [1], it reads the buffer > > > cache and > > > may result in a false n

Re: [ClusterLabs] Antw: [EXT] Re: Q: About a false negative of storage_mon

2022-08-03 Thread Klaus Wenninger
On Wed, Aug 3, 2022 at 4:02 PM Ulrich Windl wrote: > > >>> Klaus Wenninger wrote on 03.08.2022 at 15:51 in > message > : > > On Tue, Aug 2, 2022 at 4:10 PM Ken Gaillot wrote: > >> > >> On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote: > >>

Re: [ClusterLabs] Q: About a false negative of storage_mon

2022-08-03 Thread Klaus Wenninger
On Tue, Aug 2, 2022 at 4:10 PM Ken Gaillot wrote: > > On Tue, 2022-08-02 at 19:13 +0900, 井上和徳 wrote: > > Hi, > > > > Since O_DIRECT is not specified in open() [1], it reads the buffer > > cache and > > may result in a false negative. I fear that this possibility > > increases > > in environments w

Re: [ClusterLabs] pacemaker-fenced[11637]: warning: Can't create a sane reply

2022-06-22 Thread Klaus Wenninger
On Wed, Jun 22, 2022 at 1:46 PM Priyanka Balotra wrote: > > Hi All, > > We are seeing an issue where we performed cluster shutdown followed by > cluster boot operation. All the nodes joined the cluster except one (the first > node). Here are some pacemaker logs around that timestamp: > > 2022-06-

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: Why not retry a monitor (pacemaker‑execd) that got a segmentation fault?

2022-06-15 Thread Klaus Wenninger
On Wed, Jun 15, 2022 at 2:10 PM Ulrich Windl wrote: > > >>> Klaus Wenninger wrote on 15.06.2022 at 13:22 in > message > : > > On Wed, Jun 15, 2022 at 10:33 AM Ulrich Windl > > wrote: > >> > > ... > > >> (As said abov

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Why not retry a monitor (pacemaker‑execd) that got a segmentation fault?

2022-06-15 Thread Klaus Wenninger
On Wed, Jun 15, 2022 at 10:33 AM Ulrich Windl wrote: > > >>> Klaus Wenninger wrote on 15.06.2022 at 10:00 in > message > : > > On Wed, Jun 15, 2022 at 8:32 AM Ulrich Windl > > wrote: > >> > >> >>> Ulrich Windl schri

Re: [ClusterLabs] Antw: [EXT] Re: Why not retry a monitor (pacemaker‑execd) that got a segmentation fault?

2022-06-15 Thread Klaus Wenninger
On Wed, Jun 15, 2022 at 8:32 AM Ulrich Windl wrote: > > >>> Ulrich Windl wrote on 14.06.2022 at 15:53 in message <62A892F0.174 : > >>> 161 : > 60728>: > > ... > > Yes it's odd, but isn't the cluster just to protect us from odd situations? > > ;-) > > I have more odd stuff: > Jun 14 20:40:09 r

Re: [ClusterLabs] fencing configuration

2022-06-07 Thread Klaus Wenninger
On Tue, Jun 7, 2022 at 10:27 AM Zoran Bošnjak wrote: > > Hi, I need some help with correct fencing configuration in 5-node cluster. > > The specific issue is that there are 3 rooms, where in addition to node > failure scenario, each room can fail too (for example in case of room power > failure

Re: [ClusterLabs] Antw: [EXT] Re: normal reboot with active sbd does not work

2022-06-07 Thread Klaus Wenninger
On Tue, Jun 7, 2022 at 7:53 AM Ulrich Windl wrote: > > >>> Andrei Borzenkov wrote on 03.06.2022 at 17:04 in > message <99f7746a-c962-33bb-6737-f88ba0128...@gmail.com>: > > On 03.06.2022 16:51, Zoran Bošnjak wrote: > >> Thanks for all your answers. Sorry, my mistake. The ipmi_watchdog is indee

Re: [ClusterLabs] normal reboot with active sbd does not work

2022-06-03 Thread Klaus Wenninger
On Fri, Jun 3, 2022 at 3:51 PM Zoran Bošnjak wrote: > > Thanks for all your answers. Sorry, my mistake. The ipmi_watchdog is indeed > OK. I was first experimenting with "softdog", which is blacklisted. So the > reasonable question is how to properly start "softdog" on ubuntu. > > The reason to u

Re: [ClusterLabs] normal reboot with active sbd does not work

2022-06-03 Thread Klaus Wenninger
On Fri, Jun 3, 2022 at 11:03 AM Klaus Wenninger wrote: > > On Fri, Jun 3, 2022 at 10:19 AM Zoran Bošnjak wrote: > > > > Hi all, > > I would appreciate an advice about sbd fencing (without shared storage). > > > > I am using ubuntu 20.04., with default packages

Re: [ClusterLabs] normal reboot with active sbd does not work

2022-06-03 Thread Klaus Wenninger
On Fri, Jun 3, 2022 at 10:19 AM Zoran Bošnjak wrote: > > Hi all, > I would appreciate an advice about sbd fencing (without shared storage). > > I am using ubuntu 20.04., with default packages from the repository > (pacemaker, corosync, fence-agents, ipmitool, pcs...). > > HW watchdog is present o

Re: [ClusterLabs] Antw: [EXT] Re: Cluster unable to find back together

2022-05-23 Thread Klaus Wenninger
On Fri, May 20, 2022 at 7:43 AM Ulrich Windl wrote: > > >>> Jan Friesse wrote on 19.05.2022 at 14:55 in > message > <1abb8468-6619-329f-cb01-3f51112db...@redhat.com>: > > Hi, > > > > On 19/05/2022 10:16, Leditzky, Fabian via Users wrote: > >> Hello > >> > >> We have been dealing with our pace

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Q: How to clean up a failed fencing operation?

2022-05-13 Thread Klaus Wenninger
On Fri, May 13, 2022 at 12:12 PM Ulrich Windl wrote: > > >>> Klaus Wenninger wrote on 13.05.2022 at 08:22 in > message > : > > On Tue, May 3, 2022 at 11:53 AM Ulrich Windl > > wrote: > >> > >> >>> Reid Wahl wrote on 03.05.2022

Re: [ClusterLabs] Antw: [EXT] Re: Q: How to clean up a failed fencing operation?

2022-05-12 Thread Klaus Wenninger
On Tue, May 3, 2022 at 11:53 AM Ulrich Windl wrote: > > >>> Reid Wahl wrote on 03.05.2022 at 10:16 in message > : > > On Tue, May 3, 2022 at 12:36 AM Ulrich Windl > > wrote: > >> > >> Hi! > >> > >> I'm familiar with cleaning up various failed resource actions via > > "crm_resource ‑C ‑r reso

Re: [ClusterLabs] Can a two node cluster start resources if only one node is booted?

2022-04-22 Thread Klaus Wenninger
On Thu, Apr 21, 2022 at 8:18 PM john tillman wrote: > > > On 21.04.2022 18:26, john tillman wrote: > >>> On 20. 04. 22 at 20:21 john tillman wrote: > > On 20.04.2022 19:53, john tillman wrote: > >> I have a two node cluster that won't start any resources if only one > >> node > >>>

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Coming in 2.1.3: node health monitoring improvements

2022-04-14 Thread Klaus Wenninger
On Thu, Apr 14, 2022 at 7:57 AM Ulrich Windl wrote: > > Ken, > > thanks for the explanations! Maybe it would be best (next time) if you > present the documentation for a new feature first (as a base for discussion), > and _then_ implement it. > I know: People first implement it, and later, if the

Re: [ClusterLabs] Resources too_active (active on all nodes of the cluster, instead of only 1 node)

2022-03-29 Thread Klaus Wenninger
On Thu, Mar 24, 2022 at 4:12 PM Ken Gaillot wrote: > > On Wed, 2022-03-23 at 05:30 +, Balotra, Priyanka wrote: > > Hi All, > > > > We have a scenario on SLES 12 SP3 cluster. > > The scenario is explained as follows in the order of events: > > There is a 2-node cluster (FILE-1, FILE-2) > > Th

Re: [ClusterLabs] Request for ideas: Cluster node summary in 14 characters

2022-03-17 Thread Klaus Wenninger
On Thu, Mar 17, 2022 at 4:16 PM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > Hi! > > I had the idea to display the status of a cluster node on the 14-character > LCD display of a Dell PowerEdge server; preferably displaying the hostname > at least partially, too ;-) > > Now, what wou

Re: [ClusterLabs] Noticed oddity when DC is going to be fenced

2022-03-01 Thread Klaus Wenninger
On Tue, Mar 1, 2022 at 10:05 AM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > Hi! > > For current SLES15 SP3 I noticed an oddity when the node running the DC is > going to be fenced: > It seems that another node is performing recovery operations while the old > DC is not confirmed to

Re: [ClusterLabs] Q: fence_kdump and fence_kdump_send

2022-02-28 Thread Klaus Wenninger
On Mon, Feb 28, 2022 at 2:46 PM Klaus Wenninger wrote: > > > On Sat, Feb 26, 2022 at 7:14 AM Strahil Nikolov via Users < > users@clusterlabs.org> wrote: > >> I always used this one for triggering kdump when using sbd: >> https://www.suse.com/support/kb/doc/?

Re: [ClusterLabs] Q: fence_kdump and fence_kdump_send

2022-02-28 Thread Klaus Wenninger
On Sat, Feb 26, 2022 at 7:14 AM Strahil Nikolov via Users < users@clusterlabs.org> wrote: > I always used this one for triggering kdump when using sbd: > https://www.suse.com/support/kb/doc/?id=19873 > > On Fri, Feb 25, 2022 at 21:34, Reid Wahl > wrote: > On Fri, Feb 25, 2022 at 3:47 AM Andre

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

2022-02-17 Thread Klaus Wenninger
On Thu, Feb 17, 2022 at 12:38 PM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > >>> Klaus Wenninger wrote on 17.02.2022 at 10:49 > in > message > : > ... > >> For completeness: Yes, sbd did recover: > >> Feb 14 13:01:42 h18 sbd[

Re: [ClusterLabs] Antw: [EXT] Re: Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

2022-02-17 Thread Klaus Wenninger
On Thu, Feb 17, 2022 at 10:14 AM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > >>> Klaus Wenninger schrieb am 16.02.2022 um 16:59 > in > Nachricht > : > > On Wed, Feb 16, 2022 at 4:26 PM Klaus Wenninger > wrote: > > > >> >

Re: [ClusterLabs] Antw: [EXT] Re: Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

2022-02-17 Thread Klaus Wenninger
On Thu, Feb 17, 2022 at 9:27 AM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > >>> Klaus Wenninger schrieb am 16.02.2022 um 16:26 > in > Nachricht > : > > On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl < > > ulrich.wi...@rz.uni-regensburg.de>

Re: [ClusterLabs] Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

2022-02-16 Thread Klaus Wenninger
On Wed, Feb 16, 2022 at 4:59 PM Klaus Wenninger wrote: > > > On Wed, Feb 16, 2022 at 4:26 PM Klaus Wenninger > wrote: > >> >> >> On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl < >> ulrich.wi...@rz.uni-regensburg.de> wrote: >> >>>

Re: [ClusterLabs] Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

2022-02-16 Thread Klaus Wenninger
On Wed, Feb 16, 2022 at 4:26 PM Klaus Wenninger wrote: > > > On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl < > ulrich.wi...@rz.uni-regensburg.de> wrote: > >> Hi! >> >> When changing some FC cables I noticed that sbd complained 2 seconds >> after the

Re: [ClusterLabs] Q: sbd: Which parameter controls "error: servant_md: slot read failed in servant."?

2022-02-16 Thread Klaus Wenninger
On Wed, Feb 16, 2022 at 3:09 PM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > Hi! > > When changing some FC cables I noticed that sbd complained 2 seconds after > the connection went down (even though the device is multi-pathed with > other paths being still up). > I don't know any s

Re: [ClusterLabs] ethernet link up/down - ?

2022-02-16 Thread Klaus Wenninger
On Tue, Feb 15, 2022 at 5:25 PM lejeczek via Users wrote: > > > On 07/02/2022 19:21, Antony Stone wrote: > > On Monday 07 February 2022 at 20:09:02, lejeczek via Users wrote: > > > >> Hi guys > >> > >> How do you guys go about doing link up/down as a resource? > > I apply or remove addresses on t

Re: [ClusterLabs] Antw: [EXT] Cluster Removing VIP and Not Following Order Constraint

2022-02-11 Thread Klaus Wenninger
On Fri, Feb 11, 2022 at 9:13 AM Strahil Nikolov via Users < users@clusterlabs.org> wrote: > Shouldn't you use kind 'Mandatory' and symmetrical TRUE? > > If true, the reverse of the constraint applies for the opposite action > (for example, if B starts after A starts, then B stops before A stops).

Re: [ClusterLabs] Is there a python package for pacemaker ?

2022-02-03 Thread Klaus Wenninger
On Wed, Feb 2, 2022 at 7:06 PM Ken Gaillot wrote: > On Wed, 2022-02-02 at 18:46 +0100, Lentes, Bernd wrote: > > Hi, > > > > i need to write some scripts for our cluster. Until now i wrote bash > > scripts. > > But i like to learn python. Is there a package for pacemaker ? > > What i found is: htt

Re: [ClusterLabs] Antw: [EXT] Removing a resource without stopping it

2022-01-31 Thread Klaus Wenninger
On Mon, Jan 31, 2022 at 2:43 PM Jehan-Guillaume de Rorthais wrote: > On Mon, 31 Jan 2022 08:49:44 +0100 > Klaus Wenninger wrote: > ... > > Depending on the environment it might make sense to think about > > having the manual migration-step controlled by the cluster(s)

Re: [ClusterLabs] Antw: [EXT] Removing a resource without stopping it

2022-01-30 Thread Klaus Wenninger
On Mon, Jan 31, 2022 at 8:19 AM Ulrich Windl < ulrich.wi...@rz.uni-regensburg.de> wrote: > >>> Digimer schrieb am 28.01.2022 um 22:38 in Nachricht > : > > Hi all, > > > >I'm trying to figure out how to move a running VM from one pacemaker > > cluster to another. I've got the storage and VM li

[ClusterLabs] sbd v1.5.1

2021-11-15 Thread Klaus Wenninger
Hi sbd - developers & users! Thanks to everybody for contributing to tests and further development. Changes since 1.5.1 - improve/fix cmdline handling - tell the actual watchdog device specified with -w - tolerate and strip any leading spaces of commandline option values - Sanitize numeri

Re: [ClusterLabs] Fence node when network interface goes down

2021-11-15 Thread Klaus Wenninger
On Mon, Nov 15, 2021 at 12:19 PM Andrei Borzenkov wrote: > On Mon, Nov 15, 2021 at 1:18 PM Klaus Wenninger > wrote: > > > > > > > > On Mon, Nov 15, 2021 at 10:37 AM S Rogers > wrote: > >> > >> I had thought about doing that, but the cluster is

Re: [ClusterLabs] Fence node when network interface goes down

2021-11-15 Thread Klaus Wenninger
On Mon, Nov 15, 2021 at 10:37 AM S Rogers wrote: > I had thought about doing that, but the cluster is then dependent on the > external system, and if that external system was to go down or become > unreachable for any reason then it would falsely cause the cluster to > failover or worse it could

Re: [ClusterLabs] Antw: [EXT] Re: VirtualDomain & "deeper" monitors - what/how?

2021-10-26 Thread Klaus Wenninger
On Mon, Oct 25, 2021 at 9:34 PM Kyle O'Donnell wrote: > Finally got around to working on this. > > I spoke with someone on the #clusterlabs IRC channel who mentioned that > the monitor_scripts param does indeed run at some frequency (op monitor > timeout=? interval=?), not just during the "start"

Re: [ClusterLabs] Antw: [EXT] DRBD split‑brain investigations, automatic fixes and manual intervention...

2021-10-20 Thread Klaus Wenninger
On Wed, Oct 20, 2021 at 12:06 PM Ian Diddams via Users < users@clusterlabs.org> wrote: > FWIW here is the basis for my implementation, being the "best" and most easily > followed drbd/clustering guide/explanation I could find when I searched > > Lisenet.com :: Linux | Security | Networking | Admin Blo

Re: [ClusterLabs] Trying to understand dampening (ping)

2021-10-15 Thread Klaus Wenninger
On Fri, Oct 15, 2021 at 12:01 PM Andrei Borzenkov wrote: > On Fri, Oct 15, 2021 at 9:25 AM Klaus Wenninger > wrote: > > > Main pain-point here is that ping-RA allows us to configure the count of > pings sent, but it > > is just using the exit-value from ping that becomes

Re: [ClusterLabs] Trying to understand dampening (ping)

2021-10-14 Thread Klaus Wenninger
On Thu, Oct 14, 2021 at 10:51 PM martin doc wrote: > > > -- > *From: *Andrei Borzenkov , Friday, 15 October 2021 > 4:59 AM > *...* > > Dampening defines delay before attributes are committed to CIB. > > Private attributes are never ever written into CIB, so dampening
