Re: [ClusterLabs] DRBD + VDO HowTo?

2021-05-18 Thread Strahil Nikolov
> That was the first thing I tried. The systemd service does not work because it wants to stop and start all vdo devices, but mine are on different nodes.
That's why I suggested creating your own version of the systemd service. Best Regards,Strahil

Re: [ClusterLabs] DRBD + VDO HowTo?

2021-05-18 Thread Strahil Nikolov
And why don't you use your own systemd service ? Best Regards,Strahil Nikolov___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] DRBD + VDO HowTo?

2021-05-17 Thread Strahil Nikolov
Have you tried to set VDO in async mode ? Best Regards,Strahil Nikolov On Mon, May 17, 2021 at 8:57, Klaus Wenninger wrote: Did you try VDO in sync-mode for the case the flush-fua stuff isn't working through the layers? Did you check that VDO-service is disabled and solely under

Re: [ClusterLabs] DRBD + VDO HowTo?

2021-05-16 Thread Strahil Nikolov
Are you sure that the DRBD is working properly ? Best Regards,Strahil Nikolov On Mon, May 17, 2021 at 0:32, Eric Robinson wrote:

Re: [ClusterLabs] Problem with the cluster becoming mostly unresponsive

2021-05-15 Thread Strahil Nikolov
hod and use stonith topology. Best Regards,Strahil Nikolov

Re: [ClusterLabs] DRBD + VDO HowTo?

2021-05-14 Thread Strahil Nikolov
There is no VDO RA to my knowledge, but you can use a systemd service as a resource. Yet, the VDO service that comes with the OS is a generic one and controls all VDOs - so you need to create your own vdo service. Best Regards,Strahil Nikolov On Fri, May 14, 2021 at 6:55, Eric

Re: [ClusterLabs] DRBD + VDO HowTo?

2021-05-14 Thread Strahil Nikolov
For DRBD there is enough info, so let's focus on VDO. There is a systemd service that starts all VDOs on the system. You can create the VDO once DRBD is open for writes and then you can create your own systemd '.service' file which can be used as a cluster resource. Best Regards,Strahil
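
The per-volume service suggested above could look roughly like the sketch below. It only prints a unit file for a single VDO volume; the volume name `vdo1`, the unit name `vdo-vdo1.service`, the `/usr/bin/vdo` path and the resource ID are assumptions for illustration and are not taken from the thread.

```shell
# Sketch: print a systemd unit that starts/stops exactly one VDO volume,
# instead of the stock service that handles every volume on the host.
# Assumption: the vdo manager lives at /usr/bin/vdo on this system.
vdo_unit() {
    vol="$1"
    cat <<EOF
[Unit]
Description=VDO volume ${vol} (managed by the cluster)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/vdo start --name=${vol}
ExecStop=/usr/bin/vdo stop --name=${vol}
EOF
}

# Hypothetical usage, as root on every node that may host the volume:
#   vdo_unit vdo1 > /etc/systemd/system/vdo-vdo1.service
#   systemctl daemon-reload
#   pcs resource create vdo1 systemd:vdo-vdo1
```

There is deliberately no [Install] section, since the unit is meant to be started by the cluster and not enabled at boot. The resource would still need ordering and colocation constraints against the promoted DRBD resource.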

Re: [ClusterLabs] Antw: [EXT] Moving multi-state resources

2021-05-13 Thread Strahil Nikolov
If something moves in/out in an unexpected way, always check: pcs constraint location --full | grep cli Best Regards,Strahil Nikolov On Thu, May 13, 2021 at 10:45, Andrei Borzenkov wrote: On Wed, May 12, 2021 at 8:15 PM Alastair Basden wrote: > > > > On 12.05.2021 20:02, Ala

Re: [ClusterLabs] Location of High Availability Repositories?

2021-05-13 Thread Strahil Nikolov
On EL8 I think it was named policycoreutils-python-utils or something similar. Best Regards,Strahil Nikolov On Thu, May 13, 2021 at 2:45, Eric Robinson wrote:

Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-11 Thread Strahil Nikolov
ocation constraint between the resources. Best Regards,Strahil Nikolov On Mon, May 10, 2021 at 17:53, Antony Stone wrote: On Monday 10 May 2021 at 16:49:07, Strahil Nikolov wrote: > You can use  node attributes to define in which  city is each host and then > use a location constraint to con

Re: [ClusterLabs] bit of wizardry bit of trickery needed.

2021-05-11 Thread Strahil Nikolov
Oh wrong thread, just ignore. Best Regards On Tue, May 11, 2021 at 13:54, Strahil Nikolov wrote: Here is the example I had promised:
pcs node attribute server1 city=LA
pcs node attribute server2 city=NY
# Don't run on any node that is not in LA
pcs constraint location DummyRes1 rule score

Re: [ClusterLabs] bit of wizardry bit of trickery needed.

2021-05-11 Thread Strahil Nikolov
ocation constraint between the resources. Best Regards,Strahil Nikolov On Tue, May 11, 2021 at 9:15, Klaus Wenninger wrote: On 5/10/21 7:16 PM, lejeczek wrote: > > > On 10/05/2021 17:04, Andrei Borzenkov wrote: >> On 10.05.2021 16:48, lejeczek wrote: >>> Hi guys >

Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-05-10 Thread Strahil Nikolov
You can use node attributes to define which city each host is in and then use a location constraint to control in which city to run/not run the resources. I will try to provide an example tomorrow. Best Regards,Strahil Nikolov On Mon, May 10, 2021 at 15:52, Antony Stone wrote
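
A minimal sketch of that idea with pcs follows. The attribute name `city` and the names `server1`, `server2` and `DummyRes1` mirror the example posted later in this thread listing; the function names and the `-INFINITY` score are assumptions made here, since the archived example is cut off before the score.

```shell
# Sketch: tag a node with the city it lives in.
tag_city() {
    node="$1"; city="$2"
    pcs node attribute "$node" "city=$city"
}

# Sketch: keep a resource off every node whose 'city' attribute
# differs from the given value (score is an assumption).
only_in_city() {
    rsc="$1"; city="$2"
    pcs constraint location "$rsc" rule score=-INFINITY city ne "$city"
}

# Hypothetical usage:
#   tag_city server1 LA
#   tag_city server2 NY
#   only_in_city DummyRes1 LA
```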

Re: [ClusterLabs] fencing

2021-05-08 Thread Strahil Nikolov
If you have SAN & Hardware Watchdog device, you can also use SBD. If SAN is lost and nodes cannot communicate - they will suicide. Best Regards,Strahil Nikolov

Re: [ClusterLabs] Resolving cart before the horse with mounted filesystems.

2021-04-30 Thread Strahil Nikolov
Ken meant to use a 'Filesystem' resource for mounting that NFS server and then clone that resource. Best Regards,Strahil Nikolov On Fri, Apr 30, 2021 at 18:44, Matthew Schumacher wrote: On 4/30/21 8:11 AM, Ken Gaillot wrote >> 2.  Make the nfs mount itself a resource and make Virtual
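
As a rough illustration of that suggestion, the sketch below creates an ocf:heartbeat:Filesystem resource for an NFS export and clones it so the mount is active on every node. The resource ID, export and mount point are placeholders, not values from the thread.

```shell
# Sketch: mount an NFS export through the cluster and clone the mount.
# All three arguments are hypothetical placeholders.
nfs_mount_clone() {
    id="$1"; src="$2"; dir="$3"
    pcs resource create "$id" ocf:heartbeat:Filesystem \
        device="$src" directory="$dir" fstype=nfs &&
    pcs resource clone "$id"
}

# Hypothetical usage:
#   nfs_mount_clone vm_images nfs1.example.com:/export/images /var/lib/libvirt/images
```

Resources that need the mount (the VirtualDomain resources in the quoted question) would then be ordered after the clone.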

Re: [ClusterLabs] VirtualDomain & "deeper" monitors - what/how?

2021-04-30 Thread Strahil Nikolov
Hey Ken, does this feature work for other Nagios stuff ? Best Regards,Strahil Nikolov On Fri, Apr 30, 2021 at 17:57, Ken Gaillot wrote: On Fri, 2021-04-30 at 11:00 +0100, lejeczek wrote: > Hi guys > > I'd like to ask around for thoughts & suggestions on any > se

Re: [ClusterLabs] Autostart/Enabling of Pacemaker and corosync

2021-04-26 Thread Strahil Nikolov
to hardware issues causes performance degradation on the current master. Both cases have their benefits and drawbacks and you have to weigh them all before taking that decision. Best Regards,Strahil Nikolov On Mon, Apr 26, 2021 at 20:04, Moneta, Howard wrote: Hello community.  I have

Re: [ClusterLabs] Preventing multiple resources from moving at the same time.

2021-04-22 Thread Strahil Nikolov
In order for iSCSI failover to be transparent to the relevant clients, you need to use a special resource that blocks the iSCSI port during the failover. TCP will retransmit during the failover and will never receive an error due to the fact that the VIP is missing. The name is ocf:heartbeat:portblock that
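
A sketch of how the portblock pair is commonly declared is shown below. The resource IDs and the VIP are hypothetical; port 3260 is the standard iSCSI port.

```shell
# Sketch: create a block/unblock pair of portblock resources for the
# iSCSI port on a virtual IP. Resource IDs are placeholders.
iscsi_portblock() {
    vip="$1"
    pcs resource create iscsi_block ocf:heartbeat:portblock \
        ip="$vip" portno=3260 protocol=tcp action=block &&
    pcs resource create iscsi_unblock ocf:heartbeat:portblock \
        ip="$vip" portno=3260 protocol=tcp action=unblock
}

# Hypothetical usage:
#   iscsi_portblock 192.0.2.10
```

The usual arrangement, stated here as an assumption about the intended design, is one group ordered as: block, VIP, iSCSI target and LUNs, unblock. That way the port only opens once everything behind it is up.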

Re: [ClusterLabs] Node fenced for unknown reason

2021-04-19 Thread Strahil Nikolov
IPMI fencing on some vendors will first try graceful shutdown and only then it will use ungraceful. Disabling power button is also described in  https://access.redhat.com/solutions/1578823  Best Regards,Strahil Nikolov

Re: [ClusterLabs] Single-node automated startup question

2021-04-14 Thread Strahil Nikolov
If it's a VM or container - it should be on a third location. Using a VM hosted on one of the nodes is like giving that node more votes in a two-node cluster. Cheap 3rd node for quorum makes more sense to me. Best Regards,Strahil Nikolov On Wed, Apr 14, 2021 at 21:19, Antony Stone wrote

Re: [ClusterLabs] Single-node automated startup question

2021-04-14 Thread Strahil Nikolov
What about a small form factor device to serve as a quorum maker ? Best Regards,Strahil Nikolov

Re: [ClusterLabs] Antw: [EXT] best practice for scripting

2021-04-13 Thread Strahil Nikolov
By the way , how do you monitor your pacemaker clusters ? We are using Nagios and I found only 'check_crm' but it looks like it was made for crmsh and most probably won't work with pcs without modifications. Best Regards,Strahil Nikolov On Tue, Apr 13, 2021 at 10:57, d tbsky wrote: Ulrich

Re: [ClusterLabs] PAF resource agent & stickiness - ?

2021-04-11 Thread Strahil Nikolov
Better check for a location constraint created via 'pcs resource move'! pcs constraint location --full | grep cli Best Regards,Strahil Nikolov On Sat, Apr 10, 2021 at 18:19, Jehan-Guillaume de Rorthais wrote: On 10 April 2021 14:22:34 GMT+02:00, lejeczek wrote: >Hi guys. >
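
The check above can be wrapped as in the sketch below. Constraints left behind by 'pcs resource move' or 'pcs resource ban' carry IDs starting with `cli-` (`cli-prefer-...`, `cli-ban-...`), which is what the grep relies on. The function names are made up for illustration.

```shell
# Sketch: list location constraints left behind by move/ban commands.
list_cli_constraints() {
    pcs constraint location --full | grep 'cli-'
}

# Sketch: drop the move/ban constraints of one resource again.
clear_move_constraint() {
    pcs resource clear "$1"
}

# Hypothetical usage:
#   list_cli_constraints
#   clear_move_constraint pgsql-ha
```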

Re: [ClusterLabs] Antw: [EXT] Re: how to setup single node cluster

2021-04-08 Thread Strahil Nikolov
Maybe booth can take care when it dies and powers up the resource in the DR. Best Regards,Strahil Nikolov On Thu, Apr 8, 2021 at 10:28, Ulrich Windl wrote: >>> Reid Wahl wrote on 08.04.2021 at 08:32 in message : > On Wed, Apr 7, 2021 at 11:27 PM d tbsky wrote: >

Re: [ClusterLabs] how to setup single node cluster

2021-04-08 Thread Strahil Nikolov
I always thought that the setup is the same, just the node count is only one. I guess you need pcs, corosync + pacemaker. If RH is going to support it, they will require fencing. Most probably sbd or ipmi are the best candidates. Best Regards,Strahil Nikolov On Thu, Apr 8, 2021 at 6:52, d

Re: [ClusterLabs] "iscsi.service: Unit cannot be reloaded because it is inactive."

2021-04-06 Thread Strahil Nikolov
it and it never ended into centos's wiki. Best Regards,Strahil Nikolov On Sat, Apr 3, 2021 at 17:52, Andrei Borzenkov wrote: On 03.04.2021 17:35, Jason Long wrote: > Hello, > I configure my clustering labs with three nodes. You have two node cluster. What is running on nodes outside of c

Re: [ClusterLabs] SAPHanaController & SAPHanaTopology question

2021-04-05 Thread Strahil Nikolov
: Mandatory) And also resource sets that take care that all FS start and then the relevant nfs_active resources. Also, It seems that regular order rules cannot be removed via ID , maybe a Feature request is needed. Best Regards,Strahil Nikolov If you mean a whole constraint set, then yes -- run

Re: [ClusterLabs] SAPHanaController & SAPHanaTopology question

2021-04-02 Thread Strahil Nikolov
and it looks safe to be killed (will check with SAP about that). P.S: Is there a way to remove a whole set in pcs , cause it's really irritating when the stupid command wipes the resource from multiple order constraints? Best Regards,Strahil Nikolov On Fri, Apr 2, 2021 at 23:44, Reid Wahl

[ClusterLabs] SAPHanaController & SAPHanaTopology question

2021-04-02 Thread Strahil Nikolov
the stop timeout  - leads to fencing(on-fail=fence). I thought that the Controller resource agent is stopping the HANA and the slave role should not be 'stopped' before that . Maybe my expectations are wrong ? Best Regards,Strahil Nikolov

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-31 Thread Strahil Nikolov
lot of custom stuff - I want to make it fool-proof as much as possible. I've already organised a discussion about those backup IPs. Best Regards,Strahil Nikolov On Wed, Mar 31, 2021 at 10:54, Andrei Borzenkov wrote: On Wed, Mar 31, 2021 at 8:34 AM Strahil Nikolov wrote: > > Damn..

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-31 Thread Strahil Nikolov
Damn... I am too hasty. It seems that the 2 resources I have already configured are also running on the master. The colocation constraint is like: rsc_bkpip3_SAPHana_SID_HDBinst_num with rsc_SAPHana_SID_HDBinst_num-clone (score: INFINITY) (node-attribute:hana_sid_site) (rsc-role:Started)

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-31 Thread Strahil Nikolov
Disregard the previous one... it needs 'pcs constraint colocation add' to work. Best Regards,Strahil Nikolov On Wed, Mar 31, 2021 at 8:08, Strahil Nikolov wrote: I guess that feature was added in a later version (still on RHEL 8.2). pcs constraint colocation bkp2 with Master

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-31 Thread Strahil Nikolov
can share the 'pcs cluster edit' xml section, so I can try to push it directly into the cib ? Best Regards,Strahil Nikolov On Tue, Mar 30, 2021 at 19:45, Andrei Borzenkov wrote: On 30.03.2021 17:42, Ken Gaillot wrote: >> >> Colocation does not work, this will force everything

Re: [ClusterLabs] Live migration possible with KSM ?

2021-03-30 Thread Strahil Nikolov
that KSM would be a problem... most probably performance would not be optimal. Best Regards,Strahil Nikolov On Tue, Mar 30, 2021 at 19:47, Andrei Borzenkov wrote: On 30.03.2021 18:16, Lentes, Bernd wrote: > Hi, > > currently i'm reading "Mastering KVM Virtualization",

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-30 Thread Strahil Nikolov
. Yet, I'm not happy to have my own scripts in the cluster's logic. Best Regards,Strahil Nikolov On Tue, Mar 30, 2021 at 10:06, Reid Wahl wrote: You can try the following and see if it works, replacing the items in angle brackets (<>).     # pcs constraint colocation add with

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-30 Thread Strahil Nikolov
Hi Ken, can you provide a prototype code example? Currently, I'm making a script that will be used in a systemd service managed by the cluster. Yet, I would like to avoid non-pacemaker solutions. Best Regards,Strahil Nikolov On Mon, Mar 29, 2021 at 20:12, Ken Gaillot wrote: On Sun, 2021-03

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-03-28 Thread Strahil Nikolov
I didn't mean DC as a designated coordinator, but as a physical datacenter location. Last time I checked, the node attributes for all nodes seemed the same. I will verify that tomorrow (Monday). Best Regards,Strahil Nikolov On Fri, Feb 19, 2021 at 16:51, Andrei Borzenkov wrote: On Fri

Re: [ClusterLabs] Which fence agent is needed for an Apache web server cluster?

2021-03-27 Thread Strahil Nikolov
ode of fence_ipmi. With triple sbd , I mean sbd with 3 block devices. Best Regards,Strahil Nikolov On Sat, Mar 27, 2021 at 23:15, Reid Wahl wrote: On Saturday, March 27, 2021, Strahil Nikolov wrote: > My notes: > - ilo ssh fence mechanism is crappy due to ilo itself, try to avoid if &

Re: [ClusterLabs] Which fence agent is needed for an Apache web server cluster?

2021-03-27 Thread Strahil Nikolov
and it never failed us. Yet, it's just a kernel module (no hardware required) and thus RH does not support such a setup. If you decide to use 'sbd', disable vendor's system recovery solution (like HPE's ASR) as it will also tinker with the watchdog. Best Regards,Strahil Nikolov On Sat, Mar 27, 2021

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Order set troubles

2021-03-27 Thread Strahil Nikolov
> I also remember something about racing with dnsmasq, at which point I'd say that making cluster depend on availability of DNS is e-h-h-h unwise
Not my choice... Or at least I would deploy bind/unbound caching servers in the same VLAN instead of dnsmasq. Also, Filesystem resource agent's read +

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Order set troubles

2021-03-26 Thread Strahil Nikolov
Thanks everyone! I really appreciate your help. Actually, I found a RH solution (#5423971) that gave me enough ideas /it is missing some steps/ to set up the cluster properly. So far, I have never used node attributes, order sets and location constraints based on 'ocf:pacemaker:attribute's

Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Order set troubles

2021-03-26 Thread Strahil Nikolov
Just a clarification. I'm using separate NFS shares for each HANA, so even if someone wipes the NFS for DC1, the cluster will failover to DC2 (separate NFS) and survive. Best Regards,Strahil Nikolov

Re: [ClusterLabs] Antw: [EXT] Re: Order set troubles

2021-03-25 Thread Strahil Nikolov
OCF_CHECK_LEVEL 20 NFS sometimes fails to start (systemd race condition with dnsmasq) Best Regards,Strahil Nikolov On Thu, Mar 25, 2021 at 12:18, Andrei Borzenkov wrote: On Thu, Mar 25, 2021 at 10:31 AM Strahil Nikolov wrote: > > Use Case: > > nfsA is shared filesystem for

Re: [ClusterLabs] Antw: [EXT] Re: Order set troubles

2021-03-25 Thread Strahil Nikolov
start on the nodes in site B. I think that it's a valid use case. Best Regards,Strahil Nikolov On Thu, Mar 25, 2021 at 8:59, Ulrich Windl wrote: >>> Ken Gaillot wrote on 24.03.2021 at 18:56 in message <5bffded9c6e614919981dcc7d0b2903220bae19d.ca...@redhat.com>: >

[ClusterLabs] Order set troubles

2021-03-24 Thread Strahil Nikolov
, Strahil Nikolov

Re: [ClusterLabs] Corosync - qdevice not voting

2021-03-19 Thread Strahil Nikolov
If firewalld is available, just try with 'firewall-cmd --panic-on' (or something like that). Best Regards,Strahil Nikolov On Fri, Mar 19, 2021 at 12:50, Marcelo Terres wrote:
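
For a quorum test of this kind, a time-boxed version is safer than switching panic mode on by hand. The sketch below is one way to do it; the function name and the `FIREWALL_CMD` override variable are assumptions for illustration, while `--panic-on` and `--panic-off` are real firewall-cmd options.

```shell
# Sketch: drop all network traffic on this node for N seconds, then
# restore it. FIREWALL_CMD is a hypothetical override hook.
isolate_node() {
    seconds="$1"
    ${FIREWALL_CMD:-firewall-cmd} --panic-on
    sleep "$seconds"
    ${FIREWALL_CMD:-firewall-cmd} --panic-off
}

# Hypothetical usage, from a local console (not over SSH, because the
# session is cut off together with the cluster traffic):
#   isolate_node 60
```

While the node is isolated, the quorum state can be watched from the other node with `corosync-quorumtool -s`.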

Re: [ClusterLabs] Corosync - qdevice not voting

2021-03-18 Thread Strahil Nikolov
Is there any reason to use lms mode for the qdevice ? Best Regards, Strahil Nikolov

Re: [ClusterLabs] maximum token value (knet)

2021-03-12 Thread Strahil Nikolov
for corosync only ? Best Regards,Strahil Nikolov On Fri, Mar 12, 2021 at 17:01, Jan Friesse wrote: Strahil, > Interesting... > Yet, this doesn't explain why token of 3 causes the nodes to never > assemble a cluster (waiting for half an hour, using wait_for_all=1) , while

Re: [ClusterLabs] Q: callback hooks for sbd?

2021-03-11 Thread Strahil Nikolov
fencing mechanism kicks in. Best Regards, Strahil Nikolov On Thursday, 11 March 2021, 19:16:04 GMT+2, Klaus Wenninger wrote: On 3/11/21 12:30 PM, Ulrich Windl wrote: > Hi! > > I wonder: Is it possible to register some callback to sbd that is called > whenev

Re: [ClusterLabs] maximum token value (knet)

2021-03-11 Thread Strahil Nikolov
. I was hoping that I missed in the documentation about the maximum token size... Best Regards, Strahil Nikolov On Thursday, 11 March 2021, 19:12:58 GMT+2, Jan Friesse wrote: Strahil, > Hello all, > I'm building a test cluster on RHEL8.2 and I have noticed that the c

[ClusterLabs] maximum token value (knet)

2021-03-11 Thread Strahil Nikolov
Hello all, I'm building a test cluster on RHEL8.2 and I have noticed that the cluster fails to assemble ( nodes stay inquorate as if the network is not working) if I set the token at 3 or more (30s+). What is the maximum token value with knet ? On SLES12 (I think it was  corosync 1) , I used

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-03 Thread Strahil Nikolov
When you change the token, you might consider adjusting the consensus timeout (see man corosync.conf). Best Regards,Strahil Nikolov
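
According to corosync.conf(5), consensus has to be at least 1.2 times the token value and is calculated as 1.2 * token when it is not set explicitly. The sketch below prints a matching totem fragment for a given token; the function name and the sample value are illustrative assumptions.

```shell
# Sketch: derive the minimum consensus (1.2 * token) for a token
# timeout in milliseconds and print a totem fragment.
totem_timeouts() {
    token_ms="$1"
    consensus_ms=$(( token_ms * 12 / 10 ))
    printf 'totem {\n    token: %s\n    consensus: %s\n}\n' \
        "$token_ms" "$consensus_ms"
}

# Hypothetical usage:
#   totem_timeouts 10000
```

The printed values would be merged into the existing totem section of corosync.conf, not appended as a second section.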

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-28 Thread Strahil Nikolov
As this is in Azure and they support shared disks , I think that a simple SBD could solve the stonith case. Best Regards,Strahil Nikolov

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-02-19 Thread Strahil Nikolov
> Do you have a fixed relation between node pairs and VIPs? I.e. must A/D always get VIP1, B/E - VIP2 etc?
I have to verify it again, but generally speaking - yes , VIP1 is always on nodeA/D (master), VIP2 on nodeB/E (worker1) , etc. I guess I can set negative constraints (-inf) -> VIP1 on

Re: [ClusterLabs] Antw: [EXT] Colocation per site ?

2021-02-19 Thread Strahil Nikolov
on nodeA
VIP2 on nodeB
VIP3 on nodeC
Master on nodeD
VIP1 on nodeD
VIP2 on nodeE
VIP3 on nodeF
Master down: All VIPs down
I think that Ken has mentioned a possible solution, but I have to check it out. Best Regards,Strahil Nikolov On Thu, Feb 18, 2021 at 9:40, Ulrich Windl wrote: >>>

[ClusterLabs] Colocation per site ?

2021-02-17 Thread Strahil Nikolov
Hello All, I'm currently in the process of building a SAP HANA Scale-out cluster and the HANA team has asked that all nodes on the active instance should have one IP for backup purposes. Yet, I'm not sure how to set up the constraints (if it is possible at all) so all IPs will follow the master

Re: [ClusterLabs] Antw: [EXT] Feedback wanted: using the systemd message catalog

2021-02-17 Thread Strahil Nikolov
Hi Ulrich, actually you can suppress them. Best Regards,Strahil Nikolov On Wed, Feb 17, 2021 at 13:04, Ulrich Windl wrote: Hi Ken, personally I think systemd is already logging too much, and I don't think that adding instructions to many log messages is actually helpful (It could

Re: [ClusterLabs] Feedback wanted: using the systemd message catalog

2021-02-16 Thread Strahil Nikolov
details use  journalctl -xe'. Isn't it easier to just provide more details in the logs than integrating that feature ? Best Regards,Strahil Nikolov On Tue, Feb 16, 2021 at 21:48, Ken Gaillot wrote: Hi all, The systemd journal has a feature called the message catalog, which allows

Re: [ClusterLabs] Antw: [EXT] weird xml snippet in "crm configure show"

2021-02-12 Thread Strahil Nikolov
WARNING: cib-bootstrap-options: unknown attribute 'no-quirum-policy' That looks like a typo. Best Regards,Strahil Nikolov On Fri, Feb 12, 2021 at 12:30, Lentes, Bernd wrote: - On Feb 12, 2021, at 11:18 AM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: > >

Re: [ClusterLabs] Antw: [EXT] Resource migration and constraint timeout

2021-01-28 Thread Strahil Nikolov
a configuration that any migration lifetime is, by default,  '8 hours' ( for example) and afterwards it expires (just like with timeout). Best Regards,Strahil Nikolov On Mon, Jan 25, 2021 at 18:16, Ken Gaillot wrote: On Mon, 2021-01-25 at 13:22 +0100, Ulrich Windl wrote

[ClusterLabs] Resource migration and constraint timeout

2021-01-25 Thread Strahil Nikolov
the defined timeout ? Best Regards,Strahil Nikolov

Re: [ClusterLabs] CCIB migration from Pacemaker 1.x to 2.x

2021-01-25 Thread Strahil Nikolov
fence_drac5 , fence_drac (not sure about that) , SBD Best Regards,Strahil Nikolov On Mon, Jan 25, 2021 at 11:23, Sharma, Jaikumar wrote:

Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-25 Thread Strahil Nikolov
I think that it makes sense, as '--all' should mean 'reach all servers and shutdown there'. Yet, when you run 'pcs cluster stop' - the migration of the resources is the only option. Still, it sounds like a bug. Best Regards,Strahil Nikolov On Sun, Jan 24

Re: [ClusterLabs] CCIB migration from Pacemaker 1.x to 2.x

2021-01-24 Thread Strahil Nikolov
> How to handle it?
You need to:
- Set up and TEST stonith
- Add a 3rd node (even if it doesn't host any resources) or set up a node for kronosnet
Best Regards, Strahil Nikolov

Re: [ClusterLabs] Antw: Re: Antw: [EXT] DRBD ms resource keeps getting demoted

2021-01-23 Thread Strahil Nikolov
eout=30 (DRBD-reload-interval-0s) start interval=0s timeout=240 (DRBD-start-interval-0s) stop interval=0s timeout=100 (DRBD-stop-interval-0s) Best Regards,Strahil Nikolov At 23:30 -0500 on 21.01.2021 (Thu), Stuart Massey wrote: > Hi Ulrich, > Thank you for your re

Re: [ClusterLabs] Antw: [EXT] DRBD ms resource keeps getting demoted

2021-01-19 Thread Strahil Nikolov
and I hope it helps you fix your issue. Best Regards,Strahil Nikolov At 09:32 -0500 on 19.01.2021 (Tue), Stuart Massey wrote: > Ulrich, Thank you for that observation. We share that concern. > We have 4 ea 1G nics active, bonded in pairs. One bonded pair serves > the "public" (to the

Re: [ClusterLabs] CentOS 8 & drbd 9, two drbd devices and colocation

2021-01-19 Thread Strahil Nikolov
So why is it saying 'connecting' ? Best Regards, Strahil Nikolov On Monday, 18 January 2021, 23:54:02 GMT+2, Brent Jensen wrote: Yes all works fine outside of the cluster. No firewall running nor any selinux. On 1/18/2021 11:53 AM, Strahil Nikolov wrote: >>&

Re: [ClusterLabs] CentOS 8 & drbd 9, two drbd devices and colocation

2021-01-18 Thread Strahil Nikolov
he firewall open ? This node should be connected. Try to verify that each drbd is up and running and promoting any of the 2 nodes is possible before proceeding with the cluster setup. Best Regards, Strahil Nikolov

Re: [ClusterLabs] [EXT] DRBD 2-node M/S doesn't want to promote new master, Centos 8

2021-01-18 Thread Strahil Nikolov
is causing the trouble. Best Regards, Strahil Nikolov On Saturday, 16 January 2021, 17:51:05 GMT+2, Brent Jensen wrote: Maybe. I haven't focused on any stickiness w/ which node is generally master or not. Going standby on the master node should move the slave to master. I'm just

Re: [ClusterLabs] Completely disabled resource failure triggered fencing

2021-01-18 Thread Strahil Nikolov
Have you tried the on-fail=ignore option ? Best Regards, Strahil Nikolov On Sunday, 17 January 2021, 20:45:27 GMT+2, Digimer wrote: Hi all,   I'm trying to figure out how to define a resource such that if it fails in any way, it will not cause pacemaker self self-fence
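
In pcs terms, on-fail is set per operation. A minimal sketch for the monitor operation follows; the function name, resource ID and interval are placeholders. Stop failures are governed by the stop operation's own on-fail setting, which this sketch does not touch.

```shell
# Sketch: make monitor failures of one resource non-fatal.
# Resource ID and interval are hypothetical.
ignore_monitor_failures() {
    rsc="$1"; interval="$2"
    pcs resource update "$rsc" op monitor interval="$interval" on-fail=ignore
}

# Hypothetical usage:
#   ignore_monitor_failures srv01-test 60s
```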

Re: [ClusterLabs] DRBD 2-node M/S doesn't want to promote new master, Centos 8

2021-01-16 Thread Strahil Nikolov
> > Quote from official documentation ( https://www.linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-pacemaker-crm-drbd-backed-service ): If you are employing the DRBD OCF resource agent, it is recommended that you defer DRBD startup, shutdown, promotion, and demotion exclusively to the OCF

Re: [ClusterLabs] drbd9 and Centos 8.3

2021-01-08 Thread Strahil Nikolov
oth nodes ? What happens when you stop the cluster resource and start the drbd manually ? I guess it's unnecessary to mention how risky it is to run a 2-node cluster and that it's far safer if you have a quorum somewhere there ;) Best Regards, Strahil Nikolov

Re: [ClusterLabs] Changing order in resource group after it's created

2020-12-17 Thread Strahil Nikolov
Use the syntax as if your resource was never in a group and use '--before/--after' to specify the new location. Best Regards, Strahil Nikolov On Thursday, 17 December 2020, 13:21:55 GMT+2, Tony Stocker wrote: I have a resource group that has a number of entries. If I want
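
With pcs, that means running 'pcs resource group add' again for a resource that is already a member. A small sketch follows; the function name, group and resource names are made up.

```shell
# Sketch: move a resource to a new position inside its own group.
move_before_in_group() {
    group="$1"; rsc="$2"; anchor="$3"
    pcs resource group add "$group" "$rsc" --before "$anchor"
}

# Hypothetical usage: put 'vip' ahead of 'webserver' in group 'web'
#   move_before_in_group web vip webserver
```

Swapping `--before` for `--after` places the resource behind the anchor instead.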

Re: [ClusterLabs] Calling crm executables via effective uid

2020-12-12 Thread Strahil Nikolov
Have you thought about Hawk ? Best Regards, Strahil Nikolov On Friday, 11 December 2020, 23:20:49 GMT+2, Alex Zarifoglu wrote: Hello,   I have question regarding the running crm commands with the effective uid.   I am trying to create a tool to manage pacemaker resources

Re: [ClusterLabs] Q: LVM-activate a shared LV

2020-12-10 Thread Strahil Nikolov
I think that dlm + clvmd was enough to take care of OCFS2 . Have you tried that ? Best Regards, Strahil Nikolov On Thursday, 10 December 2020, 16:55:52 GMT+2, Ulrich Windl wrote: Hi! I configured a clustered LV (I think) for activation on three nodes, but it won't work

Re: [ClusterLabs] Cannot allocate memory in pgsql_monitor

2020-12-10 Thread Strahil Nikolov
systemd services do not use ulimit, so you need to check "systemctl show pacemaker.service" for any clues. I have seen a similar error in SLES 12 SP2 when the maximum number of tasks was reduced and we were hitting the limit. Best Regards, Strahil Nikolov On Thursday, 10 December 2020
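
To narrow the output of 'systemctl show' down to the limits that matter here, the properties can be requested by name, as in the sketch below. Which limit is actually being hit is an open question in the thread, so the three properties picked here are only a starting point.

```shell
# Sketch: print the task and resource limits systemd applies to a unit.
unit_limits() {
    systemctl show "$1" -p TasksMax -p LimitNPROC -p LimitNOFILE
}

# Hypothetical usage:
#   unit_limits pacemaker.service
```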

Re: [ClusterLabs] Antw: [EXT] Re: Q: high-priority messages from DLM?

2020-12-08 Thread Strahil Nikolov
Nope, but if you don't use clustered FS, you could also use plain LVM + tags. As far as I know you need dlm and clvmd for clustered FS. Best Regards, Strahil Nikolov On Tuesday, 8 December 2020, 10:15:39 GMT+2, Ulrich Windl wrote: >>> Strahil Nikolov wrote on 0

Re: [ClusterLabs] Q: high-priority messages from DLM?

2020-12-05 Thread Strahil Nikolov
It's more interesting why you got connection close... Are you sure you didn't have network issues ? What is corosync saying in the logs ? Offtopic: Are you using DLM with OCFS2 ? Best Regards, Strahil Nikolov At 10:33 -0800 on 04.12.2020 (Fri), Reid Wahl wrote: > On Fri, Dec 4, 2020 at 10:32

Re: [ClusterLabs] Antw: [EXT] Re: Preferred node for a service (not constrained)

2020-12-03 Thread Strahil Nikolov
The problem with infinity is that the moment when the node is back - there will be a second failover. This is bad for bulky DBs that power down/up more than 30 min (15 min down, 15 min up). Best Regards, Strahil Nikolov On Thursday, 3 December 2020, 10:32:18 GMT+2, Andrei Borzenkov

Re: [ClusterLabs] Preferred node for a service (not constrained)

2020-12-02 Thread Strahil Nikolov
node2 to node1 . Note: default stickiness is per resource , while the total stickiness score of a group is calculated based on the scores of all resources in it. Best Regards, Strahil Nikolov On Wednesday, 2 December 2020, 16:54:43 GMT+2, Dan Swartzendruber wrote: On 2020-11-30

[ClusterLabs] Question about portblock

2020-11-07 Thread Strahil Nikolov
L 8 ? Best Regards, Strahil Nikolov

Re: [ClusterLabs] Adding a node to an active cluster

2020-10-28 Thread Strahil Nikolov
in global maintenance before that. You got 2 steps:
- pcs cluster auth -> allows the pcsd on the new node to communicate with pcsd daemons on the other members of the cluster
- pcs cluster node add -> adds the node to the cluster
Best Regards, Strahil Nikolov On Wednesday, 28 October 2020
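
The two steps above can be sketched as a single helper. The function name and node name are placeholders. Note that on RHEL 8 (pcs 0.10) the authentication step is spelled 'pcs host auth' instead of 'pcs cluster auth'; the sketch keeps the command exactly as written in the message.

```shell
# Sketch: authenticate a new node with pcsd, then add it to the cluster.
add_cluster_node() {
    new_node="$1"
    pcs cluster auth "$new_node" &&
    pcs cluster node add "$new_node"
}

# Hypothetical usage, from an existing cluster member:
#   add_cluster_node node3.example.com
```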

Re: [ClusterLabs] fence_scsi problem

2020-10-28 Thread Strahil Nikolov
integrates with a watchdog service to reboot the node that was fenced). Best Regards, Strahil Nikolov On Wednesday, 28 October 2020, 14:18:05 GMT+2, Patrick Vranckx wrote: Hi, I try to set up an HA cluster for ZFS. I think fence_scsi is not working properly. I can reproduce the prob

Re: [ClusterLabs] Unable to connect to node , no token available

2020-10-28 Thread Strahil Nikolov
Are you sure that the node you want to join is not one of the listed here: Online: [ SRVDRSW01 SRVDRSW02 SRVDRSW03 SRVDRSW04 SRVDRSW05 SRVDRSW06 ] The hostname looks pretty much the same. Best Regards, Strahil Nikolov On Wednesday, 28 October 2020, 11:18:11 GMT+2, Raffaele Pantaleoni

Re: [ClusterLabs] Adding a node to an active cluster

2020-10-27 Thread Strahil Nikolov
nd-clusters_considerations-in-adopting-rhel-8#new_commands_for_authenticating_nodes_in_a_cluster Best Regards, Strahil Nikolov On Tuesday, 27 October 2020, 18:06:06 GMT+2, Jiaqi Tian1 wrote: Hi Xin, Thank you. The crmsh version is 4.1.0.0, OS is RHEL 8.0.   I have t

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: VirtualDomain does not stop via "crm resource stop" - modify RA ?

2020-10-27 Thread Strahil Nikolov
Ulrich, do you mean '--queue' ? Best Regards, Strahil Nikolov On Tuesday, 27 October 2020, 12:15:16 GMT+2, Ulrich Windl wrote: >>> "Lentes, Bernd" wrote on 26.10.2020 at 21:44 in message <1480408662.7194527.1603745092927.javamail.zim...

Re: [ClusterLabs] Antw: [EXT] Re: Upgrading/downgrading cluster configuration

2020-10-26 Thread Strahil Nikolov
>>> Strahil Nikolov wrote on 23.10.2020 at 17:04 in message <362944335.2019534.1603465466...@mail.yahoo.com>: > Usually I prefer to use "crm configure show" and later "crm configure edit" > and replace the config. >I guess you use "

Re: [ClusterLabs] Antw: [EXT] Re: VirtualDomain does not stop via "crm resource stop" - modify RA ?

2020-10-26 Thread Strahil Nikolov
I think it's useful - for example a HANA powers up for 10-15min (even more , depends on storage tier) - so the default will time out and the fun starts there. Maybe the cluster is just showing them without using them , but it looked quite the opposite. Best Regards, Strahil Nikolov В

Re: [ClusterLabs] VirtualDomain does not stop via "crm resource stop" - modify RA ?

2020-10-23 Thread Strahil Nikolov
Why don't you work with something like this: 'op stop interval=300 timeout=600'? The stop operation will then time out according to your requirements without modifying the script. Best Regards, Strahil Nikolov On Thursday, 22 October 2020, 23:30:08 GMT+3, Lentes, Bernd wrote: Hi guys

Re: [ClusterLabs] Upgrading/downgrading cluster configuration

2020-10-23 Thread Strahil Nikolov
Usually I prefer to use "crm configure show" and later "crm configure edit" and replace the config. I am not sure if this will work in such a downgrade scenario, but it shouldn't be a problem. Best Regards, Strahil Nikolov On Thursday, 22 October 2020, 21:30:

Re: [ClusterLabs] Upgrading/downgrading cluster configuration

2020-10-22 Thread Strahil Nikolov
Have you tried backing up the config via crmsh/pcs and restoring from it after you downgrade? Best Regards, Strahil Nikolov On Thursday, 22 October 2020, 15:40:43 GMT+3, Vitaly Zolotusky wrote: Hello, We are trying to upgrade our product from Corosync 2.X to Corosync 3.X

Re: [ClusterLabs] Adding a node to an active cluster

2020-10-21 Thread Strahil Nikolov
Both SUSE and Red Hat provide utilities to add the node without messing with the configs manually. What is your distro? Best Regards, Strahil Nikolov On Wednesday, 21 October 2020, 17:03:19 GMT+3, Jiaqi Tian1 wrote: Hi, I'm trying to add a new node into an active pacemaker

Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Strahil Nikolov
Yep, both work without affecting the resources: 'crm cluster stop' and 'pcs cluster stop'. Once your maintenance is over, you can start the cluster and everything will be back in maintenance. Best Regards, Strahil Nikolov On Tuesday, 13 October 2020, 19:15:27 GMT+3, Digimer wrote

Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Strahil Nikolov
Also, it's worth mentioning that you can set the whole cluster in global maintenance and power off the stack on all nodes without affecting your resources. I'm not sure if that is even possible in node maintenance. Best Regards, Strahil Nikolov On Tuesday, 13 October 2020, 12:49:38

Re: [ClusterLabs] Two ethernet adapter within same subnet causing issue on Qdevice

2020-10-06 Thread Strahil Nikolov
I agree, it's more of a routing problem. Actually, a static route should fix the issue. Best Regards, Strahil Nikolov On Tuesday, 6 October 2020, 10:50:24 GMT+3, Jan Friesse wrote: Richard, > To clarify my problem, this is more of a Qdevice issue I want to fix. The quest

Re: [ClusterLabs] Open Source Linux Load Balancer with HA and Split Brain Prevention?

2020-10-04 Thread Strahil Nikolov
in the web, haproxy has a ready-to-go resource agent 'ocf:heartbeat:haproxy', so you can give it a try. Best Regards, Strahil Nikolov On Sunday, 4 October 2020, 22:41:59 GMT+3, Eric Robinson wrote: Greetings! We are looking for an open-source Linux load-balancing

Re: [ClusterLabs] How to stop removed resources when replacing cib.xml via cibadmin or crm_shadow

2020-10-01 Thread Strahil Nikolov
That's the strangest request I have heard so far... What is the reason not to use crmsh or pcs to manage the cluster? About your question: have you tried to load a CIB with the old resources stopped, and then another one with the stopped resources removed? Best Regards, Strahil Nikolov

Re: [ClusterLabs] Resources always return to original node

2020-09-26 Thread Strahil Nikolov
Resource stickiness for a group is the sum of all its resources' stickiness -> 5 resources x 100 score (default stickiness) = 500 score. If your location constraint has a bigger number -> it wins :) Best Regards, Strahil Nikolov On Saturday, 26 September 2020, 12:22:32 Гри
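
The arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the message (group stickiness is the sum of its members' stickiness, and a location constraint with a bigger score wins); the function names are made up for this example, the per-resource value of 100 is taken from the message rather than from any Pacemaker default, and the tie case is not covered by the message, so it is treated here as "the constraint does not win" purely as an assumption.

```python
# Illustrative model only; these helpers are hypothetical and are not part of
# Pacemaker, crmsh, or pcs.

def group_stickiness(member_stickiness):
    """Total stickiness of a group: the sum of its members' stickiness."""
    return sum(member_stickiness)


def location_constraint_wins(member_stickiness, location_score):
    """True when the location constraint score is bigger than the group's
    total stickiness (a tie is assumed NOT to win; the message does not say)."""
    return location_score > group_stickiness(member_stickiness)


# Five resources with a stickiness of 100 each, as in the message.
members = [100] * 5
print(group_stickiness(members))                # 500
print(location_constraint_wins(members, 1000))  # True: constraint wins
print(location_constraint_wins(members, 200))   # False: stickiness wins
```

Under this model, keeping the group on its current node means either raising the members' stickiness or lowering the location constraint's score until the sum is the bigger number.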

Re: [ClusterLabs] Is the "allow_downscale" option supported by Corosync/Pacemaker?

2020-09-25 Thread Strahil Nikolov
I would use 'last_man_standing: 1' + 'wait_for_all: 1'. When you shut down a node gracefully, the quorum is recalculated. You can check the manpage for an explanation. Best Regards, Strahil Nikolov On Friday, 25 September 2020, 01:19:09 GMT+3, Philippe M Stedman wrote: Hi

Re: [ClusterLabs] Pacemaker not starting

2020-09-23 Thread Strahil Nikolov
What is the output of 'corosync-quorumtool -s' on both nodes? What is your cluster's configuration: 'crm configure show' or 'pcs config'? Best Regards, Strahil Nikolov On Wednesday, 23 September 2020, 16:07:16 GMT+3, Ambadas Kawle wrote: Hello All We have 2 node with Mysql
