Re: [ClusterLabs] DRBD split-brain investigations, automatic fixes and manual intervention...

2021-10-21 Thread Ian Diddams via Users
On Wednesday, 20 October 2021, 18:08:50 BST, Andrei Borzenkov wrote: It depends on what hardware you have. For physical systems, IPMI or managed power outlets may be available; both allow cutting power to another node over LAN. For virtual machines you may use a fencing agent that contacts

Re: [ClusterLabs] DRBD split-brain investigations, automatic fixes and manual intervention...

2021-10-20 Thread Andrei Borzenkov
On 20.10.2021 17:54, Ian Diddams wrote: > > > On Wednesday, 20 October 2021, 11:15:48 BST, Andrei Borzenkov > wrote: > > >> You cannot resolve split brain without fencing. This is as simple as >> that. Your pacemaker configuration (from another mail) shows > >> pcs -f clust_cfg prop

Re: [ClusterLabs] DRBD split-brain investigations, automatic fixes and manual intervention...

2021-10-20 Thread Ian Diddams via Users
On Wednesday, 20 October 2021, 11:15:48 BST, Andrei Borzenkov wrote: >You cannot resolve split brain without fencing. This is as simple as >that. Your pacemaker configuration (from another mail) shows > pcs -f clust_cfg property set stonith-enabled=false > pcs -f clust_cfg property

Re: [ClusterLabs] DRBD split-brain investigations, automatic fixes and manual intervention...

2021-10-20 Thread Andrei Borzenkov
On Wed, Oct 20, 2021 at 11:54 AM Ian Diddams via Users wrote: > > So - system logs recently show this > > ESTRELA > Oct 18th > Oct 18 04:04:28 wp-vldyn-estrela kernel: [584651.491139] drbd mysql01/0 > drbd0: Split-Brain detected, 1 primaries, automatically solved. Sync from > peer node > Oct 18

[ClusterLabs] DRBD split-brain investigations, automatic fixes and manual intervention...

2021-10-20 Thread Ian Diddams via Users
I've been testing an implementation of an HA MySQL cluster for a few months now. I came to this project with no prior knowledge of what was concerned/needed and have learned organically via various online how-tos and web sites which in many cases were slightly out-of-date to missing large chunks

Re: [ClusterLabs] DRBD Split brain

2018-01-20 Thread Digimer
On 2018-01-19 01:36 PM, Ken Gaillot wrote: > On Tue, 2017-12-12 at 15:30 +0200, Антон Сацкий wrote: >> Hi list >> Need your help. >> Got 2 servers use Pacemaker Corosync Drbd >> >> [root@voipserver ~]# pcs config >> Cluster Name: ClusterKrusher >> Corosync Nodes: >> voipserver.primary voipserve

Re: [ClusterLabs] DRBD Split brain

2018-01-19 Thread Ken Gaillot
On Tue, 2017-12-12 at 15:30 +0200, Антон Сацкий wrote: > Hi list > Need your help. > Got 2 servers use Pacemaker Corosync Drbd > > [root@voipserver ~]# pcs config > Cluster Name: ClusterKrusher > Corosync Nodes: > voipserver.primary voipserver.backup > Pacemaker Nodes: > voipserver.backup voi

[ClusterLabs] DRBD Split brain

2017-12-12 Thread Антон Сацкий
Hi list, I need your help. I've got 2 servers using Pacemaker, Corosync and DRBD. [root@voipserver ~]# pcs config Cluster Name: ClusterKrusher Corosync Nodes: voipserver.primary voipserver.backup Pacemaker Nodes: voipserver.backup voipserver.primary Resources: Resource: ClusterIP (class=ocf provider=heartbe

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-14 Thread ArekW
I have active-active NFS on gfs2 on dual-primary DRBD. As for stonith, I have no reason to think that it is not working, because when I power off one node, the second node can bring it up automatically. I have also seen situations where fencing shut down or rebooted the other node. It seems that it is working wel

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-14 Thread Dmitri Maziuk
On 7/14/2017 3:57 AM, ArekW wrote: > Hi, > I have stonith run and tested. The problem was that there is mistake in > drbd documentation. The 'fencing' belongs to net (not disk). If you are running NFS on top of a dual-primary DRBD with some sort of a cluster filesystem, I'd think *that* is your prob

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-14 Thread ArekW
Hi, I have stonith running and tested. The problem was that there is a mistake in the DRBD documentation. The 'fencing' option belongs to net (not disk). I also tried to use more handlers: handlers { fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh"; spl

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-12 Thread Digimer
On 2017-07-12 07:19 AM, ArekW wrote: > Thanks for the quick response. I've found the fencing docs on Linbit and tried > to follow them, but I got an error regarding the "disk" section: > > drbd.d/global_common.conf:57: Parse error: 'resync-rate | c-plan-ahead | > c-delay-target | c-fill-target | c-max-rate | c-min

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-12 Thread ArekW
Thanks for the quick response. I've found the fencing docs on Linbit and tried to follow them, but I got an error regarding the "disk" section: drbd.d/global_common.conf:57: Parse error: 'resync-rate | c-plan-ahead | c-delay-target | c-fill-target | c-max-rate | c-min-rate | bitmap' expected, but got 'fenc

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-12 Thread Dmitri Maziuk
On 7/12/2017 4:33 AM, ArekW wrote: > Hi, > Can it be fixed that DRBD enters split brain after cluster > node recovery? I always configure "after-sb*" handlers and drbd-level fence but I never ran it with allow-two-primaries. You'll have read the fine manual on how that works in a dual-prim

Re: [ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-12 Thread emmanuel segura
You need to configure cluster fencing and the DRBD fencing handler; this way, the cluster can recover without manual intervention. 2017-07-12 11:33 GMT+02:00 ArekW : > Hi, > Can it be fixed that DRBD enters split brain after cluster > node recovery? After a few tests I saw drbd recovered b

[ClusterLabs] DRBD split brain after Cluster node recovery

2017-07-12 Thread ArekW
Hi, Can it be fixed that DRBD enters split brain after cluster node recovery? After a few tests I saw DRBD recover, but in most situations (9/10) it didn't sync. 1. When a node is put into standby and then unstandby, everything works OK. DRBD syncs and goes to primary mode. 2. Wh