Re: [ClusterLabs] 2 Node Active-Passive DRBD , fallback fencing issues.

2018-08-17 Thread Mark Adams
Read here:
http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_prevent_resources_from_moving_after_recovery.html
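
For reference, a minimal sketch of the kind of change that document describes, assuming the pcs 0.9 syntax shipped with CentOS 7 (INFINITY is only an illustration; any value larger than the competing constraint scores would do):

# raise the cluster-wide default stickiness so resources stay where they are
# after the failed node recovers
pcs resource defaults resource-stickiness=INFINITY

# confirm the new default
pcs resource defaults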

On 17 August 2018 at 16:43, Jayesh Shinde wrote:

> Hello All ,
>
> I have configured a 2-node active/passive DRBD cluster on HP ProLiant
> DL380 Gen9 servers with CentOS 7.3, Pacemaker + corosync + pcsd + fence_ilo4_ssh.
>
> When I reboot the "active server", all resources move to the
> "slave server" and all respective services start properly.
> But when the "active server" boots again, it fences the "slave server"
> (i.e. reboots it) and all resources fall back to the "active server".
>
> I want to know whether this is the default behaviour or an issue.
>
> My requirement is that once the "master" switches to the "slave", all services
> need to keep running on the "slave" only.
> If the "slave server" goes down or reboots, the "master server" should
> take over all resources, and vice versa.
>
> Please guide.
>
> *Below are config details *
>
> [root@master ~]# pcs property list --all|grep stonith
>  stonith-action: reboot
>  stonith-enabled: true
>  stonith-max-attempts: 10
>  stonith-timeout: 60s
>  stonith-watchdog-timeout: (null)
>
> [root@master ~]# pcs property list --all | grep no-quorum-policy
>  no-quorum-policy: ignore
>
> [root@master ~]# pcs resource defaults
> resource-stickiness: 100
>
> *Below is Package Information* :--
>
> kernel-3.10.0-862.9.1.el7.x86_64
>
> pacemaker-libs-1.1.18-11.el7_5.3.x86_64
> corosync-2.4.3-2.el7_5.1.x86_64
> pacemaker-cli-1.1.18-11.el7_5.3.x86_64
> pacemaker-cluster-libs-1.1.18-11.el7_5.3.x86_64
> pacemaker-1.1.18-11.el7_5.3.x86_64
> corosynclib-2.4.3-2.el7_5.1.x86_64
>
> drbd90-utils-9.3.1-1.el7.elrepo.x86_64
> kmod-drbd90-9.0.14-1.el7_5.elrepo.x86_64
>
>
> Regards
> Jayesh Shinde
>
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] 2 Node Active-Passive DRBD , fallback fencing issues.

2018-08-17 Thread FeldHost™ Admin
Would you share your corosync.conf and pacemaker config (pcs cluster cib)?
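
For example, something like this should capture both (assuming the default CentOS 7 paths and the stock pcs tooling):

cat /etc/corosync/corosync.conf      # corosync configuration
pcs cluster cib > cluster-cib.xml    # dump the live CIB as XML
pcs config                           # human-readable summary of the Pacemaker configuration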

Best regards, Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail.: supp...@feldhost.cz

www.feldhost.cz - FeldHost™ – We tailor hosting services to you. Do you have
specific requirements? We can handle them.


> On 17 Aug 2018, at 17:43, Jayesh Shinde  wrote:
> 
> Hello All , 
> 
> I have configured a 2-node active/passive DRBD cluster on HP ProLiant DL380
> Gen9 servers with CentOS 7.3, Pacemaker + corosync + pcsd + fence_ilo4_ssh.
>
> When I reboot the "active server", all resources move to the
> "slave server" and all respective services start properly.
> But when the "active server" boots again, it fences the "slave server"
> (i.e. reboots it) and all resources fall back to the "active server".
>
> I want to know whether this is the default behaviour or an issue.
>
> My requirement is that once the "master" switches to the "slave", all services
> need to keep running on the "slave" only.
> If the "slave server" goes down or reboots, the "master server" should
> take over all resources, and vice versa.
> 
> Please guide. 
> 
> Below are config details 
> 
> [root@master ~]# pcs property list --all|grep stonith
>  stonith-action: reboot
>  stonith-enabled: true
>  stonith-max-attempts: 10
>  stonith-timeout: 60s
>  stonith-watchdog-timeout: (null)
> 
> [root@master ~]# pcs property list --all | grep no-quorum-policy
>  no-quorum-policy: ignore
> 
> [root@master ~]# pcs resource defaults
> resource-stickiness: 100
> 
> Below is Package Information :-- 
> 
> kernel-3.10.0-862.9.1.el7.x86_64
> 
> pacemaker-libs-1.1.18-11.el7_5.3.x86_64
> corosync-2.4.3-2.el7_5.1.x86_64
> pacemaker-cli-1.1.18-11.el7_5.3.x86_64
> pacemaker-cluster-libs-1.1.18-11.el7_5.3.x86_64
> pacemaker-1.1.18-11.el7_5.3.x86_64
> corosynclib-2.4.3-2.el7_5.1.x86_64
> 
> drbd90-utils-9.3.1-1.el7.elrepo.x86_64
> kmod-drbd90-9.0.14-1.el7_5.elrepo.x86_64
> 
> 
> Regards
> Jayesh Shinde

___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] 2 Node Active-Passive DRBD , fallback fencing issues.

2018-08-17 Thread Jayesh Shinde

Hello All ,

I have configured a 2-node active/passive DRBD cluster on HP ProLiant
DL380 Gen9 servers with CentOS 7.3, Pacemaker + corosync + pcsd + fence_ilo4_ssh.

When I reboot the "active server", all resources move to the
"slave server" and all respective services start properly.
But when the "active server" boots again, it fences the "slave server"
(i.e. reboots it) and all resources fall back to the "active server".

I want to know whether this is the default behaviour or an issue.

My requirement is that once the "master" switches to the "slave", all
services need to keep running on the "slave" only.
If the "slave server" goes down or reboots, the "master server" should
take over all resources, and vice versa.


Please guide.

_Below are config details:_

[root@master ~]# pcs property list --all|grep stonith
 stonith-action: reboot
 stonith-enabled: true
 stonith-max-attempts: 10
 stonith-timeout: 60s
 stonith-watchdog-timeout: (null)

[root@master ~]# pcs property list --all | grep no-quorum-policy
 no-quorum-policy: ignore

[root@master ~]# pcs resource defaults
resource-stickiness: 100
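
For reference, constraint scores that could outweigh the stickiness of 100 and pull resources back can be inspected like this (a sketch, assuming only the stock pcs and Pacemaker CLI tools):

pcs constraint --full    # list all constraints with their scores and ids
crm_simulate -sL         # show the allocation scores Pacemaker computes from the live CIB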

_Below is package information:_

kernel-3.10.0-862.9.1.el7.x86_64

pacemaker-libs-1.1.18-11.el7_5.3.x86_64
corosync-2.4.3-2.el7_5.1.x86_64
pacemaker-cli-1.1.18-11.el7_5.3.x86_64
pacemaker-cluster-libs-1.1.18-11.el7_5.3.x86_64
pacemaker-1.1.18-11.el7_5.3.x86_64
corosynclib-2.4.3-2.el7_5.1.x86_64

drbd90-utils-9.3.1-1.el7.elrepo.x86_64
kmod-drbd90-9.0.14-1.el7_5.elrepo.x86_64


Regards
Jayesh Shinde


___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Using the max_network_delay option of the corosync

2018-08-17 Thread Jan Friesse

Vladislav,

> Hello!
> I have a cluster based on Pacemaker and Corosync. The cluster nodes are
> located in several data centers, and there is up to 450 ms ping latency
> between some of the nodes. Sometimes these delays lead to split-brain situations.



Yep, that's quite a lot. Enlarging the token timeout, with something like:

totem {
...
token: 5000
...
}

will help.
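
One possible way to roll that out on a pcs-managed cluster (a sketch only; the exact commands depend on the pcs version, and editing /etc/corosync/corosync.conf by hand on every node and restarting corosync works just as well):

# after editing the totem section in /etc/corosync/corosync.conf on one node:
pcs cluster sync                      # copy corosync.conf to the other nodes
pcs cluster reload corosync           # reload the corosync configuration
corosync-cmapctl | grep totem.token   # verify the runtime token value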


> I have found the max_network_delay option for corosync.


This option does nothing unless heartbeat_failures_allowed is set. I don't
know if you have it enabled, but I would recommend not enabling it.


Regards,
  Honza



> But there is a warning:
> "It is not recommended to override this value without guidance from the
> corosync community."
>
> Can anyone help me with this?
>
> PS:
> https://serverfault.com/questions/926548/using-the-max-network-delay-option-of-the-corosync






___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Using the max_network_delay option of the corosync

2018-08-17 Thread Vladislav Lutsenko
Hello!
I have a cluster based on Pacemaker and Corosync. The cluster nodes are
located in several data centers, and there is up to 450 ms ping latency
between some of the nodes. Sometimes these delays lead to split-brain
situations.

I have found the max_network_delay option for corosync.

But there is a warning:
"It is not recommended to override this value without guidance from the
corosync community."

Can anyone help me with this?

PS:
https://serverfault.com/questions/926548/using-the-max-network-delay-option-of-the-corosync
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org