Re: [ClusterLabs] IPaddr2 RA and bonding

2017-08-12 Thread Tomer Azran
I created a pull request with my code:
https://github.com/ClusterLabs/pacemaker/pull/1319


-Original Message-
From: Ken Gaillot [mailto:kgail...@redhat.com] 
Sent: Thursday, August 10, 2017 5:33 PM
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] IPaddr2 RA and bonding

On Thu, 2017-08-10 at 11:02 +, Tomer Azran wrote:
> That looks exactly what I needed - it works.
> I had to change the RA since I don't want to give an interface name as a 
> parameter (it might change from server to server and I want to create a 
> cloned resource).
> I changed the RA a little bit to be able to guess the interface name based on
> an IP address parameter.
> The new RA is published on my github repo:
> https://github.com/tomerazran/Pacemaker-Resource-Agents/blob/master/ipspeed

Nice! Feel free to open a PR against the ClusterLabs/pacemaker repository with 
your changes. You could make it so the user has to specify one of iface or ip, 
or you could have another parameter iface_from_ip=true/false and put the IP in 
iface.
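
One way such a guess could be implemented (a minimal sketch, assuming the
address arrives as OCF_RESKEY_ip; the published RA may well do it differently):

find_iface_by_ip() {
    # If the address is already configured locally, use the interface it sits on;
    # otherwise ask the routing table which device would be used to reach it.
    dev=$(ip -o addr show to "${OCF_RESKEY_ip}/32" | awk '$2 != "lo" {print $2; exit}')
    if [ -z "$dev" ]; then
        dev=$(ip route get "${OCF_RESKEY_ip}" 2>/dev/null | sed -n 's/.* dev \([^ ]*\).*/\1/p' | head -n1)
    fi
    echo "$dev"
}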

> Just to document the solution in case anyone will need it also, I run the 
> following setup:
> 
>  # pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.1.3 op monitor interval=30
>  # pcs resource create vip_speed ocf:heartbeat:ipspeed ip=192.168.1.3 name=vip_speed op monitor interval=5s --clone
>  # pcs constraint location vip rule score=-INFINITY vip_speed lt 1 or not_defined vip_speed
> 
> Thank you for the support,
> Tomer.
> 
> 
> -Original Message-
> From: Vladislav Bogdanov [mailto:bub...@hoster-ok.com]
> Sent: Monday, August 7, 2017 9:22 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] IPaddr2 RA and bonding
> 
> 07.08.2017 20:39, Tomer Azran wrote:
> > I don't want to use this approach since I don't want to depend on
> > pinging another host or a couple of hosts.
> > Is there any other solution?
> > I'm thinking of writing a simple script that will take a bond down
> > using the ifdown command when there are no slaves available and put it
> > in /sbin/ifdown-local
> 
> For a similar purpose I wrote and use this one -
> https://github.com/ClusterLabs/pacemaker/blob/master/extra/resources/ifspeed
> 
> It sets a node attribute on which other resources may depend via
> location constraint -
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch08.html#ch-rules
> 
> It is not installed by default, and that should probably be fixed.
> 
> That RA supports bonds (and bridges), and even tries to guess the actual
> resulting bond speed based on the bond type. For load-balancing bonds like the
> LACP (mode 4) one, it uses a coefficient of 0.8 (iirc) to reflect the actual
> possible load via multiple links.
> 
> >
> >
> > -Original Message-
> > From: Ken Gaillot [mailto:kgail...@redhat.com]
> > Sent: Monday, August 7, 2017 7:14 PM
> > To: Cluster Labs - All topics related to open-source clustering 
> > welcomed <users@clusterlabs.org>
> > Subject: Re: [ClusterLabs] IPaddr2 RA and bonding
> >
> > On Mon, 2017-08-07 at 10:02 +, Tomer Azran wrote:
> >> Hello All,
> >>
> >>
> >>
> >> We are using CentOS 7.3 with pacemaker in order to create a cluster.
> >>
> >> Each cluster node has a bonding interface consisting of two NICs.
> >>
> >> The cluster has an IPAddr2 resource configured like that:
> >>
> >>
> >>
> >> # pcs resource show cluster_vip
> >>
> >> Resource: cluster_vip (class=ocf provider=heartbeat type=IPaddr2)
> >>
> >>   Attributes: ip=192.168.1.3
> >>
> >>   Operations: start interval=0s timeout=20s (cluster_vip
> >> -start-interval-0s)
> >>
> >>   stop interval=0s timeout=20s (cluster_vip
> >> -stop-interval-0s)
> >>
> >>   monitor interval=30s (cluster_vip
> >> -monitor-interval-30s)
> >>
> >>
> >>
> >>
> >>
> >> We are running tests and want to simulate a state when the network 
> >> links are down.
> >>
> >> We are pulling both network cables from the server.
> >>
> >>
> >>
> >> The problem is that the resource is not marked as failed, and the 
> >> faulted node keeps holding it and does not fail it over to the other 
> >> node.
> >>
> >> I think that the problem is within the bond interface. The bond 
> >> interface is marked as UP on the OS. It even can ping itself:
> >>
> >>
> >>
> >> # ip link show
> >>
> >> 2:

Re: [ClusterLabs] IPaddr2 RA and bonding

2017-08-10 Thread Tomer Azran
That looks exactly what I needed - it works.
I had to change the RA since I don't want to give an interface name as a 
parameter (it might change from server to server and I want to create a cloned 
resource).
I changed the RA a little bit to be able to guess the interface name based on an
IP address parameter.
The new RA is published on my github repo: 
https://github.com/tomerazran/Pacemaker-Resource-Agents/blob/master/ipspeed 

Just to document the solution in case anyone will need it also, I run the 
following setup:

 # pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.1.3 op monitor 
interval=30
 # pcs resource create vip_speed ocf:heartbeat:ipspeed ip=192.168.1.3 
name=vip_speed op monitor interval=5s --clone
 # pcs constraint location vip rule score=-INFINITY vip_speed lt 1 or 
not_defined vip_speed
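
For anyone reproducing this: the rule checks a transient node attribute that the
ipspeed/ifspeed agent sets (named after its name parameter, vip_speed here). A
quick way to inspect the current value on a node (node name is a placeholder):

 # attrd_updater -Q -n vip_speed -N <nodename>
 # crm_mon -A1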

Thank you for the support,
Tomer.


-Original Message-
From: Vladislav Bogdanov [mailto:bub...@hoster-ok.com] 
Sent: Monday, August 7, 2017 9:22 PM
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] IPaddr2 RA and bonding

07.08.2017 20:39, Tomer Azran wrote:
> I don't want to use this approach since I don't want to depend on pinging
> another host or a couple of hosts.
> Is there any other solution?
> I'm thinking of writing a simple script that will take a bond down
> using the ifdown command when there are no slaves available and put it
> in /sbin/ifdown-local

For a similar purpose I wrote and use this one - 
https://github.com/ClusterLabs/pacemaker/blob/master/extra/resources/ifspeed

It sets a node attribute on which other resources may depend via location 
constraint  - 
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch08.html#ch-rules

It is not installed by default, and that should probably be fixed.

That RA supports bonds (and bridges), and even tries to guess the actual
resulting bond speed based on the bond type. For load-balancing bonds like the
LACP (mode 4) one, it uses a coefficient of 0.8 (iirc) to reflect the actual
possible load via multiple links.
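
As an illustration of the idea (not the agent's exact code), the speed reported
for such a bond could be derived roughly like this:

 # Rough sketch: sum the speeds of the bond's slaves and apply a factor for
 # load-balancing modes; the real ifspeed agent handles more cases.
 BOND=bond1
 total=0
 for s in $(cat /sys/class/net/$BOND/bonding/slaves); do
     spd=$(cat /sys/class/net/$s/speed 2>/dev/null)
     [ "$spd" -gt 0 ] 2>/dev/null && total=$((total + spd))
 done
 mode=$(awk '{print $2}' /sys/class/net/$BOND/bonding/mode)   # "4" means 802.3ad/LACP
 [ "$mode" = "4" ] && total=$((total * 8 / 10))                # ~0.8 coefficient
 echo "$total"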

>
>
> -Original Message-
> From: Ken Gaillot [mailto:kgail...@redhat.com]
> Sent: Monday, August 7, 2017 7:14 PM
> To: Cluster Labs - All topics related to open-source clustering 
> welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] IPaddr2 RA and bonding
>
> On Mon, 2017-08-07 at 10:02 +, Tomer Azran wrote:
>> Hello All,
>>
>>
>>
>> We are using CentOS 7.3 with pacemaker in order to create a cluster.
>>
>> Each cluster node has a bonding interface consisting of two NICs.
>>
>> The cluster has an IPAddr2 resource configured like that:
>>
>>
>>
>> # pcs resource show cluster_vip
>>
>> Resource: cluster_vip (class=ocf provider=heartbeat type=IPaddr2)
>>
>>   Attributes: ip=192.168.1.3
>>
>>   Operations: start interval=0s timeout=20s (cluster_vip
>> -start-interval-0s)
>>
>>   stop interval=0s timeout=20s (cluster_vip
>> -stop-interval-0s)
>>
>>   monitor interval=30s (cluster_vip 
>> -monitor-interval-30s)
>>
>>
>>
>>
>>
>> We are running tests and want to simulate a state when the network 
>> links are down.
>>
>> We are pulling both network cables from the server.
>>
>>
>>
>> The problem is that the resource is not marked as failed, and the 
>> faulted node keeps holding it and does not fail it over to the other 
>> node.
>>
>> I think that the problem is within the bond interface. The bond 
>> interface is marked as UP on the OS. It even can ping itself:
>>
>>
>>
>> # ip link show
>>
>> 2: eno3: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq 
>> master bond1 state DOWN mode DEFAULT qlen 1000
>>
>> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
>>
>> 3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq 
>> master bond1 state DOWN mode DEFAULT qlen 1000
>>
>> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
>>
>> 9: bond1: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc 
>> noqueue state DOWN mode DEFAULT qlen 1000
>>
>> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
>>
>>
>>
>> As far as I understand the IPaddr2 RA does not check the link state 
>> of the interface – What can be done?
>
> You are correct. The IP address itself *is* up, even if the link is down, and 
> it can be used locally on that host.
>
> If you want to monitor connectivity to other hosts, you have to do that 
> separately. The most common approach is to use the ocf:pacemaker:ping 
> resource. See:
>
> http://clusterlabs.or

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
So your suggestion is to use sbd with or without qdevice? What is the point of 
having a qdevice in a two-node cluster if it doesn't help in this situation?


From: Klaus Wenninger
Sent: Monday, July 24, 18:28
Subject: Re: [ClusterLabs] Two nodes cluster issue
To: Cluster Labs - All topics related to open-source clustering welcomed, Tomer 
Azran


On 07/24/2017 05:15 PM, Tomer Azran wrote:
I still don't understand why the qdevice concept doesn't help in this 
situation. Since the master node is down, I would expect the quorum to declare 
it as dead.
Why doesn't it happen?

That is not how quorum works. It just limits the decision-making to the quorate 
subset of the cluster.
Still, the nodes in the unknown state are not guaranteed to be down.
That is why I suggested having quorum-based watchdog-fencing with sbd.
That would ensure that within a certain time all nodes of the non-quorate part
of the cluster are down.
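
(For reference, a watchdog-only sbd setup without a shared disk boils down to
roughly the following on CentOS/RHEL; file and property names as used by the
sbd package and pcs, timeouts illustrative:)

 # /etc/sysconfig/sbd - no SBD_DEVICE configured, watchdog-only mode
 SBD_WATCHDOG_DEV=/dev/watchdog
 SBD_WATCHDOG_TIMEOUT=5

 # systemctl enable sbd
 # pcs property set no-quorum-policy=suicide
 # pcs property set stonith-watchdog-timeout=10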




On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk" 
<dmitri.maz...@gmail.com> wrote:

On 2017-07-24 07:51, Tomer Azran wrote:
> We don't have the ability to use it.
> Is that the only solution?

No, but I'd recommend thinking about it first. Are you sure you will care about
your cluster working when your server room is on fire? 'Cause unless you have
halon suppression, your server room is a complete write-off anyway. (Think
water from sprinklers hitting rich chunky volts in the servers.)

Dima

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

--
Klaus Wenninger
Senior Software Engineer, EMEA ENG Openstack Infrastructure
Red Hat
kwenn...@redhat.com

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
I still don't understand why the qdevice concept doesn't help in this 
situation. Since the master node is down, I would expect the quorum to declare 
it as dead.
Why doesn't it happen?



On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk" 
<dmitri.maz...@gmail.com> wrote:


On 2017-07-24 07:51, Tomer Azran wrote:
> We don't have the ability to use it.
> Is that the only solution?

No, but I'd recommend thinking about it first. Are you sure you will
care about your cluster working when your server room is on fire? 'Cause
unless you have halon suppression, your server room is a complete
write-off anyway. (Think water from sprinklers hitting rich chunky volts
in the servers.)

Dima

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
I tend to agree with Klaus – I don't think that having a hook that bypasses 
stonith is the right way. It is better to not use stonith at all.
I think I will try to use an iSCSI target on my qdevice and set SBD to use it.
I still don't understand why qdevice can't take the place of SBD with shared 
storage; correct me if I'm wrong, but it looks like both of them are there for 
the same reason.

From: Klaus Wenninger [mailto:kwenn...@redhat.com]
Sent: Monday, July 24, 2017 9:01 PM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>; Prasad, Shashank <sspra...@vanu.com>
Subject: Re: [ClusterLabs] Two nodes cluster issue

On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
Sometimes IPMI fence devices use shared power of the node, and it cannot be 
avoided.
In such scenarios the HA cluster is NOT able to handle the power failure of a 
node, since the power is shared with its own fence device.
The failure of IPMI based fencing can also occur for other reasons.

A failure to fence the failed node will cause the cluster to be marked UNCLEAN.
To get over it, the following command needs to be invoked on the surviving node.

pcs stonith confirm <node> --force

This can be automated by hooking a recovery script to the Stonith resource
‘Timed Out’ event.
To be more specific, Pacemaker Alerts can be used to watch for Stonith
timeouts and failures.
In that script, essentially all that needs to be executed is the aforementioned
command.

If I understand you correctly, you could then just as well disable fencing in the first place.
Actually quorum-based-watchdog-fencing is the way to do this in a
safe manner. This of course assumes you have a proper source for
quorum in your 2-node-setup with e.g. qdevice or using a shared
disk with sbd (not directly pacemaker quorum here but similar thing
handled inside sbd).


Since the alerts are issued from the ‘hacluster’ login, sudo permissions for 
‘hacluster’ need to be configured.

Thanx.


From: Klaus Wenninger [mailto:kwenn...@redhat.com]
Sent: Monday, July 24, 2017 9:24 PM
To: Kristián Feldsam; Cluster Labs - All topics related to open-source 
clustering welcomed
Subject: Re: [ClusterLabs] Two nodes cluster issue

On 07/24/2017 05:37 PM, Kristián Feldsam wrote:
I personally think that powering off the node by a switched PDU is safer, or not?

True if that is working in your environment. If you can't do a physical setup
where you aren't simultaneously losing connection to both your node and
the switch-device (or you just want to cover cases where that happens)
you have to come up with something else.




Best regards, Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail: supp...@feldhost.cz

www.feldhost.cz - FeldHost™ – professional hosting and server services at fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ (Company ID): 290 60 958, DIČ (VAT ID): CZ290 60 958
File C 200350, registered with the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010  0024 0033 0446

On 24 Jul 2017, at 17:27, Klaus Wenninger 
<kwenn...@redhat.com> wrote:

On 07/24/2017 05:15 PM, Tomer Azran wrote:
I still don't understand why the qdevice concept doesn't help in this 
situation. Since the master node is down, I would expect the quorum to declare 
it as dead.
Why doesn't it happen?

That is not how quorum works. It just limits the decision-making to the quorate 
subset of the cluster.
Still, the nodes in the unknown state are not guaranteed to be down.
That is why I suggested having quorum-based watchdog-fencing with sbd.
That would ensure that within a certain time all nodes of the non-quorate part
of the cluster are down.






On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk" 
<dmitri.maz...@gmail.com> wrote:

On 2017-07-24 07:51, Tomer Azran wrote:
> We don't have the ability to use it.
> Is that the only solution?

No, but I'd recommend thinking about it first. Are you sure you will
care about your cluster working when your server room is on fire? 'Cause
unless you have halon suppression, your server room is a complete
write-off anyway. (Think water from sprinklers hitting rich chunky volts
in the servers.)

Dima



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users



Project Home: http://www.clust

[ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
Hello,

We built a pacemaker cluster with 2 physical servers.
We configured DRBD in Master\Slave setup, a floating IP and file system mount 
in Active\Passive mode.
We configured two STONITH devices (fence_ipmilan), one for each server.

We are trying to simulate a situation where the Master server crashes with no 
power.
We pulled both of the PSU cables and the server became offline (UNCLEAN).
The resources that the Master used to hold are now in Started (UNCLEAN) state.
The state is unclean since the STONITH failed (the STONITH device is located on 
the server (Intel RMM4 - IPMI) - which uses the same power supply).

The problem is that now the cluster does not release the resources that the 
Master holds, and the service goes down.

Is there any way to overcome this situation?
We tried to add a qdevice but got the same results.

We are using pacemaker 1.1.15 on CentOS 7.3

Thanks,
Tomer.
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
We don't have the ability to use it.
Is that the only solution?

In addition, it will not cover a scenario where the server room is down (for 
example, fire or earthquake); the switch will go down as well.

From: Klaus Wenninger
Sent: Monday, July 24, 15:31
Subject: Re: [ClusterLabs] Two nodes cluster issue
To: Cluster Labs - All topics related to open-source clustering welcomed, 
Kristián Feldsam


On 07/24/2017 02:05 PM, Kristián Feldsam wrote:
Hello, you have to use second fencing device, for ex. APC Switched PDU.

https://wiki.clusterlabs.org/wiki/Configure_Multiple_Fencing_Devices_Using_pcs

The problem here seems to be that the fencing devices available are running from
the same power-supply as the node itself. So they are kind of useless to
determine whether the partner-node has no power or is simply not reachable via
the network.


Best regards, Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail: supp...@feldhost.cz

www.feldhost.cz - FeldHost™ – professional hosting and server services at fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ (Company ID): 290 60 958, DIČ (VAT ID): CZ290 60 958
File C 200350, registered with the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010  0024 0033 0446

On 24 Jul 2017, at 13:51, Tomer Azran 
<tomer.az...@edp.co.il> wrote:

Hello,

We built a pacemaker cluster with 2 physical servers.
We configured DRBD in Master\Slave setup, a floating IP and file system mount 
in Active\Passive mode.
We configured two STONITH devices (fence_ipmilan), one for each server.

We are trying to simulate a situation where the Master server crashes with no 
power.
We pulled both of the PSU cables and the server became offline (UNCLEAN).
The resources that the Master used to hold are now in Started (UNCLEAN) state.
The state is unclean since the STONITH failed (the STONITH device is located on 
the server (Intel RMM4 - IPMI) – which uses the same power supply).

The problem is that now the cluster does not release the resources that the 
Master holds, and the service goes down.

Is there any way to overcome this situation?
We tried to add a qdevice but got the same results.

If you have already set up qdevice (using an additional node or so) you could use
quorum-based watchdog-fencing via SBD.


We are using pacemaker 1.1.15 on CentOS 7.3

Thanks,
Tomer.
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

--
Klaus Wenninger
Senior Software Engineer, EMEA ENG Openstack Infrastructure
Red Hat
kwenn...@redhat.com

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Two nodes cluster issue

2017-07-30 Thread Tomer Azran
Just updating that I added another level of fencing using watchdog-fencing.
Combined with the quorum device, this works in case of a power failure of both
the server and the IPMI interface.
An important note is that the stonith-watchdog-timeout must be configured in
order for this to work.
After reading the following great post:
http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit , I chose the softdog
watchdog since I don't think the ipmi watchdog would do any good in case the
IPMI interface is down (if it is OK, it will be used as a fencing method).
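
(Loading the software watchdog module at boot is typically just:)

 # echo softdog > /etc/modules-load.d/softdog.conf
 # modprobe softdog
 # ls -l /dev/watchdog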

Just for documenting the solution (in case someone else needed that), the 
configuration I added is:
systemctl enable sbd
pcs property set no-quorum-policy=suicide
pcs property set stonith-watchdog-timeout=15
pcs quorum device add model net host=qdevice algorithm=lms
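
A few commands that can help sanity-check the result (provided by the sbd, pcs
and corosync-qdevice packages; exact availability may depend on the versions):

 # sbd query-watchdog
 # pcs quorum device status
 # corosync-qdevice-tool -s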

I just can't decide whether the qdevice algorithm should be lms or ffsplit. I 
couldn't determine the difference between them and I'm not sure which one is 
best when using a two-node cluster with qdevice and watchdog fencing.

Can anyone advise on that?


From: Klaus Wenninger [mailto:kwenn...@redhat.com]
Sent: Tuesday, July 25, 2017 2:19 AM
To: Tomer Azran <tomer.az...@edp.co.il>; Cluster Labs - All topics related to 
open-source clustering welcomed <users@clusterlabs.org>; Prasad, Shashank 
<sspra...@vanu.com>
Subject: Re: [ClusterLabs] Two nodes cluster issue

On 07/24/2017 11:59 PM, Tomer Azran wrote:
There is a problem with that – it seems like SBD with shared disk is disabled 
on CentOS 7.3:

When I run:
# sbd -d /dev/sbd create

I get:
Shared disk functionality not supported

Which is why I suggested to go for watchdog-fencing using
your qdevice setup.
As said I haven't tried with qdevice-quorum - but I don't
see a reason why that shouldn't work.
no-quorum-policy has to be suicide of course.



So I might try the software watchdog (softdog or ipmi_watchdog)

A reliable watchdog is really crucial for sbd so I would
recommend going for ipmi or anything else that has
hardware behind it.

Klaus


Tomer.

From: Tomer Azran [mailto:tomer.az...@edp.co.il]
Sent: Tuesday, July 25, 2017 12:30 AM
To: kwenn...@redhat.com; Cluster Labs - All topics related to open-source
clustering welcomed <users@clusterlabs.org>; Prasad, Shashank <sspra...@vanu.com>
Subject: Re: [ClusterLabs] Two nodes cluster issue

I tend to agree with Klaus – I don't think that having a hook that bypasses 
stonith is the right way. It is better to not use stonith at all.

That was of course with a certain degree of hyperbole. Anything is of course 
better than not having
fencing at all.
I might be wrong but what you were saying somehow was drawing a picture in my 
mind that you
have your 2 nodes at 2 sites/rooms quite separated and in that case ...


I think I will try to use an iScsi target on my qdevice and set SBD to use it.
I still don't understand why qdevice can't take the place of SBD with shared 
storage; correct me if I'm wrong, but it looks like both of them are there for 
the same reason.

sbd with watchdog + qdevice can take the place of sbd with shared storage.
qdevice is there to decide which part of a cluster is quorate and which not - 
in cases
where after a split this wouldn't be possible.
sbd (with watchdog) is then there to reliably take down the non-quorate part
within a well defined time.



From: Klaus Wenninger [mailto:kwenn...@redhat.com]
Sent: Monday, July 24, 2017 9:01 PM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>; Prasad, Shashank <sspra...@vanu.com>
Subject: Re: [ClusterLabs] Two nodes cluster issue

On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
Sometimes IPMI fence devices use shared power of the node, and it cannot be 
avoided.
In such scenarios the HA cluster is NOT able to handle the power failure of a 
node, since the power is shared with its own fence device.
The failure of IPMI based fencing can also occur for other reasons.

A failure to fence the failed node will cause the cluster to be marked UNCLEAN.
To get over it, the following command needs to be invoked on the surviving node.

pcs stonith confirm <node> --force

This can be automated by hooking a recovery script to the Stonith resource
‘Timed Out’ event.
To be more specific, Pacemaker Alerts can be used to watch for Stonith
timeouts and failures.
In that script, essentially all that needs to be executed is the aforementioned
command.

If I understand you correctly, you could then just as well disable fencing in the first place.
Actually quorum-based-watchdog-fencing is the way to do this in a
safe manner. This of course assumes you have a proper source for
quorum in your 2-node-setup with e.g. qdevice or using a shared
disk with sbd (not directly pacemaker quorum here but similar thing
handled

Re: [ClusterLabs] Antw: IPaddr2 RA and bonding

2017-08-07 Thread Tomer Azran
STONITH is enabled and working.

-Original Message-
From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de] 
Sent: Monday, August 7, 2017 2:52 PM
To: users@clusterlabs.org
Subject: [ClusterLabs] Antw: IPaddr2 RA and bonding

>>> Tomer Azran <tomer.az...@edp.co.il> wrote on 07.08.2017 at 12:02 
>>> in message
<B71057A42F7488498D6FAF859D9283FE2AC47F38@EDPEX02.globalit.local>:
> Hello All,
> 
> We are using CentOS 7.3 with pacemaker in order to create a cluster.
> Each cluster node has a bonding interface consisting of two NICs.
> The cluster has an IPAddr2 resource configured like that:
> 
> # pcs resource show cluster_vip
> Resource: cluster_vip (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=192.168.1.3
>   Operations: start interval=0s timeout=20s (cluster_vip -start-interval-0s)
>   stop interval=0s timeout=20s (cluster_vip -stop-interval-0s)
>   monitor interval=30s (cluster_vip -monitor-interval-30s)
> 
> 
> We are running tests and want to simulate a state when the network 
> links are down.
> We are pulling both network cables from the server.
> 
> The problem is that the resource is not marked as failed, and the 
> faulted node keeps holding it and does not fail it over to the other node.
> I think that the problem is within the bond interface. The bond 
> interface is marked as UP on the OS. It even can ping itself:
> 
> # ip link show
> 2: eno3: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq 
> master
> bond1 state DOWN mode DEFAULT qlen 1000
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq 
> master
> bond1 state DOWN mode DEFAULT qlen 1000
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 9: bond1: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc 
> noqueue state DOWN mode DEFAULT qlen 1000
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
> As far as I understand the IPaddr2 RA does not check the link state of 
> the interface - What can be done?
> 
> BTW, I tried to find a solution on the bonding configuration which 
> disables the bond when no link is up, but I didn't find any.

Show the cluster status, not the network status. My guess is that you haven't 
activated stonith.

Regards,
Ulrich


> 
> Tomer.





___
Users mailing list: Users@clusterlabs.org 
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org Getting started: 
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Two nodes cluster issue

2017-08-07 Thread Tomer Azran
I read the corosync-qdevice(8) man page a couple of times, and also the RH 
documentation at 
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/s1-quorumdev-HAAR.html

I think it would be great if you could add some examples that demonstrate the 
difference between the two, and give some use cases that explain which 
algorithm is preferred in each case.

-Original Message-
From: Jan Friesse [mailto:jfrie...@redhat.com] 
Sent: Monday, August 7, 2017 2:38 PM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>; kwenn...@redhat.com; Prasad, Shashank 
<sspra...@vanu.com>
Subject: Re: [ClusterLabs] Two nodes cluster issue

Tomer Azran wrote:
> Just updating that I added another level of fencing using watchdog-fencing.
> Combined with the quorum device, this works in case of a power failure of
> both the server and the IPMI interface.
> An important note is that the stonith-watchdog-timeout must be configured in
> order for this to work.
> After reading the following great post:
> http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit , I chose the
> softdog watchdog since I don't think the ipmi watchdog would do any good in
> case the IPMI interface is down (if it is OK, it will be used as a fencing
> method).
>
> Just for documenting the solution (in case someone else needed that), the 
> configuration I added is:
> systemctl enable sbd
> pcs property set no-quorum-policy=suicide
> pcs property set stonith-watchdog-timeout=15
> pcs quorum device add model net host=qdevice algorithm=lms
>
> I just can't decide if the qdevice algorithm should be lms or ffsplit. I 
> couldn't determine the difference between them and I'm not sure which one is 
> the best when using two node cluster with qdevice and watchdog fencing.
>
> Can anyone advise on that?

I'm pretty sure you've read the corosync-qdevice(8) man page, where there is a 
quite detailed description of the algorithms, so if you were not able to 
determine the difference between them, there is something wrong and the man 
page needs improvement. What exactly were you unable to understand?

Also, for your use case with 2 nodes, both algorithms behave the same way.

Honza

>
> -Original Message-
> From: Jan Friesse [mailto:jfrie...@redhat.com]
> Sent: Tuesday, July 25, 2017 11:59 AM
> To: Cluster Labs - All topics related to open-source clustering 
> welcomed <users@clusterlabs.org>; kwenn...@redhat.com; Prasad, 
> Shashank <sspra...@vanu.com>
> Subject: Re: [ClusterLabs] Two nodes cluster issue
>
>> Tomer Azran wrote:
>>> I tend to agree with Klaus – I don't think that having a hook that 
>>> bypasses stonith is the right way. It is better to not use stonith at all.
>>> I think I will try to use an iSCSI target on my qdevice and set SBD 
>>> to use it.
>>> I still don't understand why qdevice can't take the place of SBD with 
>>> shared storage; correct me if I'm wrong, but it looks like both of 
>>> them are there for the same reason.
>>
>> Qdevice is there to be a third-side arbiter who decides which partition 
>> is quorate. It can also be seen as a quorum-only node. So for a two-node 
>> cluster it can be viewed as a third node (even though it is quite 
>> special because it cannot run resources). It is not doing fencing.
>>
>> SBD is a fencing device. It uses a disk as a third-side arbiter.
>
> I've talked with Klaus and he told me that 7.3 is not using the disk as a 
> third-side arbiter, so sorry for the confusion.
>
> You should however still be able to use sbd for checking if pacemaker is 
> alive and if the partition has quorum - otherwise the watchdog kills the 
> node. So qdevice will give you "3rd" node and sbd fences unquorate partition.
>
> Or (as mentioned previously) you can use fabric fencing.
>
> Regards,
> Honza
>
>>
>>
>>>
>>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>>> Sent: Monday, July 24, 2017 9:01 PM
>>> To: Cluster Labs - All topics related to open-source clustering 
>>> welcomed <users@clusterlabs.org>; Prasad, Shashank 
>>> <sspra...@vanu.com>
>>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>>
>>> On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
>>> Sometimes IPMI fence devices use shared power of the node, and it 
>>> cannot be avoided.
>>> In such scenarios the HA cluster is NOT able to handle the power 
>>> failure of a node, since the power is shared with its own fence device.
>>> The failure of IPMI based fencing can also exist due to other 
>>>

Re: [ClusterLabs] IPaddr2 RA and bonding

2017-08-07 Thread Tomer Azran
I don't want to use this approach since I don't want to depend on pinging
another host or a couple of hosts.
Is there any other solution?
I'm thinking of writing a simple script that will take a bond down using the
ifdown command when there are no slaves available and put it in /sbin/ifdown-local.
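
A minimal sketch of that idea (assuming bond1 and the standard /sys/class/net
layout; just the approach described above, not an existing agent):

 #!/bin/sh
 # Take the bond down when none of its slaves reports carrier.
 BOND=bond1
 up=0
 for s in $(cat /sys/class/net/$BOND/bonding/slaves); do
     [ "$(cat /sys/class/net/$s/carrier 2>/dev/null)" = "1" ] && up=1
 done
 [ "$up" -eq 0 ] && ifdown "$BOND"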


-Original Message-
From: Ken Gaillot [mailto:kgail...@redhat.com] 
Sent: Monday, August 7, 2017 7:14 PM
To: Cluster Labs - All topics related to open-source clustering welcomed 
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] IPaddr2 RA and bonding

On Mon, 2017-08-07 at 10:02 +, Tomer Azran wrote:
> Hello All,
> 
>  
> 
> We are using CentOS 7.3 with pacemaker in order to create a cluster.
> 
> Each cluster node has a bonding interface consisting of two NICs.
> 
> The cluster has an IPAddr2 resource configured like that:
> 
>  
> 
> # pcs resource show cluster_vip
> 
> Resource: cluster_vip (class=ocf provider=heartbeat type=IPaddr2)
> 
>   Attributes: ip=192.168.1.3
> 
>   Operations: start interval=0s timeout=20s (cluster_vip
> -start-interval-0s)
> 
>   stop interval=0s timeout=20s (cluster_vip
> -stop-interval-0s)
> 
>   monitor interval=30s (cluster_vip -monitor-interval-30s)
> 
>  
> 
>  
> 
> We are running tests and want to simulate a state when the network 
> links are down.
> 
> We are pulling both network cables from the server.
> 
>  
> 
> The problem is that the resource is not marked as failed, and the 
> faulted node keeps holding it and does not fail it over to the other 
> node.
> 
> I think that the problem is within the bond interface. The bond 
> interface is marked as UP on the OS. It even can ping itself:
> 
>  
> 
> # ip link show
> 
> 2: eno3: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq 
> master bond1 state DOWN mode DEFAULT qlen 1000
> 
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
> 3: eno4: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq 
> master bond1 state DOWN mode DEFAULT qlen 1000
> 
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
> 9: bond1: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc 
> noqueue state DOWN mode DEFAULT qlen 1000
> 
> link/ether 00:1e:67:f6:5a:8a brd ff:ff:ff:ff:ff:ff
> 
>  
> 
> As far as I understand the IPaddr2 RA does not check the link state of 
> the interface – What can be done?

You are correct. The IP address itself *is* up, even if the link is down, and 
it can be used locally on that host.

If you want to monitor connectivity to other hosts, you have to do that 
separately. The most common approach is to use the ocf:pacemaker:ping resource. 
See:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_moving_resources_due_to_connectivity_changes
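
A typical setup along those lines looks something like this (resource name,
ping target and threshold purely illustrative):

 # pcs resource create gw-ping ocf:pacemaker:ping host_list=192.168.1.1 dampen=5s multiplier=1000 op monitor interval=10s --clone
 # pcs constraint location cluster_vip rule score=-INFINITY pingd lt 1 or not_defined pingd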
 
> BTW, I tried to find a solution on the bonding configuration which 
> disables the bond when no link is up, but I didn't find any.
> 
>  
> 
> Tomer.
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org Getting started: 
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

--
Ken Gaillot <kgail...@redhat.com>





___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] HAProxy resource agent

2018-04-13 Thread Tomer Azran
Hello,

I'm planning to install an active\active HAProxy cluster on CentOS 7.
I didn't find any RA for HAProxy.
I found some on the net but I'm not sure if I need them. For example: 
https://raw.githubusercontent.com/thisismitch/cluster-agents/master/haproxy
I can always use the systemd service RA.
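
For example, something like the following, cloned for active/active (the
systemd class just tracks the unit's state, while a dedicated RA could add
deeper health checks):

 # pcs resource create haproxy systemd:haproxy op monitor interval=10s --clone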

What is your recommendation?

Thanks,
Tomer.
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] IPaddr2 RA and multicast mac

2019-09-03 Thread Tomer Azran
Hello,

When using IPaddr2 RA in order to set a cloned IP address resource:

pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=10.0.0.100 iflabel=vip1 
cidr_netmask=24 flush_routes=true op monitor interval=30s
pcs resource clone vip1 clone-max=2 clone-node-max=2 globally-unique=true

Then the cluster sets up the iptables CLUSTERIP module, and the result is 
something like this:

# iptables -L -n
.
.
.
CLUSTERIP  all  --  0.0.0.0/0            10.0.0.100           CLUSTERIP 
hashmode=sourceip-sourceport clustermac=A1:DE:DE:89:A6:FE total_nodes=2 
local_node=1 hash_init=0
.
.
.

The problem is that the RA picks a clustermac address which is not in the 
multicast range (it must start with 01:00:5E).
When not using a multicast address, the traffic is treated as broadcast, which 
is bad.

I found that you can set a multicast mac if you use the "mac" parameter, which 
solves the issue.
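
For example (the MAC below is just an arbitrary address from the 01:00:5E
multicast range):

 # pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=10.0.0.100 iflabel=vip1 cidr_netmask=24 flush_routes=true mac=01:00:5E:01:02:03 op monitor interval=30s
 # pcs resource clone vip1 clone-max=2 clone-node-max=2 globally-unique=true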

Can the RA default be changed to use multicast range?
In addition, I think that you might need to update the documentation 
(https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_clone_the_ip_address.html)
and instruct users to use the mac parameter when creating the resource. The 
documentation should also instruct the user to enable multicast traffic on the 
network, which is not enabled by default.

Tomer Azran
IDM & LINUX Professional Services

tomer.az...@edp.co.il
m: +972-52-6389961
t: +972-3-6438222
f: +972-3-6438004

www.edp.co.il

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs] IPAddr2 RA and CLUSTERIP local_node

2019-09-03 Thread Tomer Azran
Hello,

When using IPaddr2 RA in order to set a cloned IP address resource:

pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=10.0.0.100 iflabel=vip1 
cidr_netmask=24 flush_routes=true op monitor interval=30s
pcs resource clone vip1 clone-max=2 clone-node-max=2 globally-unique=true

Then the cluster sets up the iptables CLUSTERIP module, and the result is 
something like this:

# iptables -L -n
.
.
.
CLUSTERIP  all  --  0.0.0.0/0            10.0.0.100           CLUSTERIP 
hashmode=sourceip-sourceport clustermac=A1:DE:DE:89:A6:FE total_nodes=2 
local_node=2 hash_init=0
.
.
.

The problem is that on both nodes, I can see that the local_node value on the 
CLUSTERIP is the same ("2").
I looked at the RA source code at 
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/IPaddr2
and found that the local_node parameter is set to this value:

IP_INC_NO=`expr ${OCF_RESKEY_CRM_meta_clone:-0} + 1`
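
(For context: OCF_RESKEY_CRM_meta_clone is the zero-based clone instance number
that Pacemaker passes to each clone instance, so the intended mapping is roughly:

 instance vip1:0  ->  IP_INC_NO=1  ->  local_node=1
 instance vip1:1  ->  IP_INC_NO=2  ->  local_node=2)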

Can you think of a reason why my RA always sets the local_node to "2"?


Tomer Azran
IDM & LINUX Professional Services

tomer.az...@edp.co.il
m: +972-52-6389961
t: +972-3-6438222
f: +972-3-6438004 

www.edp.co.il

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/