Re: [ClusterLabs] pcs create master/slave resource doesn't work (Ken Gaillot)

2017-11-30 Thread Hui Xiang
Hi all,

  I am using the ovndb-servers OCF agent [1], which is a multi-state
resource. When I create it (please see my previous email), the monitor is
called only once and the start operation is never called. According to the
description below, the single monitor call returned OCF_NOT_RUNNING; should
Pacemaker decide to execute the start action based on this return code? Is
there any way to check what the next action will be? Currently nothing
happens in my environment and I have tried every way I know of to debug it,
with no luck. Could anyone help? Thank you very much.

Monitor Return Code    Description
OCF_NOT_RUNNING        Stopped
OCF_SUCCESS            Running (Slave)
OCF_RUNNING_MASTER     Running (Master)
OCF_FAILED_MASTER      Failed (Master)
Other                  Failed (Slave)


[1]
https://github.com/openvswitch/ovs/blob/master/ovn/utilities/ovndb-servers.ocf
Hui.



On Thu, Nov 30, 2017 at 6:39 PM, Hui Xiang  wrote:

> The really weird thing is that the monitor is only called once rather than
> repeatedly as expected; where should I check for this?
>
> On Thu, Nov 30, 2017 at 4:14 PM, Hui Xiang  wrote:
>
>> Thanks Ken very much for your helpful information.
>>
>> I am now blocked: I can't see the Pacemaker DC take any further
>> start/promote etc. action on my resource agents, and no helpful logs were found.
>>
>> So my first question is: in what kind of situation will the DC decide to
>> call the start action? Does the monitor operation need to return
>> OCF_SUCCESS? In my case it returns OCF_NOT_RUNNING, and the monitor
>> operation is not called any more, which seems wrong, as I expected it to
>> be called at the configured interval.
>>
>> The resource agent monitor logic:
>> In the xx_monitor function it calls xx_update, and it always hits
>> "$CRM_MASTER -D;;". What does that usually mean? Will it stop the
>> start operation from being called?
>>
>> ovsdb_server_master_update() {
>> ocf_log info "ovsdb_server_master_update: $1}"
>>
>> case $1 in
>> $OCF_SUCCESS)
>> $CRM_MASTER -v ${slave_score};;
>> $OCF_RUNNING_MASTER)
>> $CRM_MASTER -v ${master_score};;
>> #*) $CRM_MASTER -D;;
>> esac
>> ocf_log info "ovsdb_server_master_update end}"
>> }
>>
>> ovsdb_server_monitor() {
>> ocf_log info "ovsdb_server_monitor"
>> ovsdb_server_check_status
>> rc=$?
>>
>> ovsdb_server_master_update $rc
>> ocf_log info "monitor is going to return $rc"
>> return $rc
>> }
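For reference, CRM_MASTER in agents like this is typically a wrapper around the crm_master utility (for example something like CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot"): "-v <score>" sets this node's promotion score and "-D" deletes it, so a node whose score has been deleted will not be considered for promotion. A minimal sketch of how the update function could handle the not-running case explicitly (an illustration only, not the upstream agent's exact code):

ovsdb_server_master_update() {
    ocf_log info "ovsdb_server_master_update: $1"
    case $1 in
        $OCF_SUCCESS)        $CRM_MASTER -v ${slave_score};;
        $OCF_RUNNING_MASTER) $CRM_MASTER -v ${master_score};;
        # anything else (including OCF_NOT_RUNNING): clear this node's
        # promotion score so it is not picked as master
        *)                   $CRM_MASTER -D;;
    esac
}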
>>
>>
>> Below is my cluster configuration:
>>
>> 1. First I have an vip set.
>> [root@node-1 ~]# pcs resource show
>>  vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
>>
>> 2. Use pcs to create ovndb-servers and constraint
>> [root@node-1 ~]# pcs resource create tst-ovndb ocf:ovn:ovndb-servers
>> manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
>> sb_master_port=6642 master
>>  ([root@node-1 ~]# pcs resource meta tst-ovndb-master notify=true
>>   Error: unable to find a resource/clone/master/group:
>> tst-ovndb-master) ## returned error, so I changed into below command.
>> [root@node-1 ~]# pcs resource master tst-ovndb-master tst-ovndb
>> notify=true
>> [root@node-1 ~]# pcs constraint colocation add master tst-ovndb-master
>> with vip__management_old
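As an aside, with pcs of that vintage it should also be possible to create the resource and its master wrapper in one step, after which the "pcs resource meta" command from the transcript above should succeed, since the master resource exists at that point. A sketch only, untested against this exact pcs build:

pcs resource create tst-ovndb ocf:ovn:ovndb-servers manage_northd=yes \
    master_ip=192.168.0.2 nb_master_port=6641 sb_master_port=6642 --master
pcs resource meta tst-ovndb-master notify=true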
>>
>> 3. pcs status
>> [root@node-1 ~]# pcs status
>>  vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
>>  Master/Slave Set: tst-ovndb-master [tst-ovndb]
>>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>
>> 4. pcs resource show XXX
>> [root@node-1 ~]# pcs resource show  vip__management_old
>>  Resource: vip__management_old (class=ocf provider=es type=ns_IPaddr2)
>>   Attributes: nic=br-mgmt base_veth=br-mgmt-hapr ns_veth=hapr-m
>> ip=192.168.0.2 iflabel=ka cidr_netmask=24 ns=haproxy gateway=none
>> gateway_metric=0 iptables_start_rules=false iptables_stop_rules=false
>> iptables_comment=default-comment
>>   Meta Attrs: migration-threshold=3 failure-timeout=60
>> resource-stickiness=1
>>   Operations: monitor interval=3 timeout=30 (vip__management_old-monitor-3
>> )
>>   start interval=0 timeout=30 (vip__management_old-start-0)
>>   stop interval=0 timeout=30 (vip__management_old-stop-0)
>> [root@node-1 ~]# pcs resource show tst-ovndb-master
>>  Master: tst-ovndb-master
>>   Meta Attrs: notify=true
>>   Resource: tst-ovndb (class=ocf provider=ovn type=ovndb-servers)
>>Attributes: manage_northd=yes master_ip=192.168.0.2
>> nb_master_port=6641 sb_master_port=6642
>>Operations: start interval=0s timeout=30s (tst-ovndb-start-timeout-30s)
>>stop interval=0s timeout=20s (tst-ovndb-stop-timeout-20s)
>>promote interval=0s timeout=50s
>> (tst-ovndb-promote-timeout-50s)
>>demote interval=0s timeout=50s
>> (tst-ovndb-demote-timeout-50s)
>>monitor interval=30s timeout=20s
>> (tst-ovndb-monitor-interval-30s)
>>  

Re: [ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-11-30 Thread Andrei Borzenkov
On 30.11.2017 16:11, Klaus Wenninger wrote:
> On 11/30/2017 01:41 PM, Ulrich Windl wrote:
>>
> "Gao,Yan"  schrieb am 30.11.2017 um 11:48 in Nachricht
>> :
>>> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
 SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
 VM on VSphere using shared VMDK as SBD. During basic tests by killing
 corosync and forcing STONITH pacemaker was not started after reboot.
 In logs I see during boot

 Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
 just fenced by sapprod01p for sapprod01p
 Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
 process (3151) can no longer be respawned,
 Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down 
>>> Pacemaker
 SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
 stonith with SBD always takes msgwait (at least, visually host is not
 declared as OFFLINE until 120s passed). But VM reboots lightning fast
 and is up and running long before timeout expires.
>> As msgwait was intended for the message to arrive, and not for the reboot 
>> time (I guess), this just shows a fundamental problem in SBD design: Receipt 
>> of the fencing command is not confirmed (other than by seeing the 
>> consequences of its execution).
> 
> The 2 x msgwait is not for confirmations but for writing the poison pill
> and for having it read by the target side.

Yes, of course, but that's not what Ulrich likely intended to say.
msgwait must account for worst-case storage path latency, while in
normal cases it happens much faster. If the fenced node could acknowledge
having been killed after reboot, the stonith agent could return success
much earlier.



Re: [ClusterLabs] Adding Tomcat as a resource to a Cluster on CentOS 7

2017-11-30 Thread Oyvind Albrigtsen

Tomcat can be very slow at startup depending on the modules you use,
so you can either disable modules you aren't using to make it start
faster, or set a higher start timeout via "pcs resource  op
start interval=".
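For example, using the resource name from the thread below and a purely illustrative timeout value, the start timeout could be raised with something like:

pcs resource update tomcat_OuterWeb op start interval=0s timeout=240s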

On 30/11/17 13:26 +, Sean Beeson wrote:

Hi, list.

This is a pretty basic question. I have gone through what I could find on
setting up Tomcat service as a resource to a cluster, but did not find
exactly the issue I am having. Sorry, if it has been covered before.

I am attempting this on centos-release-7-4.1708.el7.centos.x86_64.
The pcs I have installed is pcs-0.9.158-6.el7.centos.x86_64
The resource-agents installed is resource-agents-3.9.5-105.el7_4.2.x86_64

I have DRBD, MySQL, and a virtual IP running spectacularly well; they
fail over perfectly and do exactly what I want them to. I can add Tomcat as a
resource just fine, but it never starts and I cannot find anything in any
log file that indicates why. Pcs does at some point know to check on it,
but simply says Tomcat is not running. If I run everything manually on a node
in the cluster I can get Tomcat to start with systemctl. Here is how I am
trying to configure it.

[root@centos7-ha-lab-01 ~]# pcs status
Cluster name: ha-cluster
Stack: corosync
Current DC: centos7-ha-lab-02-cr (version 1.1.16-12.el7_4.4-94ff4df) -
partition with quorum
Last updated: Thu Nov 30 21:03:36 2017
Last change: Thu Nov 30 20:53:37 2017 by root via cibadmin on
centos7-ha-lab-01-cr

2 nodes configured
6 resources configured

Online: [ centos7-ha-lab-01-cr centos7-ha-lab-02-cr ]

Full list of resources:

Master/Slave Set: DRBD_data_clone [DRBD_data]
Masters: [ centos7-ha-lab-01-cr ]
Slaves: [ centos7-ha-lab-02-cr ]
fsDRBD_data(ocf::heartbeat:Filesystem):Started centos7-ha-lab-01-cr
OuterDB_Service(systemd:mysqld):Started centos7-ha-lab-01-cr
OuterDB_VIP(ocf::heartbeat:IPaddr2):Started centos7-ha-lab-01-cr
tomcat_OuterWeb(ocf::heartbeat:tomcat):Stopped

Failed Actions:
* tomcat_OuterWeb_start_0 on centos7-ha-lab-01-cr 'unknown error' (1):
call=67, status=Timed Out, exitreason='none',
   last-rc-change='Thu Nov 30 20:56:22 2017', queued=0ms, exec=180003ms
* tomcat_OuterWeb_start_0 on centos7-ha-lab-02-cr 'unknown error' (1):
call=57, status=Timed Out, exitreason='none',
   last-rc-change='Thu Nov 30 20:53:23 2017', queued=0ms, exec=180003ms

Daemon Status:
 corosync: active/enabled
 pacemaker: active/enabled
 pcsd: active/enabled

I have tried with and without tomcat_name=tomcat_OuterWeb and tomcat and
root for tomcat_user=. Neither work.

Here is the command I am using to add it.

pcs resource create tomcat_OuterWeb ocf:heartbeat:tomcat
java_home="/opt/java/jre1.7.0_80" catalina_home="/opt/tomcat7"
catalina_opts="-Dbuild.compiler.emacs=true -Dfile.encoding=UTF-8
-Djava.util.logging.config.file=/opt/tomcat7/conf/log4j.properties
-Dlog4j.configuration=file:/opt/tomcat7/conf/log4j.properties -Xms1024m
-Xmx1024m -XX:PermSize=256m -XX:MaxPermSize=512m" tomcat_user="root" op
monitor interval="15s" op start timeout="180s"

I have tried also the most basic.
pcs resource create tomcat_OuterWeb ocf:heartbeat:tomcat
java_home="/opt/java/jre1.7.0_80" catalina_home="/opt/tomcat7"
tomcat_name="tomcat_OuterWeb" tomcat_user="root" op monitor interval="15s"
op start timeout="180s"

In other examples I have seen, they usually put "params" before the options in
these commands to add Tomcat as a resource, but when I use that it tells me
it is an unrecognized option, and it then accepts the options without it
just fine. I was led to think this was perhaps a difference in the version of
the resource-agents.

Any idea why I cannot get Tomcat to start, or a lead on what logging I
could look at to understand why it is failing, would be great. Nothing shows
in messages, catalina.out, pcsd.log, nor the resource
log (tomcat_OuterWeb.log). However, it does create the resource log, but it
only has this in it, which seems to be false:

2017/11/30 20:50:22: start ===
2017/11/30 20:53:22: stop  ###
2017/11/30 20:56:22: start ===
2017/11/30 20:59:22: stop  ###

The only other thing is: * tomcat_OuterWeb_start_0 on centos7-ha-lab-01-cr
'unknown error' (1): call=67, status=Timed Out, exitreason='none',

Again, any ideas would be appreciated. Thank you.

Kind regards,

Sean








[ClusterLabs] Adding Tomcat as a resource to a Cluster on CentOS 7

2017-11-30 Thread Sean Beeson
Hi, list.

This is a pretty basic question. I have gone through what I could find on
setting up Tomcat service as a resource to a cluster, but did not find
exactly the issue I am having. Sorry, if it has been covered before.

I am attempting this on centos-release-7-4.1708.el7.centos.x86_64.
The pcs I have installed is pcs-0.9.158-6.el7.centos.x86_64
The resource-agents installed is resource-agents-3.9.5-105.el7_4.2.x86_64

I have DRBD, MySQL, and a virtual IP running spectacularly well; they
fail over perfectly and do exactly what I want them to. I can add Tomcat as a
resource just fine, but it never starts and I cannot find anything in any
log file that indicates why. Pcs does at some point know to check on it,
but simply says Tomcat is not running. If I run everything manually on a node
in the cluster I can get Tomcat to start with systemctl. Here is how I am
trying to configure it.

[root@centos7-ha-lab-01 ~]# pcs status
Cluster name: ha-cluster
Stack: corosync
Current DC: centos7-ha-lab-02-cr (version 1.1.16-12.el7_4.4-94ff4df) -
partition with quorum
Last updated: Thu Nov 30 21:03:36 2017
Last change: Thu Nov 30 20:53:37 2017 by root via cibadmin on
centos7-ha-lab-01-cr

2 nodes configured
6 resources configured

Online: [ centos7-ha-lab-01-cr centos7-ha-lab-02-cr ]

Full list of resources:

 Master/Slave Set: DRBD_data_clone [DRBD_data]
 Masters: [ centos7-ha-lab-01-cr ]
 Slaves: [ centos7-ha-lab-02-cr ]
 fsDRBD_data(ocf::heartbeat:Filesystem):Started centos7-ha-lab-01-cr
 OuterDB_Service(systemd:mysqld):Started centos7-ha-lab-01-cr
 OuterDB_VIP(ocf::heartbeat:IPaddr2):Started centos7-ha-lab-01-cr
 tomcat_OuterWeb(ocf::heartbeat:tomcat):Stopped

Failed Actions:
* tomcat_OuterWeb_start_0 on centos7-ha-lab-01-cr 'unknown error' (1):
call=67, status=Timed Out, exitreason='none',
last-rc-change='Thu Nov 30 20:56:22 2017', queued=0ms, exec=180003ms
* tomcat_OuterWeb_start_0 on centos7-ha-lab-02-cr 'unknown error' (1):
call=57, status=Timed Out, exitreason='none',
last-rc-change='Thu Nov 30 20:53:23 2017', queued=0ms, exec=180003ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

I have tried with and without tomcat_name=tomcat_OuterWeb and tomcat and
root for tomcat_user=. Neither work.

Here is the command I am using to add it.

pcs resource create tomcat_OuterWeb ocf:heartbeat:tomcat
java_home="/opt/java/jre1.7.0_80" catalina_home="/opt/tomcat7"
catalina_opts="-Dbuild.compiler.emacs=true -Dfile.encoding=UTF-8
-Djava.util.logging.config.file=/opt/tomcat7/conf/log4j.properties
-Dlog4j.configuration=file:/opt/tomcat7/conf/log4j.properties -Xms1024m
-Xmx1024m -XX:PermSize=256m -XX:MaxPermSize=512m" tomcat_user="root" op
monitor interval="15s" op start timeout="180s"

I have tried also the most basic.
pcs resource create tomcat_OuterWeb ocf:heartbeat:tomcat
java_home="/opt/java/jre1.7.0_80" catalina_home="/opt/tomcat7"
tomcat_name="tomcat_OuterWeb" tomcat_user="root" op monitor interval="15s"
op start timeout="180s"

In other examples I have seen, they usually put "params" before the options in
these commands to add Tomcat as a resource, but when I use that it tells me
it is an unrecognized option, and it then accepts the options without it
just fine. I was led to think this was perhaps a difference in the version of
the resource-agents.

Any idea why I cannot get Tomcat to start, or a lead on what logging I
could look at to understand why it is failing, would be great. Nothing shows
in messages, catalina.out, pcsd.log, nor the resource
log (tomcat_OuterWeb.log). However, it does create the resource log, but it
only has this in it, which seems to be false:

2017/11/30 20:50:22: start ===
2017/11/30 20:53:22: stop  ###
2017/11/30 20:56:22: start ===
2017/11/30 20:59:22: stop  ###

The only other thing is: * tomcat_OuterWeb_start_0 on centos7-ha-lab-01-cr
'unknown error' (1): call=67, status=Timed Out, exitreason='none',

Again, any ideas would be appreciated. Thank you.

Kind regards,

Sean


Re: [ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-11-30 Thread Klaus Wenninger
On 11/30/2017 01:41 PM, Ulrich Windl wrote:
>
 "Gao,Yan"  schrieb am 30.11.2017 um 11:48 in Nachricht
> :
>> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
>>> VM on VSphere using shared VMDK as SBD. During basic tests by killing
>>> corosync and forcing STONITH pacemaker was not started after reboot.
>>> In logs I see during boot
>>>
>>> Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
>>> just fenced by sapprod01p for sapprod01p
>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
>>> process (3151) can no longer be respawned,
>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down 
>> Pacemaker
>>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>>> stonith with SBD always takes msgwait (at least, visually host is not
>>> declared as OFFLINE until 120s passed). But VM reboots lightning fast
>>> and is up and running long before timeout expires.
> As msgwait was intended for the message to arrive, and not for the reboot 
> time (I guess), this just shows a fundamental problem in SBD design: Receipt 
> of the fencing command is not confirmed (other than by seeing the 
> consequences of its execution).

The 2 x msgwait is not for confirmations but for writing the poison pill
and for having it read by the target side.
Thus it is assumed that within a single msgwait the data is written and
confirmed. And if the target side doesn't manage to do the read within
that time, it will suicide via the watchdog.
Thus a working watchdog is a fundamental precondition for sbd to work
properly, and storage solutions that do caching, replication and the like
without proper syncing are just not suitable for sbd.

Regards,
Klaus

>
> So the fencing node will see the other host is down (on the network), but it 
> won't believe it until SBD msgwait is over. OTOH if your msgwait is very low, 
> and the storage has a problem (exceeding msgwait), the node will assume a 
> successful fencing when in fact it didn't complete.
>
> So maybe there should be two timeouts: One for the command to be delivered 
> (without needing a confirmation, but the confirmation could shorten the 
> wait), and another for executing the command (how long will it take from 
> receipt of the command until the host is definitely down). Again a 
> confirmation could stop waiting before the timeout is reached.
>
> Regards,
> Ulrich
>
>
>>> I think I have seen similar report already. Is it something that can
>>> be fixed by SBD/pacemaker tuning?
>> SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.
>>
>> Regards,
>>Yan
>>
>>> I can provide full logs tomorrow if needed.
>>>
>>> TIA
>>>
>>> -andrei
>>>
>>> ___
>>> Users mailing list: Users@clusterlabs.org 
>>> http://lists.clusterlabs.org/mailman/listinfo/users 
>>>
>>> Project Home: http://www.clusterlabs.org 
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>> Bugs: http://bugs.clusterlabs.org 
>>>
>>>
>>




[ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-11-30 Thread Ulrich Windl


>>> "Gao,Yan"  schrieb am 30.11.2017 um 11:48 in Nachricht
:
> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
>> VM on VSphere using shared VMDK as SBD. During basic tests by killing
>> corosync and forcing STONITH pacemaker was not started after reboot.
>> In logs I see during boot
>> 
>> Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
>> just fenced by sapprod01p for sapprod01p
>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
>> process (3151) can no longer be respawned,
>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down 
> Pacemaker
>> 
>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>> stonith with SBD always takes msgwait (at least, visually host is not
>> declared as OFFLINE until 120s passed). But VM reboots lightning fast
>> and is up and running long before timeout expires.

As msgwait was intended for the message to arrive, and not for the reboot time 
(I guess), this just shows a fundamental problem in SBD design: Receipt of the 
fencing command is not confirmed (other than by seeing the consequences of its 
execution).

So the fencing node will see the other host is down (on the network), but it 
won't believe it until SBD msgwait is over. OTOH if your msgwait is very low, 
and the storage has a problem (exceeding msgwait), the node will assume a 
successful fencing when in fact it didn't complete.

So maybe there should be two timeouts: One for the command to be delivered 
(without needing a confirmation, but the confirmation could shorten the wait), 
and another for executing the command (how long will it take from receipt of 
the command until the host is definitely down). Again a confirmation could stop 
waiting before the timeout is reached.

Regards,
Ulrich


>> 
>> I think I have seen similar report already. Is it something that can
>> be fixed by SBD/pacemaker tuning?
> SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.
> 
> Regards,
>Yan
> 
>> 
>> I can provide full logs tomorrow if needed.
>> 
>> TIA
>> 
>> -andrei
>> 
>> ___
>> Users mailing list: Users@clusterlabs.org 
>> http://lists.clusterlabs.org/mailman/listinfo/users 
>> 
>> Project Home: http://www.clusterlabs.org 
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> Bugs: http://bugs.clusterlabs.org 
>> 
>> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://lists.clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org




[ClusterLabs] Antw: pacemaker self stonith

2017-11-30 Thread Ulrich Windl
Hi!

I have two suggestions:
1) sbd can do that (if sbd works in your environment).
2) If there is a network problem _and_ the client can detect it, you don't
need to kill the node; just try to migrate the affected resources. If that
fails, the other node(s) will take care of fencing.

Regards,
Ulrich


>>> Hauke Homburg wrote on 30.11.2017 at 11:41 in
message :
> Hello list,
>
> I am looking for a way for a pacemaker node to STONITH itself.
>
> The reason is that I need to check whether the pacemaker node can still reach
> the network outside the local network, because of a network outage.
>
> I can't connect to an iLO interface or the like.
>
> I am considering a bash script:
>
> if ! ping -c 1 8.8.8.8 >/dev/null; then $Hardware_reset; fi
>
> run from crontab every minute.
>
> Thanks for the help
> 
> Hauke
> 
> 
> 
> -- 
> www.w3-creative.de 
> 
> www.westchat.de 
> 
> https://friendica.westchat.de/profile/hauke 
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://lists.clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org



Re: [ClusterLabs] questions about startup fencing

2017-11-30 Thread Adam Spiers

Ken Gaillot  wrote:

On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote:

Hi all,

A colleague has been valiantly trying to help me belatedly learn
about
the intricacies of startup fencing, but I'm still not fully
understanding some of the finer points of the behaviour.

The documentation on the "startup-fencing" option[0] says

Advanced Use Only: Should the cluster shoot unseen nodes? Not
using the default is very unsafe!

and that it defaults to TRUE, but doesn't elaborate any further:

https://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacema
ker_Explained/s-cluster-options.html

Let's imagine the following scenario:

- We have a 5-node cluster, with all nodes running cleanly.

- The whole cluster is shut down cleanly.

- The whole cluster is then started up again.  (Side question: what
  happens if the last node to shut down is not the first to start up?
  How will the cluster ensure it has the most recent version of the
  CIB?  Without that, how would it know whether the last man standing
  was shut down cleanly or not?)


Of course, the cluster can't know what CIB version nodes it doesn't see
have, so if a set of nodes is started with an older version, it will go
with that.


Right, that's what I expected.


However, a node can't do much without quorum, so it would be difficult
to get in a situation where CIB changes were made with quorum before
shutdown, but none of those nodes are present at the next start-up with
quorum.

In any case, when a new node joins a cluster, the nodes do compare CIB
versions. If the new node has a newer CIB, the cluster will use it. If
other changes have been made since then, the newest CIB wins, so one or
the other's changes will be lost.


Ahh, that's interesting.  Based on reading

   
https://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacemaker_Explained/ch03.html#_cib_properties

whichever node has the highest (admin_epoch, epoch, num_updates) tuple
will win, so normally in this scenario it would be the epoch which
decides it, i.e. whichever node had the most changes since the last
time the conflicting nodes shared the same config - right?

And if that would choose the wrong node, admin_epoch can be set
manually to override that decision?
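For what it's worth, admin_epoch can be bumped by hand with cibadmin; treat the exact invocation below as an assumption to verify rather than a recipe:

cibadmin --modify --xml-text '<cib admin_epoch="42"/>'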


Whether missing nodes were shut down cleanly or not relates to your
next question ...


- 4 of the nodes boot up fine and rejoin the cluster within the
  dc-deadtime interval, forming a quorum, but the 5th doesn't.

IIUC, with startup-fencing enabled, this will result in that 5th node
automatically being fenced.  If I'm right, is that really *always*
necessary?


It's always safe. :-) As you mentioned, if the missing node was the
last one alive in the previous run, the cluster can't know whether it
shut down cleanly or not. Even if the node was known to shut down
cleanly in the last run, the cluster still can't know whether the node
was started since then and is now merely unreachable. So, fencing is
necessary to ensure it's not accessing resources.


I get that, but I was questioning the "necessary to ensure it's not
accessing resources" part of this statement.  My point is that
sometimes this might be overkill, because sometimes we might be able to
discern through other methods that there are no resources we need to
worry about potentially conflicting with what we want to run.  That's
why I gave the stateless clones example.


The same scenario is why a single node can't have quorum at start-up in
a cluster with "two_node" set. Both nodes have to see each other at
least once before they can assume it's safe to do anything.


Yep.


Let's suppose further that the cluster configuration is such that no
stateful resources which could potentially conflict with other nodes
will ever get launched on that 5th node.  For example it might only
host stateless clones, or resources with require=nothing set, or it
might not even host any resources at all due to some temporary
constraints which have been applied.

In those cases, what is to be gained from fencing?  The only thing I
can think of is that using (say) IPMI to power-cycle the node *might*
fix whatever issue was preventing it from joining the cluster.  Are
there any other reasons for fencing in this case?  It wouldn't help
avoid any data corruption, at least.


Just because constraints are telling the node it can't run a resource
doesn't mean the node isn't malfunctioning and running it anyway. If
the node can't tell us it's OK, we have to assume it's not.


Sure, but even if it *is* running it, if it's not conflicting with
anything or doing any harm, is it really always better to fence
regardless?

Disclaimer: to a certain extent I'm playing devil's advocate here to
stimulate a closer (re-)examination of the axiom we've grown so used
to over the years that if we don't know what a node is doing, we
should fence it.  I'm not necessarily arguing that fencing is wrong
here, but I think it's healthy to occasionally go back to first

Re: [ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

2017-11-30 Thread Andrei Borzenkov
On Thu, Nov 30, 2017 at 1:48 PM, Gao,Yan  wrote:
> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>>
>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
>> VM on VSphere using shared VMDK as SBD. During basic tests by killing
>> corosync and forcing STONITH pacemaker was not started after reboot.
>> In logs I see during boot
>>
>> Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
>> just fenced by sapprod01p for sapprod01p
>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
>> process (3151) can no longer be respawned,
>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down
>> Pacemaker
>>
>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>> stonith with SBD always takes msgwait (at least, visually host is not
>> declared as OFFLINE until 120s passed). But VM reboots lightning fast
>> and is up and running long before timeout expires.
>>
>> I think I have seen similar report already. Is it something that can
>> be fixed by SBD/pacemaker tuning?
>
> SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.
>

Sounds promising. Is it enough? The comment in /etc/sysconfig/sbd says
"Whether to delay after starting sbd on boot for "msgwait" seconds.",
but as I understand it, the stonith agent timeout is 2 * msgwait.
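For reference, a minimal /etc/sysconfig/sbd along those lines might look like the sketch below (device path and values are illustrative only; the on-disk watchdog/msgwait timeouts themselves are set when the device is initialized, e.g. "sbd -d <device> -1 60 -4 120 create"):

# /etc/sysconfig/sbd -- illustrative sketch only
SBD_DEVICE="/dev/disk/by-id/example-shared-disk"
SBD_WATCHDOG_DEV="/dev/watchdog"
SBD_DELAY_START="yes"   # delay sbd startup so a quickly rebooted node cannot rejoin too early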



Re: [ClusterLabs] questions about startup fencing

2017-11-30 Thread Andrei Borzenkov
On Thu, Nov 30, 2017 at 1:39 PM, Gao,Yan  wrote:
> On 11/30/2017 09:14 AM, Andrei Borzenkov wrote:
>>
>> On Wed, Nov 29, 2017 at 6:54 PM, Ken Gaillot  wrote:
>>>
>>>
>>> The same scenario is why a single node can't have quorum at start-up in
>>> a cluster with "two_node" set. Both nodes have to see each other at
>>> least once before they can assume it's safe to do anything.
>>>
>>
>> Unless we set no-quorum-policy=ignore, in which case it will proceed
>> after fencing another node. As far as I understand, this is the only
>> way to get the number of active cluster nodes below quorum, right?
>
> To be safe, "two_node: 1" automatically enables "wait_for_all". Of course
> one can explicitly disable "wait_for_all" if they know what they are doing.
>

Well ...

ha1:~ # crm corosync status
Printing ring status.
Local node ID 1084766299
RING ID 0
id  = 192.168.56.91
status  = ring 0 active with no faults
Quorum information
--
Date: Thu Nov 30 19:09:57 2017
Quorum provider:  corosync_votequorum
Nodes:1
Node ID:  1084766299
Ring ID:  412
Quorate:  No

Votequorum information
--
Expected votes:   2
Highest expected: 2
Total votes:  1
Quorum:   1 Activity blocked
Flags:2Node WaitForAll

Membership information
--
Nodeid  Votes Name
1084766299  1 ha1 (local)
ha1:~ #

ha1:~ # crm_mon -1r
Stack: corosync
Current DC: ha1 (version 1.1.16-4.8-77ea74d) - partition WITHOUT quorum
Last updated: Thu Nov 30 19:08:03 2017
Last change: Thu Nov 30 11:05:03 2017 by root via cibadmin on ha1

2 nodes configured
3 resources configured

Online: [ ha1 ]
OFFLINE: [ ha2 ]

Full list of resources:

 stonith-sbd(stonith:external/sbd): Started ha1
 Master/Slave Set: ms_Stateful_1 [rsc_Stateful_1]
 Masters: [ ha1 ]
 Stopped: [ ha2 ]
ha1:~ #



Re: [ClusterLabs] pacemaker self stonith

2017-11-30 Thread Klaus Wenninger
On 11/30/2017 11:41 AM, Hauke Homburg wrote:
> Hello list,
>
> I am looking for a way for a pacemaker node to STONITH itself.
>
> The reason is that I need to check whether the pacemaker node can still reach
> the network outside the local network, because of a network outage.
>
> I can't connect to an iLO interface or the like.

From what I gather, one possibility could be using a ping resource.
By default it would just report the outcome of the ping observation
via a node attribute, but you can configure it to actually fail the
resource if the ping target isn't reachable.
If you then set on-fail=fence for the start & monitor operations, and
provided you have proper fencing in place, that could already
be what you are looking for.
I probably wouldn't do this outside of pacemaker ...
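A rough pcs sketch of that idea (resource name, host_list and intervals are illustrative and need adapting to the environment):

pcs resource create ping-gw ocf:pacemaker:ping host_list=8.8.8.8 \
    dampen=5s multiplier=1000 \
    op start on-fail=fence op monitor interval=10s on-fail=fence --clone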

Regards,
Klaus

>
> I am considering a bash script:
>
> if ! ping -c 1 8.8.8.8 >/dev/null; then $Hardware_reset; fi
>
> run from crontab every minute.
>
> Thanks for the help
>
> Hauke
>
>
>




Re: [ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

2017-11-30 Thread Gao,Yan

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM reboots lightning fast
and is up and running long before timeout expires.

I think I have seen similar report already. Is it something that can
be fixed by SBD/pacemaker tuning?

SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.

Regards,
  Yan



I can provide full logs tomorrow if needed.

TIA

-andrei



[ClusterLabs] pacemaker self stonith

2017-11-30 Thread Hauke Homburg
Hello list,

I am looking for a way for a pacemaker node to STONITH itself.

The reason is that I need to check whether the pacemaker node can still reach
the network outside the local network, because of a network outage.

I can't connect to an iLO interface or the like.

I am considering a bash script:

if ! ping -c 1 8.8.8.8 >/dev/null; then $Hardware_reset; fi

run from crontab every minute.

Thanks for the help

Hauke



-- 
www.w3-creative.de

www.westchat.de

https://friendica.westchat.de/profile/hauke




Re: [ClusterLabs] questions about startup fencing

2017-11-30 Thread Gao,Yan

On 11/30/2017 09:14 AM, Andrei Borzenkov wrote:

On Wed, Nov 29, 2017 at 6:54 PM, Ken Gaillot  wrote:


The same scenario is why a single node can't have quorum at start-up in
a cluster with "two_node" set. Both nodes have to see each other at
least once before they can assume it's safe to do anything.



Unless we set no-quorum-policy=ignore, in which case it will proceed
after fencing another node. As far as I understand, this is the only
way to get the number of active cluster nodes below quorum, right?
To be safe, "two_node: 1" automatically enables "wait_for_all". Of 
course one can explicitly disable "wait_for_all" if they know what they 
are doing.
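For reference, the corresponding corosync.conf fragment usually looks something like this sketch:

quorum {
    provider: corosync_votequorum
    two_node: 1
    # wait_for_all: 1   # implied by two_node; disable only if you know what you are doing
}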


Regards,
  Yan






Re: [ClusterLabs] pcs create master/slave resource doesn't work (Ken Gaillot)

2017-11-30 Thread Hui Xiang
The really weird thing is that the monitor is only called once rather than
repeatedly as expected; where should I check for this?

On Thu, Nov 30, 2017 at 4:14 PM, Hui Xiang  wrote:

> Thanks Ken very much for your helpful information.
>
> I am now blocked: I can't see the Pacemaker DC take any further
> start/promote etc. action on my resource agents, and no helpful logs were found.
>
> So my first question is: in what kind of situation will the DC decide to
> call the start action? Does the monitor operation need to return
> OCF_SUCCESS? In my case it returns OCF_NOT_RUNNING, and the monitor
> operation is not called any more, which seems wrong, as I expected it to
> be called at the configured interval.
>
> The resource agent monitor logic:
> In the xx_monitor function it calls xx_update, and it always hits
> "$CRM_MASTER -D;;". What does that usually mean? Will it stop the start
> operation from being called?
>
> ovsdb_server_master_update() {
> ocf_log info "ovsdb_server_master_update: $1}"
>
> case $1 in
> $OCF_SUCCESS)
> $CRM_MASTER -v ${slave_score};;
> $OCF_RUNNING_MASTER)
> $CRM_MASTER -v ${master_score};;
> #*) $CRM_MASTER -D;;
> esac
> ocf_log info "ovsdb_server_master_update end}"
> }
>
> ovsdb_server_monitor() {
> ocf_log info "ovsdb_server_monitor"
> ovsdb_server_check_status
> rc=$?
>
> ovsdb_server_master_update $rc
> ocf_log info "monitor is going to return $rc"
> return $rc
> }
>
>
> Below is my cluster configuration:
>
> 1. First I have an vip set.
> [root@node-1 ~]# pcs resource show
>  vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
>
> 2. Use pcs to create ovndb-servers and constraint
> [root@node-1 ~]# pcs resource create tst-ovndb ocf:ovn:ovndb-servers
> manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
> sb_master_port=6642 master
>  ([root@node-1 ~]# pcs resource meta tst-ovndb-master notify=true
>   Error: unable to find a resource/clone/master/group:
> tst-ovndb-master) ## returned error, so I changed into below command.
> [root@node-1 ~]# pcs resource master tst-ovndb-master tst-ovndb
> notify=true
> [root@node-1 ~]# pcs constraint colocation add master tst-ovndb-master
> with vip__management_old
>
> 3. pcs status
> [root@node-1 ~]# pcs status
>  vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
>  Master/Slave Set: tst-ovndb-master [tst-ovndb]
>  Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>
> 4. pcs resource show XXX
> [root@node-1 ~]# pcs resource show  vip__management_old
>  Resource: vip__management_old (class=ocf provider=es type=ns_IPaddr2)
>   Attributes: nic=br-mgmt base_veth=br-mgmt-hapr ns_veth=hapr-m
> ip=192.168.0.2 iflabel=ka cidr_netmask=24 ns=haproxy gateway=none
> gateway_metric=0 iptables_start_rules=false iptables_stop_rules=false
> iptables_comment=default-comment
>   Meta Attrs: migration-threshold=3 failure-timeout=60
> resource-stickiness=1
>   Operations: monitor interval=3 timeout=30 (vip__management_old-monitor-3
> )
>   start interval=0 timeout=30 (vip__management_old-start-0)
>   stop interval=0 timeout=30 (vip__management_old-stop-0)
> [root@node-1 ~]# pcs resource show tst-ovndb-master
>  Master: tst-ovndb-master
>   Meta Attrs: notify=true
>   Resource: tst-ovndb (class=ocf provider=ovn type=ovndb-servers)
>Attributes: manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
> sb_master_port=6642
>Operations: start interval=0s timeout=30s (tst-ovndb-start-timeout-30s)
>stop interval=0s timeout=20s (tst-ovndb-stop-timeout-20s)
>promote interval=0s timeout=50s
> (tst-ovndb-promote-timeout-50s)
>demote interval=0s timeout=50s
> (tst-ovndb-demote-timeout-50s)
>monitor interval=30s timeout=20s
> (tst-ovndb-monitor-interval-30s)
>monitor interval=10s role=Master timeout=20s
> (tst-ovndb-monitor-interval-10s-role-Master)
>monitor interval=30s role=Slave timeout=20s
> (tst-ovndb-monitor-interval-30s-role-Slave)
>
>
> colocation colocation-tst-ovndb-master-vip__management_old-INFINITY inf:
> tst-ovndb-master:Master vip__management_old:Started
>
> 5. I have put log in every ovndb-servers op, seems only the monitor op is
> being called, no promoted by the pacemaker DC:
> <30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
> ovsdb_server_monitor
> <30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
> ovsdb_server_check_status
> <30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
> return OCFOCF_NOT_RUNNINGG
> <30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
> ovsdb_server_master_update: 7}
> <30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
> ovsdb_server_master_update end}
> <30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
> monitor is 

Re: [ClusterLabs] building from source

2017-11-30 Thread Tomas Jelinek

Dne 29.11.2017 v 16:03 Ken Gaillot napsal(a):

On Tue, 2017-11-28 at 11:23 -0800, Aaron Cody wrote:

I'm trying to build all of the pacemaker/corosync components from
source instead of using the redhat rpms - I have a few questions.

I'm building on redhat 7.2 and so far I have been able to build:

libqb 1.0.2
pacemaker 1.1.18
corosync 2.4.3
resource-agents 4.0.1

however I have not been able to build pcs yet, i'm getting ruby
errors:

sudo make install_pcsd
which: no python3 in (/sbin:/bin:/usr/sbin:/usr/bin)
make -C pcsd build_gems
make[1]: Entering directory `/home/whacuser/pcs/pcsd'
bundle package
`ruby_22` is not a valid platform. The available options are: [:ruby,
:ruby_18, :ruby_19, :ruby_20, :ruby_21, :mri, :mri_18, :mri_19,
:mri_20, :mri_21, :rbx, :jruby,
:jruby_18, :jruby_19, :mswin, :mingw, :mingw_18, :mingw_19,
:mingw_20, :mingw_21, :x64_mingw, :x64_mingw_20, :x64_mingw_21]
make[1]: *** [get_gems] Error 4
make[1]: Leaving directory `/home/whacuser/pcs/pcsd'
make: *** [install_pcsd] Error 2


Q1: Is this the complete set of components I need to build?


Not considering pcs, yes.


Q2: do I need cluster-glue?


It's only used now to be able to use heartbeat-style fence agents. If
you have what you need in Red Hat's fence agent packages, you don't
need it.


Q3: any idea how I can get past the build error with pcsd?


Modify gemfiles as described at 
https://github.com/ClusterLabs/pcs/issues/139#issuecomment-310630565



Q4: if I use the pcs rpm instead of building pcs from source, I see
an error when my cluster starts up 'unable to get cib'. This didn't
happen when I was using the Red Hat rpms, so I'm wondering what I'm
missing...


You mean you got the error when using the RHEL 7.2 pcs rpm with pacemaker
built from source, while there was no error when you used both the pcs and
pacemaker RHEL 7.2 rpms? When and where did you get this error? It might
simply be a case of trying to access the CIB while pacemaker is still
starting, or it might be some incompatibility between the old pcs and the
new pacemaker. Hard to say with this little information.


Anyway, I agree with Ken regarding upgrading to 7.4 and use RHEL rpms.



thanks


pcs development is closely tied to Red Hat releases, so it's hit-or-
miss mixing and matching pcs and RHEL versions. Upgrading to RHEL 7.4
would get you recent versions of everything, though, so that would be
easiest if it's an option.





[ClusterLabs] Antw: Re: Regression in Filesystem RA

2017-11-30 Thread Ulrich Windl




> Hello,
> 
> Sorry for the late reply; moving data centers tends to keep one busy.
> 
> I looked at the PR and while it works and certainly is an improvement, it
> wouldn't help me in my case much.
> The biggest issue is fuser and its exponential slowdown, and the RA still
> uses it.
> 
> What I did was to recklessly force my crap code into a script:
> ---
> #!/bin/bash
> lsof -n |grep $1 |grep DIR| awk '{print $2}'
> ---

Hi!

I'm not an lsof specialist, but maybe add more options to lsof and you can
get rid of the greps and awk. I mean: lsof examines everything, and you
pick what you need. Maybe just let lsof output what you need.
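Something along these lines, perhaps (untested; "-t" prints bare PIDs and "+f --" should make lsof treat the argument as a mounted filesystem):

#!/bin/bash
# print the PIDs of processes with files open on the given mount point
lsof -n -t +f -- "$1"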

> 
> And call that instead of fuser, as well as removing all kill logging by
> default (determining the number of pids isn't free either).
> 
> With that in place it can deal with 10k processes to kill in less than 10
> seconds.
> 
> Regards,
> 
> Christian
> 
> On Tue, 24 Oct 2017 09:07:50 +0200 Dejan Muhamedagic wrote:
> 
>> On Tue, Oct 24, 2017 at 08:59:17AM +0200, Dejan Muhamedagic wrote:
>> > [...]
>> > I just made a pull request:
>> > 
>> > https://github.com/ClusterLabs/resource-agents/pull/1042  
>> 
>> NB: It is completely untested!
>> 
>> > It would be great if you could test it!
>> > 
>> > Cheers,
>> > 
>> > Dejan
>> >   
>> > > Regards,
>> > > 
>> > > Christian
>> > >   
>> > > > > Maybe we can even come up with a way
>> > > > > to both "pretty print" and kill fast?
>> > > > 
>> > > > My best guess right now is no ;-) But we could log nicely for the
>> > > > usual case of a small number of stray processes ... maybe
>> > > > something like this:
>> > > > 
>> > > >i=""
>> > > >get_pids | tr '\n' ' ' | fold -s |
>> > > >while read procs; do
>> > > >if [ -z "$i" ]; then
>> > > >killnlog $procs
>> > > >i="nolog"
>> > > >else
>> > > >justkill $procs
>> > > >fi
>> > > >done
>> > > > 
>> > > > Cheers,
>> > > > 
>> > > > Dejan
>> > > >   
>> > > > > -- 
>> > > > > : Lars Ellenberg
>> > > > > : LINBIT | Keeping the Digital World Running
>> > > > > : DRBD -- Heartbeat -- Corosync -- Pacemaker
>> > > > > : R, Integration, Ops, Consulting, Support
>> > > > > 
>> > > > > DRBD® and LINBIT® are registered trademarks of LINBIT
>> > > > > 
>> > > > > ___
>> > > > > Users mailing list: Users@clusterlabs.org 
>> > > > > http://lists.clusterlabs.org/mailman/listinfo/users 
>> > > > > 
>> > > > > Project Home: http://www.clusterlabs.org 
>> > > > > Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> > > > > Bugs: http://bugs.clusterlabs.org
>> > > > 
>> > > > ___
>> > > > Users mailing list: Users@clusterlabs.org 
>> > > > http://lists.clusterlabs.org/mailman/listinfo/users 
>> > > > 
>> > > > Project Home: http://www.clusterlabs.org 
>> > > > Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> > > > Bugs: http://bugs.clusterlabs.org 
>> > > >   
>> > > 
>> > > 
>> > > -- 
>> > > Christian BalzerNetwork/Systems Engineer
>> > > ch...@gol.comRakuten Communications  
>> > 
>> > ___
>> > Users mailing list: Users@clusterlabs.org 
>> > http://lists.clusterlabs.org/mailman/listinfo/users 
>> > 
>> > Project Home: http://www.clusterlabs.org 
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf

>> > Bugs: http://bugs.clusterlabs.org  
>> 
>> ___
>> Users mailing list: Users@clusterlabs.org 
>> http://lists.clusterlabs.org/mailman/listinfo/users 
>> 
>> Project Home: http://www.clusterlabs.org 
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> Bugs: http://bugs.clusterlabs.org 
>> 
> 
> 
> -- 
> Christian BalzerNetwork/Systems Engineer
> ch...@gol.com Rakuten Communications
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://lists.clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org



Re: [ClusterLabs] pcs create master/slave resource doesn't work (Ken Gaillot)

2017-11-30 Thread Hui Xiang
Thanks Ken very much for your helpful information.

I am now blocked: I can't see the Pacemaker DC take any further
start/promote etc. action on my resource agents, and no helpful logs were found.

So my first question is: in what kind of situation will the DC decide to
call the start action? Does the monitor operation need to return
OCF_SUCCESS? In my case it returns OCF_NOT_RUNNING, and the monitor
operation is not called any more, which seems wrong, as I expected it to
be called at the configured interval.

The resource agent monitor logic:
In the xx_monitor function it calls xx_update, and it always hits
"$CRM_MASTER -D;;". What does that usually mean? Will it stop the start
operation from being called?

ovsdb_server_master_update() {
ocf_log info "ovsdb_server_master_update: $1}"

case $1 in
$OCF_SUCCESS)
$CRM_MASTER -v ${slave_score};;
$OCF_RUNNING_MASTER)
$CRM_MASTER -v ${master_score};;
#*) $CRM_MASTER -D;;
esac
ocf_log info "ovsdb_server_master_update end}"
}

ovsdb_server_monitor() {
ocf_log info "ovsdb_server_monitor"
ovsdb_server_check_status
rc=$?

ovsdb_server_master_update $rc
ocf_log info "monitor is going to return $rc"
return $rc
}


Below is my cluster configuration:

1. First I have an vip set.
[root@node-1 ~]# pcs resource show
 vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld

2. Use pcs to create ovndb-servers and constraint
[root@node-1 ~]# pcs resource create tst-ovndb ocf:ovn:ovndb-servers
manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
sb_master_port=6642 master
 ([root@node-1 ~]# pcs resource meta tst-ovndb-master notify=true
  Error: unable to find a resource/clone/master/group:
tst-ovndb-master) ## returned error, so I changed into below command.
[root@node-1 ~]# pcs resource master tst-ovndb-master tst-ovndb notify=true
[root@node-1 ~]# pcs constraint colocation add master tst-ovndb-master with
vip__management_old

3. pcs status
[root@node-1 ~]# pcs status
 vip__management_old (ocf::es:ns_IPaddr2): Started node-1.domain.tld
 Master/Slave Set: tst-ovndb-master [tst-ovndb]
 Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]

4. pcs resource show XXX
[root@node-1 ~]# pcs resource show  vip__management_old
 Resource: vip__management_old (class=ocf provider=es type=ns_IPaddr2)
  Attributes: nic=br-mgmt base_veth=br-mgmt-hapr ns_veth=hapr-m
ip=192.168.0.2 iflabel=ka cidr_netmask=24 ns=haproxy gateway=none
gateway_metric=0 iptables_start_rules=false iptables_stop_rules=false
iptables_comment=default-comment
  Meta Attrs: migration-threshold=3 failure-timeout=60
resource-stickiness=1
  Operations: monitor interval=3 timeout=30 (vip__management_old-monitor-3)
  start interval=0 timeout=30 (vip__management_old-start-0)
  stop interval=0 timeout=30 (vip__management_old-stop-0)
[root@node-1 ~]# pcs resource show tst-ovndb-master
 Master: tst-ovndb-master
  Meta Attrs: notify=true
  Resource: tst-ovndb (class=ocf provider=ovn type=ovndb-servers)
   Attributes: manage_northd=yes master_ip=192.168.0.2 nb_master_port=6641
sb_master_port=6642
   Operations: start interval=0s timeout=30s (tst-ovndb-start-timeout-30s)
   stop interval=0s timeout=20s (tst-ovndb-stop-timeout-20s)
   promote interval=0s timeout=50s (tst-ovndb-promote-timeout-
50s)
   demote interval=0s timeout=50s (tst-ovndb-demote-timeout-50s)
   monitor interval=30s timeout=20s (tst-ovndb-monitor-interval-
30s)
   monitor interval=10s role=Master timeout=20s
(tst-ovndb-monitor-interval-10s-role-Master)
   monitor interval=30s role=Slave timeout=20s
(tst-ovndb-monitor-interval-30s-role-Slave)


colocation colocation-tst-ovndb-master-vip__management_old-INFINITY inf:
tst-ovndb-master:Master vip__management_old:Started

5. I have put log in every ovndb-servers op, seems only the monitor op is
being called, no promoted by the pacemaker DC:
<30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_monitor
<30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_check_status
<30>Nov 30 15:22:19 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO: return
OCFOCF_NOT_RUNNINGG
<30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_master_update: 7}
<30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO:
ovsdb_server_master_update end}
<30>Nov 30 15:22:20 node-1 ovndb-servers(tst-ovndb)[2980860]: INFO: monitor
is going to return 7
<30>Nov 30 15:22:20 node-1 ovndb-servers(undef)[2980970]: INFO: metadata
exit OCF_SUCCESS}

6. The cluster property:
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.12-a14efad \
cluster-infrastructure=corosync \
no-quorum-policy=ignore \
stonith-enabled=false \
symmetric-cluster=false \
last-lrm-refresh=1511802933




Re: [ClusterLabs] questions about startup fencing

2017-11-30 Thread Andrei Borzenkov
On Wed, Nov 29, 2017 at 6:54 PM, Ken Gaillot  wrote:
>
> The same scenario is why a single node can't have quorum at start-up in
> a cluster with "two_node" set. Both nodes have to see each other at
> least once before they can assume it's safe to do anything.
>

Unless we set no-quorum-policy=ignore, in which case it will proceed
after fencing another node. As far as I understand, this is the only
way to get the number of active cluster nodes below quorum, right?



Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?

2017-11-30 Thread Andrei Borzenkov
On Thu, Nov 30, 2017 at 12:42 AM, Jan Pokorný  wrote:
> On 29/11/17 22:00 +0100, Jan Pokorný wrote:
>> On 28/11/17 22:35 +0300, Andrei Borzenkov wrote:
>>> On 28.11.2017 13:01, Jan Pokorný wrote:
 On 27/11/17 17:43 +0300, Andrei Borzenkov wrote:
> Sent from my iPhone
>
>> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote:
>>
>> Andrei Borzenkov  writes:
>>
>>> On 25.11.2017 10:05, Andrei Borzenkov wrote:
>>>
 One of the guides suggested killing the corosync process to simulate
 split brain. It actually worked on one cluster, but on another the
 corosync process was restarted after being killed without the cluster
 noticing anything. Except that after several attempts pacemaker died,
 stopping resources ... :)

 This is SLES12 SP2; I do not see any Restart in the service definition,
 so it is probably not systemd.

>>> FTR - it was not corosync, but pacemaker; its unit file specifies
>>> Restart=on-failure, so killing corosync caused pacemaker to fail and be
>>> restarted by systemd.
>>
>> And starting corosync via a Requires dependency?
>
> Exactly.

 From my testing it looks like we should change
 "Requires=corosync.service" to "BindsTo=corosync.service"
 in pacemaker.service.

 Could you give it a try?

>>>
>>> I'm not sure what the expected outcome is, but pacemaker.service is still
>>> restarted (due to Restart=on-failure).
>>
>> Expected outcome is that pacemaker.service will become
>> "inactive (dead)" after killing corosync (as a result of being
>> "bound" by pacemaker).  Have you indeed issued "systemctl
>> daemon-reload" after updating the pacemaker unit file?
>>

Of course. I even rebooted ... :)

ha1:~ # systemctl cat pacemaker.service  | grep corosync
After=corosync.service
BindsTo=corosync.service
# ExecStopPost=/bin/sh -c 'pidof crmd || killall -TERM corosync'
ha1:~ #

Nov 30 10:41:14 ha1 sbd[1743]:cluster:error:
pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Nov 30 10:41:14 ha1 systemd[1]: corosync.service: Main process exited,
code=killed, status=9/KILL
Nov 30 10:41:14 ha1 sbd[1743]:cluster:  warning:
sbd_membership_destroy: Lost connection to corosync
Nov 30 10:41:14 ha1 systemd[1]: pacemaker.service: Main process
exited, code=exited, status=107/n/a
Nov 30 10:41:14 ha1 sbd[1743]:cluster:error:
set_servant_health: Cluster connection terminated
Nov 30 10:41:14 ha1 systemd[1]: Stopped Pacemaker High Availability
Cluster Manager.
Nov 30 10:41:14 ha1 sbd[1743]:cluster:error:
cluster_connect_cpg: Could not connect to the Cluster Process Group
API: 2
Nov 30 10:41:14 ha1 systemd[1]: pacemaker.service: Unit entered failed state.
Nov 30 10:41:14 ha1 sbd[1739]:  warning: inquisitor_child: cluster
health check: UNHEALTHY
Nov 30 10:41:14 ha1 systemd[1]: pacemaker.service: Failed with result
'exit-code'.
...
Nov 30 10:41:14 ha1 systemd[1]: corosync.service: Unit entered failed state.
Nov 30 10:41:14 ha1 systemd[1]: corosync.service: Failed with result 'signal'.
Nov 30 10:41:14 ha1 systemd[1]: pacemaker.service: Service hold-off
time over, scheduling restart.
Nov 30 10:41:14 ha1 systemd[1]: Stopped Pacemaker High Availability
Cluster Manager.
Nov 30 10:41:14 ha1 systemd[1]: Starting Corosync Cluster Engine...

Do you mean you get different results? Do not forget that the only
thing BindsTo does is stop the service if the dependency failed; it does
*not* affect the decision whether to restart the service in any way (at
least not directly).
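For anyone experimenting with this, the BindsTo override is probably best applied as a drop-in rather than by editing the shipped unit file (the path and contents below are only a sketch), followed by "systemctl daemon-reload":

# /etc/systemd/system/pacemaker.service.d/10-bindsto-corosync.conf
[Unit]
After=corosync.service
BindsTo=corosync.service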


>> (FTR, I tried with systemd 235).
>>

Well ... what we have here is a race condition. We have two events -
the *independent* failures of corosync.service and pacemaker.service - and
two (re-)actions - stop pacemaker.service in response to the former (due
to BindsTo) and restart pacemaker.service in response to the latter
(due to Restart=on-failure). The final result depends on the order in
which systemd gets those events and schedules actions (and on the relative
timing in which those actions complete), and this is not deterministic.

Now 235 includes some changes to the restart logic which refuse to
restart if another action (like stop) is currently being scheduled. I am
not sure what happens if the restart is scheduled first, though (such
"implementation details" tend not to be documented in the systemd world).
I have been doing systemd troubleshooting long enough to know that
even if you observe a specific sequence of events, another system may
exhibit a completely different sequence.

Anyway, I will try to install system with 235 on the same platform to
see how it behaves.

>>> If the intention is to unconditionally stop it when corosync dies,
>>> pacemaker should probably exit with a unique code and the unit file
>>> should have RestartPreventExitStatus set to it.
>>
>> That would be an elaborate way to reach the same.
>>

This is the *only* way to reach