Re: [ClusterLabs] What's the number in "Servant pcmk is outdated (age: 682915)"

2022-06-01 Thread Gao,Yan via Users

Hi Ulrich,

On 2022/6/1 7:59, Ulrich Windl wrote:

Hi!

I'm wondering what the number in parentheses is for these messages:
sbd[6809]:  warning: inquisitor_child: pcmk health check: UNHEALTHY
sbd[6809]:  warning: inquisitor_child: Servant pcmk is outdated (age: 682915)


As we know, each sbd watcher daemon (servant) is supposed to report a 
"healthy" status to the sbd inquisitor daemon in a timely manner as long 
as the watched object is fine. Here, the sbd pcmk servant proactively 
reported an "unhealthy" status.


The "age" value from the log message in this case with pcmk/cluster 
servant is indeed confusing though, since internally there's a coding 
trick for this case to intentionally make any previous "healthy" status 
directly aged (outdated). The value itself here is basically the tv_sec 
value from clock_gettime() minus 1, which is not really meaningful for 
users.


Regards,
  Yan



Regards,
Ulrich


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] Failed migration causing fencing loop

2022-05-25 Thread Gao,Yan via Users

Hi Ulrich,

On 2022/3/31 11:18, Gao,Yan via Users wrote:

On 2022/3/31 9:03, Ulrich Windl wrote:

Hi!

I just wanted to point out one thing that hit us with SLES15 SP3:
Some failed live VM migration causing node fencing resulted in a 
fencing loop, because of two reasons:


1) Pacemaker thinks that even _after_ fencing there is some migration 
to "clean up". Pacemaker treats the situation as if the VM is running 
on both nodes, thus (50% chance?) trying to stop the VM on the node 
that just booted after fencing. That's stupid but shouldn't be fatal IF 
there weren't...


2) The stop operation of the VM (that actually isn't running) fails,


AFAICT it could not connect to the hypervisor, but the RA's logic is 
kind of arguable: the probe (monitor) of the VM returned "not 
running", yet the stop right after that returned a failure...


OTOH, the point about pacemaker is that the stop of the resource on the 
fenced and rejoined node is not really necessary. There have been 
discussions about this here and we are trying to figure out a solution 
for it:


https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919


FYI, this issue has been addressed with:
https://github.com/ClusterLabs/pacemaker/pull/2705

Regards,
  Yan



For now it requires an administrator's intervention if the situation happens:
1) Fix the access to the hypervisor before the fenced node rejoins.
2) Manually clean up the resource, which tells pacemaker it can safely 
forget the historical migrate_to failure.


Regards,
   Yan


causing a node fence. So the loop is complete.

Some details (many unrelated messages left out):

Mar 30 16:06:14 h16 libvirtd[13637]: internal error: libxenlight 
failed to restore domain 'v15'


Mar 30 16:06:15 h19 pacemaker-schedulerd[7350]:  warning: Unexpected 
result (error: v15: live migration to h16 failed: 1) was recorded for 
migrate_to of prm_xen_v15 on h18 at Mar 30 16:06:13 2022


Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Unexpected 
result (OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at 
Mar 30 16:13:36 2022
Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Unexpected 
result (OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at 
Mar 30 16:13:36 2022
Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Cluster node 
h18 will be fenced: prm_libvirtd:0 failed there


Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  warning: Unexpected 
result (error: v15: live migration to h18 failed: 1) was recorded for 
migrate_to of prm_xen_v15 on h16 at Mar 29 23:58:40 2022
Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  error: Resource 
prm_xen_v15 is active on 2 nodes (attempting recovery)


Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  notice:  * Restart
prm_xen_v15  ( h18 )


Mar 30 16:19:04 h18 VirtualDomain(prm_xen_v15)[8768]: INFO: Virtual 
domain v15 currently has no state, retrying.
Mar 30 16:19:05 h18 VirtualDomain(prm_xen_v15)[8787]: INFO: Virtual 
domain v15 currently has no state, retrying.
Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8822]: ERROR: Virtual 
domain v15 has no state during stop operation, bailing out.
Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8836]: INFO: Issuing 
forced shutdown (destroy) request for domain v15.
Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8860]: ERROR: forced 
stop failed


Mar 30 16:19:07 h19 pacemaker-controld[7351]:  notice: Transition 124 
action 115 (prm_xen_v15_stop_0 on h18): expected 'ok' but got 'error'


Note: Our cluster nodes start pacemaker during boot. Yesterday I was 
there when the problem happened. But as we had another boot loop some 
time ago I wrote a systemd service that counts boots, and if too many 
happen within a short time, pacemaker will be disabled on that node. 
As it is set now, the counter is reset if the node is up for at least 
15 minutes; if it fails more than 4 times to do so, pacemaker will be 
disabled. If someone wants to try that or give feedback, drop me a 
line, so I could provide the RPM (boot-loop-handler-0.0.5-0.0.noarch)...
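
(For illustration only -- not Ulrich's actual boot-loop-handler RPM, just a 
rough sketch of one way such a guard could be built; all unit names, paths 
and numbers below are made up:)

  # /etc/systemd/system/pacemaker.service.d/boot-loop.conf  (drop-in)
  [Unit]
  ConditionPathExists=!/var/lib/boot-loop-guard/disabled

  # /etc/systemd/system/boot-loop-guard.service
  [Unit]
  Description=Count boots and block pacemaker after repeated boot loops
  Before=pacemaker.service
  [Service]
  Type=oneshot
  ExecStart=/usr/local/sbin/boot-loop-guard
  [Install]
  WantedBy=multi-user.target

  # /usr/local/sbin/boot-loop-guard
  #!/bin/sh
  d=/var/lib/boot-loop-guard
  mkdir -p "$d"
  # increment the boot counter
  n=$(( $(cat "$d/count" 2>/dev/null || echo 0) + 1 ))
  echo "$n" > "$d/count"
  # too many boots in a row: block pacemaker via the drop-in condition above
  [ "$n" -gt 4 ] && touch "$d/disabled"
  # reset the counter once the node has stayed up for 15 minutes
  systemd-run --on-active=15min --unit=boot-loop-reset \
      sh -c "echo 0 > $d/count"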


Regards,
Ulrich



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] More pacemaker oddities while stopping DC

2022-05-25 Thread Gao,Yan via Users

On 2022/5/25 8:10, Ulrich Windl wrote:

Hi!

We are still suffering from kernel RAM corruption on the Xen hypervisor when a 
VM or the hypervisor is doing I/O (three months since the bug report at SUSE, 
but no fix or workaround meaning the whole Xen cluster project was canceled 
after 20 years, but that's a different topic). All VMs will be migrated to 
VMware, dumping the whole SLES15 Xen cluster very soon.

My script that detected RAM corruption tried to shut down pacemaker, hoping for 
the best (i.e. that the VMs would be live-migrated away). However, some very strange 
decisions were made (pacemaker-2.0.5+20201202.ba59be712-150300.4.21.1.x86_64):

May 24 17:05:07 h16 VirtualDomain(prm_xen_test-jeos7)[24460]: INFO: test-jeos7: 
live migration to h19 succeeded.
May 24 17:05:07 h16 VirtualDomain(prm_xen_test-jeos9)[24463]: INFO: test-jeos9: 
live migration to h19 succeeded.
May 24 17:05:07 h16 pacemaker-execd[7504]:  notice: prm_xen_test-jeos7 
migrate_to (call 321, PID 24281) exited with status 0 (execution time 5500ms, 
queue time 0ms)
May 24 17:05:07 h16 pacemaker-controld[7509]:  notice: Result of migrate_to 
operation for prm_xen_test-jeos7 on h16: ok
May 24 17:05:07 h16 pacemaker-execd[7504]:  notice: prm_xen_test-jeos9 
migrate_to (call 323, PID 24283) exited with status 0 (execution time 5514ms, 
queue time 0ms)
May 24 17:05:07 h16 pacemaker-controld[7509]:  notice: Result of migrate_to 
operation for prm_xen_test-jeos9 on h16: ok

Would you agree that the migration was successful? I'd say YES!


Maybe practically yes, given what migrate_to has achieved with the 
VirtualDomain RA, but technically no from pacemaker's point of view.


Following the migrate_to on the source node, a migrate_from operation on 
the target node and a stop operation on the source node are still needed 
before the live-migration counts as successful.
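
To illustrate with the operation-name convention visible in these logs (a 
sketch, not actual output):

  # a complete live-migration as pacemaker tracks it:
  #   prm_xen_test-jeos7_migrate_to_0    on h16 (source)  - succeeded here
  #   prm_xen_test-jeos7_migrate_from_0  on h19 (target)  - never ran
  #   prm_xen_test-jeos7_stop_0          on h16 (source)  - never ran
  # only when all three complete does pacemaker record the migration as done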




However this is what happened:

May 24 17:05:19 h16 pacemaker-controld[7509]:  notice: Transition 2460 
(Complete=16, Pending=0, Fired=0, Skipped=7, Incomplete=57, 
Source=/var/lib/pacemaker/pengine/pe-input-89.bz2): Stopped
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Unexpected result 
(error) was recorded for stop of prm_ping_gw1:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Unexpected result 
(error) was recorded for stop of prm_ping_gw1:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Cluster node h16 will 
be fenced: prm_ping_gw1:1 failed there
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Unexpected result 
(error) was recorded for stop of prm_iotw-md10:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Unexpected result 
(error) was recorded for stop of prm_iotw-md10:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Forcing cln_ping_gw1 
away from h16 after 100 failures (max=100)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Forcing cln_ping_gw1 
away from h16 after 100 failures (max=100)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Forcing cln_ping_gw1 
away from h16 after 100 failures (max=100)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Forcing cln_iotw-md10 
away from h16 after 100 failures (max=100)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Forcing cln_iotw-md10 
away from h16 after 100 failures (max=100)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Forcing cln_iotw-md10 
away from h16 after 100 failures (max=100)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  notice: Resource 
prm_xen_test-jeos7 can no longer migrate from h16 to h19 (will stop on both 
nodes)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  notice: Resource 
prm_xen_test-jeos9 can no longer migrate from h16 to h19 (will stop on both 
nodes)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]:  warning: Scheduling Node h16 
for STONITH

So the DC considers the migration to have failed, even though it was reported 
as success!


A so-called partial live-migration could no longer continue here.

Regards,
  Yan


(The ping had dumped core due to RAM corruption before)

May 24 17:03:12 h16 kernel: ping[23973]: segfault at 213e6 ip 000213e6 
sp 7ffc249fab78 error 14 in bash[5655262bc000+f1000]

So it stopped the VMs that were migrated successfully before:
May 24 17:05:19 h16 pacemaker-controld[7509]:  notice: Initiating stop 
operation prm_xen_test-jeos7_stop_0 on h19
May 24 17:05:19 h16 pacemaker-controld[7509]:  notice: Initiating stop 
operation prm_xen_test-jeos9_stop_0 on h19
May 24 17:05:19 h16 pacemaker-controld[7509]:  notice: Requesting fencing 
(reboot) of node h16

Those test VMs were not important, but the important point is that, due to the 
failure to stop the ping resource, it did not even try to migrate the other 
(non-test) VMs away, so those were hard-fenced.

For completeness I should 

Re: [ClusterLabs] Antw: Instable SLES15 SP3 kernel

2022-04-27 Thread Gao,Yan via Users

Hi Ulrich,

On 2022/4/27 11:13, Ulrich Windl wrote:

Update for the Update:

I had installed SLES Updates in one VM and rebooted it via cluster. While
installing the updates in the VM the Xen host got RAM corruption (it seems any
disk I/O on the host, either locally or via a VM image causes RAM corruption):


I totally understand your frustration about this, but I don't really see 
how the potential kernel issue is relevant to this mailing list.


I believe SUSE support has been working on it and trying to address it, and 
they will update you once there's further progress.


As for the cluster-related topics, please find my comments below.



Apr 27 10:56:44 h19 kernel: pacemaker-execd[39797]: segfault at 3a46 ip
3a46 sp 7ffd1c92e8e8 error 14 in
pacemaker-execd[5565921cc000+b000]

Fortunately that wasn't fatal and my rescue script kicked in before things get
really bad:
Apr 27 11:00:01 h19 reboot-before-panic[40630]: RAM corruption detected,
starting pro-active reboot

All VMs could be live-migrated away before reboot, but this SLES release is
completely unusable!

Regards,
Ulrich




Ulrich Windl wrote on 27.04.2022 at 08:02 in message <6268DC91.C1D:161:60728>:

Hi!

I want to give a non-update on the issue:
The kernel still segfaults random processes, and there is really nothing
from support within two months that could help improve the situation.
The cluster is logging all kinds of non-funny messages like these:

Apr 27 02:20:49 h18 systemd-coredump[22319]: [] Process 22317 (controld)
of user 0 dumped core.
Apr 27 02:20:49 h18 kernel: BUG: Bad rss-counter state mm:246ea08b
idx:1 val:3
Apr 27 02:20:49 h18 kernel: BUG: Bad rss-counter state mm:259b58a0
idx:1 val:7
Apr 27 02:20:49 h18 controld(prm_DLM)[22330]: ERROR: Uncontrolled lockspace



exists, system must reboot. Executing suicide fencing

For a hypervisor host this means that many VMs are reset the hard way!
Other resources weren't stopped properly either, of course.


There are also two NULL-pointer outputs in the messages on the DC:
Apr 27 02:21:06 h16 dlm_stonith[39797]: stonith_api_time: Found 18 entries
for 118/(null): 0 in progress, 17 completed
Apr 27 02:21:06 h16 dlm_stonith[39797]: stonith_api_time: Node 118/(null)
last kicked at: 1650418762

I guess that NULL pointer should have been the host name (h18) in reality.


It's expected to be NULL here. DLM requests fencing through 
pacemaker's stonith API, targeting a node by its corosync nodeid (118 
here), which is what it knows, rather than by the node name. 
Pacemaker will do the interpretation and eventually issue the fencing.




Also it seems h18 fenced itself, and the DC h16, seeing that, wants to fence
again (to make sure, maybe), but there is some odd problem:

Apr 27 02:21:07 h16 pacemaker-controld[7453]:  notice: Requesting fencing
(reboot) of node h18
Apr 27 02:21:07 h16 pacemaker-fenced[7443]:  notice: Client
pacemaker-controld.7453.a9d67c8b wants to fence (reboot) 'h18' with device
'(any)'
Apr 27 02:21:07 h16 pacemaker-fenced[7443]:  notice: Merging stonith action



'reboot' targeting h18 originating from client
pacemaker-controld.7453.73d8bbd6 with identical request from
stonith-api.39797@h16.ea22f429 (360>


This is also as expected when DLM is used. Despite the fencing 
previously requested proactively by DLM, pacemaker also has its own reason 
to issue a fencing targeting the node. The fenced daemon is aware 
that there's already a pending/ongoing fencing targeting the same node, so 
it doesn't really need to issue it once again.




Apr 27 02:22:52 h16 pacemaker-fenced[7443]:  warning: fence_legacy_reboot_1



process (PID 39749) timed out
Apr 27 02:22:52 h16 pacemaker-fenced[7443]:  warning:
fence_legacy_reboot_1[39749] timed out after 12ms
Apr 27 02:22:52 h16 pacemaker-fenced[7443]:  error: Operation 'reboot'
[39749] (call 2 from stonith_admin.controld.22336) for host 'h18' with

device

'prm_stonith_sbd' returned: -62 (Timer expired)


Please make sure:
stonith-timeout > sbd_msgwait + pcmk_delay_max

If that was already the case, then sbd was probably encountering certain 
difficulties writing the poison pill at that time ...
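
To check and adjust that (a sketch; the device path and the timeout value 
are placeholders):

  # read the configured msgwait from the sbd device header
  sbd -d /dev/disk/by-id/<your-sbd-device> dump
  # then make sure pacemaker's stonith-timeout exceeds msgwait + pcmk_delay_max:
  crm configure property stonith-timeout=150s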


Regards,
  Yan



I never saw such a message before. Eventually:

Apr 27 02:24:53 h16 pacemaker-controld[7453]:  notice: Stonith operation
31/1:3347:0:48bafcab-fecf-4ea0-84a8-c31ab1694b3a: OK (0)
Apr 27 02:24:53 h16 pacemaker-controld[7453]:  notice: Peer h18 was
terminated (reboot) by h16 on behalf of pacemaker-controld.7453: OK

The only thing I found out was that the kernel running without Xen does not



show RAM corruption.

Regards,
Ulrich










___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: [EXT] Re: Failed migration causing fencing loop

2022-04-04 Thread Gao,Yan via Users

On 2022/4/4 8:58, Ulrich Windl wrote:

Andrei Borzenkov wrote on 04.04.2022 at 06:39 in message:

On 31.03.2022 14:02, Ulrich Windl wrote:

"Gao,Yan"  schrieb am 31.03.2022 um 11:18 in Nachricht

<67785c2f‑f875‑cb16‑608b‑77d63d9b0...@suse.com>:

On 2022/3/31 9:03, Ulrich Windl wrote:

Hi!

I just wanted to point out one thing that hit us with SLES15 SP3:
Some failed live VM migration causing node fencing resulted in a fencing



loop, because of two reasons:


1) Pacemaker thinks that even _after_ fencing there is some migration to



"clean up". Pacemaker treats the situation as if the VM is running on both



nodes, thus (50% chance?) trying to stop the VM on the node that just

booted



after fencing. That's stupid but shouldn't be fatal IF there weren't...


2) The stop operation of the VM (that actually isn't running) fails,


AFAICT it could not connect to the hypervisor, but the logic in the RA
is kind of arguable that the probe (monitor) of the VM returned "not
running", but the stop right after that returned failure...

OTOH, the point about pacemaker is the stop of the resource on the
fenced and rejoined node is not really necessary. There has been
discussions about this here and we are trying to figure out a solution
for it:

https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919

For now it requires an administrator's intervention if the situation happens:
1) Fix the access to hypervisor before the fenced node rejoins.


Thanks for the explanation!

Unfortunately this can be tricky if libvirtd is involved (as it is here):
libvirtd uses locking (virtlockd), which in turn needs a cluster-wide
filesystem for locks across the nodes.

When that filesystem is provided by the cluster, it's hard to delay node

joining until the filesystem, virtlockd and libvirtd are running.




So do not use a filesystem provided by the same cluster. Use a separate
filesystem mounted outside of the cluster, like a separate highly available NFS.


Hi!

Having a second cluster just to provide VM locking seems like big overkill.
Actually I absolutely regret that I ever followed the advice to use libvirt
and VirtualDomain, as it seems to have no real benefit for Xen and PVMs.
As a matter of fact, after more than 10 years of using Xen PVMs in a cluster, we
will move to VMware, as SLES15 SP3 is the most unstable SLES ever seen (I
started with SLES 8).
SUSE support seems unable to either fix the memory corruption or to provide a
kernel that does not have it (it seems SP2 did not have it).


Sounds like there's a certain kernel issue related to Xen? Perhaps ask 
SUSE support to raise the priority of the ticket?


Regards,
  Yan




Regards,
Ulrich




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/






Re: [ClusterLabs] Failed migration causing fencing loop

2022-03-31 Thread Gao,Yan via Users

On 2022/3/31 9:03, Ulrich Windl wrote:

Hi!

I just wanted to point out one thing that hit us with SLES15 SP3:
Some failed live VM migration causing node fencing resulted in a fencing loop, 
because of two reasons:

1) Pacemaker thinks that even _after_ fencing there is some migration to "clean 
up". Pacemaker treats the situation as if the VM is running on both nodes, thus (50% 
chance?) trying to stop the VM on the node that just booted after fencing. That's stupid 
but shouldn't be fatal IF there weren't...

2) The stop operation of the VM (that actually isn't running) fails,


AFAICT it could not connect to the hypervisor, but the RA's logic is 
kind of arguable: the probe (monitor) of the VM returned "not 
running", yet the stop right after that returned a failure...


OTOH, the point about pacemaker is that the stop of the resource on the 
fenced and rejoined node is not really necessary. There have been 
discussions about this here and we are trying to figure out a solution 
for it:


https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919

For now it requires an administrator's intervention if the situation happens:
1) Fix the access to the hypervisor before the fenced node rejoins.
2) Manually clean up the resource, which tells pacemaker it can safely 
forget the historical migrate_to failure.
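
For example (a sketch, using the resource name from the logs below):

  # tell pacemaker to forget the stale migrate_to failure
  crm resource cleanup prm_xen_v15
  # or, with the lower-level tool:
  crm_resource --cleanup --resource prm_xen_v15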


Regards,
  Yan


causing a node fence. So the loop is complete.

Some details (many unrelated messages left out):

Mar 30 16:06:14 h16 libvirtd[13637]: internal error: libxenlight failed to 
restore domain 'v15'

Mar 30 16:06:15 h19 pacemaker-schedulerd[7350]:  warning: Unexpected result 
(error: v15: live migration to h16 failed: 1) was recorded for migrate_to of 
prm_xen_v15 on h18 at Mar 30 16:06:13 2022

Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Unexpected result 
(OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at Mar 30 16:13:36 
2022
Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Unexpected result 
(OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at Mar 30 16:13:36 
2022
Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Cluster node h18 will 
be fenced: prm_libvirtd:0 failed there

Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  warning: Unexpected result 
(error: v15: live migration to h18 failed: 1) was recorded for migrate_to of 
prm_xen_v15 on h16 at Mar 29 23:58:40 2022
Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  error: Resource prm_xen_v15 is 
active on 2 nodes (attempting recovery)

Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  notice:  * Restart
prm_xen_v15  ( h18 )

Mar 30 16:19:04 h18 VirtualDomain(prm_xen_v15)[8768]: INFO: Virtual domain v15 
currently has no state, retrying.
Mar 30 16:19:05 h18 VirtualDomain(prm_xen_v15)[8787]: INFO: Virtual domain v15 
currently has no state, retrying.
Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8822]: ERROR: Virtual domain v15 
has no state during stop operation, bailing out.
Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8836]: INFO: Issuing forced 
shutdown (destroy) request for domain v15.
Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8860]: ERROR: forced stop failed

Mar 30 16:19:07 h19 pacemaker-controld[7351]:  notice: Transition 124 action 
115 (prm_xen_v15_stop_0 on h18): expected 'ok' but got 'error'

Note: Our cluster nodes start pacemaker during boot. Yesterday I was there when 
the problem happened. But as we had another boot loop some time ago I wrote a 
systemd service that counts boots, and if too many happen within a short time, 
pacemaker will be disabled on that node. As it is set now, the counter is reset 
if the node is up for at least 15 minutes; if it fails more than 4 times to do 
so, pacemaker will be disabled. If someone wants to try that or give feedback, 
drop me a line, so I could provide the RPM 
(boot-loop-handler-0.0.5-0.0.noarch)...

Regards,
Ulrich



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] weird xml snippet in "crm configure show"

2021-02-12 Thread Gao,Yan

Hi,

On 2021/2/12 11:05, Lentes, Bernd wrote:

Hi,

I have problems with a configured alert which does not alert anymore.
I played around with it a bit and changed the configuration several times with 
cibadmin.
Sometimes I had trouble with the admin_epoch, sometimes with the schema.
When I now invoke "crm configure show", at the end I see:

...
rsc_defaults rsc-options: \
 resource-stickiness=200
xml ...   (a raw XML snippet rendered by crmsh; its tags were stripped by 
the mailing list archive)




It seems that crmsh has difficulty parsing the "random" ids of the 
attribute sets here. I guess editing the part with `crm configure edit` 
into something like:


alert smtp_alert "/root/skripte/alert_smtp.sh" \
    attributes email_sender="bernd.len...@helmholtz-muenchen.de" \
    to "informatic@helmholtz-muenchen.de" meta timestamp-format="%D %H:%M"


will do.

Regards,
  Yan




Is that normal ?


Bernd
  



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Q: List resources affected by utilization limits

2021-01-13 Thread Gao,Yan

On 1/13/21 9:14 AM, Ulrich Windl wrote:

Hi!

I had made a test: I had configured RAM requirements for some test VMs together 
with node RAM capacities. Things were running fine.
Then as a test I reduced the RAM capacity of all nodes, and test VMs were 
stopped due to not enough RAM.
Now I wonder: is there a command that can list those resources that couldn't start 
because of "not enough nod capacity"?
Preferrably combined with the utilization attribute that could not be fulfilled?


crm_simulate -LU should give some hints.
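
(Spelled out, as far as I recall the long options; double-check with 
crm_simulate --help:)

  crm_simulate -LU
  # roughly: -L/--live-check reads the live cluster state,
  #          -U/--show-utilization prints node capacity and utilization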

Regards,
  Yan




Regards,
Ulrich



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] "crm verify": ".. stonith-watchdog-timeout is nonzero"

2020-11-26 Thread Gao,Yan

On 11/26/20 8:31 AM, Ulrich Windl wrote:

Hi!

Using SBD, I got this message from crm's top-level "verify":
crm(live/h16)# verify
Current cluster status:
Online: [ h16 h18 h19 ]

  prm_stonith_sbd(stonith:external/sbd):  Started h18
(unpack_config) notice: Watchdog will be used via SBD if fencing is 
required and stonith-watchdog-timeout is nonzero


The message simply tells us what would happen *if* stonith-watchdog-timeout 
were nonzero. It is admittedly confusing that the message constantly appears. 
It has been dropped to the info level and the relevant documentation has been 
improved as of:


https://github.com/ClusterLabs/pacemaker/pull/2142

Regards,
  Yan




Interestingly this message does not change even after this:
crm(live/h16)configure# property stonith-watchdog-timeout=0
crm(live/h16)configure# verify
crm(live/h16)configure# commit

So what's going on? Most notably what's the difference between configure's 
verify and the top-level verify?

I have this:

in  

Regards,
Ulrich



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] Antw: [EXT] Re: Coming in Pacemaker 2.0.4: fencing delay based on what resources are where

2020-03-23 Thread Gao,Yan

On 2020/3/23 14:04, Gao,Yan wrote:


On 2020/3/23 8:00, Ulrich Windl wrote:
Andrei Borzenkov wrote on 21.03.2020 at 18:22 in message
<14318_1584811393_5E764D80_14318_174_1_6ab730d7-8cf0-2c7d-7ae5-8d0ea8402758@gmai.com>:

21.03.2020 20:07, Ken Gaillot wrote:

Hi all,

I am happy to announce a feature that was discussed on this list a
while back. It will be in Pacemaker 2.0.4 (the first release candidate
is expected in about three weeks).

A longstanding concern in two-node clusters is that in a split brain,
one side must get a fencing delay to avoid simultaneous fencing of both
nodes, but there is no perfect way to determine which node gets the
delay.

The most common approach is to configure a static delay on one node.
This is particularly useful in an active/passive setup where one
particular node is normally assigned the active role.

Another approach is to use the relatively new fence_heuristics_ping
agent in a topology with your real fencing agent. A node that can ping
a configured IP will be more likely to survive.

In addition, we now have a new cluster-wide property, priority-fencing-
delay, that bases the delay on what resources were known to be active
where just before the split. If you set the new property, and configure
priorities for your resources, the node with the highest combined
priority of all resources running on it will be more likely to survive.

As an example, if you set a default priority of 1 for all resources,
and set priority-fencing-delay to 15s, then the node running the most
resources will be more likely to survive because the other node will
wait 15 seconds before initiating fencing. If a particular resource is
more important than the rest, you can give it a higher priority.



That sounds good except one consideration. "priority" also affects
resource placement, and changing it may have rather unexpected results,
especially in cases when scores are carefully selected to achieve
resource distribution.


I've always seen priorities as "super ordering" constraints: try to run the
important resources first (whatever their dependencies or scores are).


The fact about priority is: during "calculation", it determines which resources 
the scheduler should "consider" first, so that in cases where there are conflicting 
colocation/anti-colocation constraints, 


I mean conflicting situations with regard to colocation/anti-colocation 
constraints.


Regards,
  Yan


lack of utilization capacity, 
the resources with higher priority will get "decided" first.


So does it affect the order of the resources listed in the output of 
crm_mon? Yes. But it doesn't determine in what order cluster transitions 
actually start the resources. That's what ordering constraints are for.


Regards,
   Yan



[...]

Regards,
Ulrich

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] Antw: [EXT] Re: Coming in Pacemaker 2.0.4: fencing delay based on what resources are where

2020-03-23 Thread Gao,Yan


On 2020/3/23 8:00, Ulrich Windl wrote:

Andrei Borzenkov wrote on 21.03.2020 at 18:22 in message
<14318_1584811393_5E764D80_14318_174_1_6ab730d7-8cf0-2c7d-7ae5-8d0ea8402758@gmai.com>:

21.03.2020 20:07, Ken Gaillot wrote:

Hi all,

I am happy to announce a feature that was discussed on this list a
while back. It will be in Pacemaker 2.0.4 (the first release candidate
is expected in about three weeks).

A longstanding concern in two-node clusters is that in a split brain,
one side must get a fencing delay to avoid simultaneous fencing of both
nodes, but there is no perfect way to determine which node gets the
delay.

The most common approach is to configure a static delay on one node.
This is particularly useful in an active/passive setup where one
particular node is normally assigned the active role.

Another approach is to use the relatively new fence_heuristics_ping
agent in a topology with your real fencing agent. A node that can ping
a configured IP will be more likely to survive.

In addition, we now have a new cluster-wide property, priority-fencing-
delay, that bases the delay on what resources were known to be active
where just before the split. If you set the new property, and configure
priorities for your resources, the node with the highest combined
priority of all resources running on it will be more likely to survive.

As an example, if you set a default priority of 1 for all resources,
and set priority-fencing-delay to 15s, then the node running the most
resources will be more likely to survive because the other node will
wait 15 seconds before initiating fencing. If a particular resource is
more important than the rest, you can give it a higher priority.



That sounds good except one consideration. "priority" also affects
resource placement, and changing it may have rather unexpected results,
especially in cases when scores are carefully selected to achieve
resource distribution.


I've always seen priorities as "super ordering" constraints: try to run the
important resources first (whatever their dependencies or scores are).


The fact about priority is: during "calculation", it determines which resources 
the scheduler should "consider" first, so that in cases where there are conflicting 
colocation/anti-colocation constraints or a lack of utilization capacity, 
the resources with higher priority get "decided" first.


So does it affect the order of the resources listed in the output of 
crm_mon? Yes. But it doesn't determine in what order cluster transitions 
actually start the resources. That's what ordering constraints are for.


Regards,
  Yan



[...]

Regards,
Ulrich

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] Antw: [EXT] Coming in Pacemaker 2.0.4: fencing delay based on what resources are where

2020-03-23 Thread Gao,Yan

On 2020/3/23 7:57, Ulrich Windl wrote:

Ken Gaillot wrote on 21.03.2020 at 18:07 in message
<15250_1584810570_5E764A4A_15250_638_1_c8c5a180d8ad9327dfd1e743d4352556f3fd.a...@redhat.com>:

Hi all,

I am happy to announce a feature that was discussed on this list a
while back. It will be in Pacemaker 2.0.4 (the first release candidate
is expected in about three weeks).

A longstanding concern in two-node clusters is that in a split brain,
one side must get a fencing delay to avoid simultaneous fencing of both
nodes, but there is no perfect way to determine which node gets the
delay.

The most common approach is to configure a static delay on one node.
This is particularly useful in an active/passive setup where one
particular node is normally assigned the active role.


Actually with sbd there could be a more simplistic approach: allocate a
pseudo slot named "DC" or "locker" and then use a SCSI lock mechanism to update
that slot atomically. Only the node that "has the lock" may issue fence
commands. Once the fencing is confirmed, the locker slot is released
(wiped)...


It doesn't sound as simple as directly introducing a delay. What if the 
lock holder itself somehow runs into issues or dies after the fencing is 
issued but before it's confirmed? So the other node would have to 
somehow gain the lock after, well, a "delay" anyway?






Another approach is to use the relatively new fence_heuristics_ping
agent in a topology with your real fencing agent. A node that can ping
a configured IP will be more likely to survive.

In addition, we now have a new cluster-wide property, priority-fencing-
delay, that bases the delay on what resources were known to be active
where just before the split. If you set the new property, and configure
priorities for your resources, the node with the highest combined
priority of all resources running on it will be more likely to survive.


Or combined with a ping-like mechanism: Each node periodically sends an "I'm
alive" message that updates the node's timestamp in CIB status. The node that
was alive last will survive. If it doesn't react within fencing timeout, the
second-newest (in case of two nodes: the other) node may fence and try to form
a cluster.


Why would such an outdated node state matter more than what corosync 
tells us? And the point here is not to pick just "a" node. The point is to 
pick the more "significant" node, the one potentially hosting the more 
significant resources/instances, to help it win the inevitable fencing match 
in case of a split-brain.
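
(For reference, a minimal crmsh sketch of the configuration Ken describes 
above; the resource name is made up and the values are only examples:)

  # give every resource a default priority
  crm configure rsc_defaults priority=1
  # the node with the higher combined priority of running resources
  # gets the advantage in the fencing race
  crm configure property priority-fencing-delay=15s
  # raise the priority of a particularly important resource (hypothetical name)
  crm resource meta prm_important_db set priority 10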


Regards,
  Yan





As an example, if you set a default priority of 1 for all resources,
and set priority-fencing-delay to 15s, then the node running the most
resources will be more likely to survive because the other node will
wait 15 seconds before initiating fencing. If a particular resource is
more important than the rest, you can give it a higher priority.

The master role of promotable clones will get an extra 1 point, if a
priority has been configured for that clone.

If both nodes have equal priority, or fencing is needed for some reason
other than node loss (e.g. on-fail=fencing for some monitor), then the
usual delay properties apply (pcmk_delay_base, etc.).

I'd like to recognize the primary authors of the 2.0.4 features
announced so far:
- shutdown locks: myself
- switch to clock_gettime() for monotonic clock: Jan Pokorný
- crm_mon --include/--exclude: Chris Lumens
- priority-fencing-delay: Gao,Yan
--
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/






Re: [ClusterLabs] Coming in Pacemaker 2.0.4: fencing delay based on what resources are where

2020-03-22 Thread Gao,Yan

On 2020/3/21 18:22, Andrei Borzenkov wrote:

21.03.2020 20:07, Ken Gaillot wrote:

Hi all,

I am happy to announce a feature that was discussed on this list a
while back. It will be in Pacemaker 2.0.4 (the first release candidate
is expected in about three weeks).

A longstanding concern in two-node clusters is that in a split brain,
one side must get a fencing delay to avoid simultaneous fencing of both
nodes, but there is no perfect way to determine which node gets the
delay.

The most common approach is to configure a static delay on one node.
This is particularly useful in an active/passive setup where one
particular node is normally assigned the active role.

Another approach is to use the relatively new fence_heuristics_ping
agent in a topology with your real fencing agent. A node that can ping
a configured IP will be more likely to survive.

In addition, we now have a new cluster-wide property, priority-fencing-
delay, that bases the delay on what resources were known to be active
where just before the split. If you set the new property, and configure
priorities for your resources, the node with the highest combined
priority of all resources running on it will be more likely to survive.

As an example, if you set a default priority of 1 for all resources,
and set priority-fencing-delay to 15s, then the node running the most
resources will be more likely to survive because the other node will
wait 15 seconds before initiating fencing. If a particular resource is
more important than the rest, you can give it a higher priority.



That sounds good except one consideration. "priority" also affects
resource placement, and changing it may have rather unexpected results,
especially in cases when scores are carefully selected to achieve
resource distribution.


Despite the fact that resource location constraints and placement-strategy are 
more intended for that purpose, it's true that resource priority now could imply 
more things, which users might want to think through for either new 
or existing deployments.


Thanks for the thorough introduction, Ken.

Regards,
  Yan




The master role of promotable clones will get an extra 1 point, if a
priority has been configured for that clone.

If both nodes have equal priority, or fencing is needed for some reason
other than node loss (e.g. on-fail=fencing for some monitor), then the
usual delay properties apply (pcmk_delay_base, etc.).

I'd like to recognize the primary authors of the 2.0.4 features
announced so far:
- shutdown locks: myself
- switch to clock_gettime() for monotonic clock: Jan Pokorný
- crm_mon --include/--exclude: Chris Lumens
- priority-fencing-delay: Gao,Yan



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue

2019-02-12 Thread Gao,Yan

On 2/11/19 9:49 AM, Fulong Wang wrote:

Thanks Yan,

You gave me more valuable hints on the SBD operation!
Now, i can see the verbose output after service restart.



Be aware since pacemaker integration (-P) is enabled by default, which
means despite the sbd failure, if the node itself was clean and
"healthy" from pacemaker's point of view and if it's in the cluster
partition with the quorum, it wouldn't self-fence -- meaning a node just
being unable to fence doesn't necessarily need to be fenced.



As described in sbd man page, "this allows sbd to survive temporary
outages of the majority of devices. However, while the cluster is in
such a degraded state, it can neither successfully fence nor be shutdown
cleanly (as taking the cluster below the quorum threshold will
immediately cause all remaining nodes to self-fence). In short, it will
not tolerate any further faults. Please repair the system before
continuing."


Yes, I can see the "pacemaker integration" was enabled in my sbd config 
file by default.
So, you mean in some sbd failure cases, if the node was considered 
"healthy" from pacemaker's point of view, it still wouldn't self-fence.


Honestly speaking, I didn't quite get your point here. I have the 
"no-quorum-policy=ignore" setting in my setup and it's a two-node cluster.
Not directly related to sbd's behavior, but starting from corosync 2, 
with a properly configured "quorum" service in corosync.conf, 
no-quorum-policy=ignore in pacemaker should be avoided, meaning 
pacemaker should follow the quorum decisions made by corosync:


https://www.suse.com/documentation/sle-ha-12/book_sleha/data/sec_ha_config_basics_global.html#sec_ha_config_basics_corosync_2-node
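
(A minimal sketch of the corresponding corosync.conf quorum section for a 
two-node cluster:)

  quorum {
      provider: corosync_votequorum
      two_node: 1
      # two_node: 1 implicitly enables wait_for_all
  }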



Can you show me a sample situation for this?
For example, if a node loses access to the sbd device but every node is 
still "clean" and online, there's no need to fence anyone at that 
point. The node will continue functioning in such a degraded state. 
But of course the administrator needs to fix the sbd issue as soon as possible.
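
(For reference, the relevant sbd knobs live in /etc/sysconfig/sbd; a sketch 
with placeholder values:)

  SBD_DEVICE="/dev/disk/by-id/<your-sbd-disk>"
  SBD_PACEMAKER="yes"          # the pacemaker integration (-P) discussed here
  SBD_WATCHDOG_DEV="/dev/watchdog"
  SBD_OPTS=""                  # extra options, e.g. "-v" for verbose logging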


Be aware that a 2-node cluster is a common but special use case. If 
we lose one node and meanwhile also lose access to sbd, the single 
online node will self-fence even if corosync's votequorum service 
considers it "quorate". This is the safest approach in case it's a 
split-brain. This already works correctly with Klaus's fix regarding 
2-node clusters.


Regards,
  Yan


Many Thanks!!!




Regards
Fulong



--------
*From:* Gao,Yan 
*Sent:* Thursday, January 3, 2019 20:43
*To:* Fulong Wang; Cluster Labs - All topics related to open-source 
clustering welcomed

*Subject:* Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue
On 12/24/18 7:10 AM, Fulong Wang wrote:

Yan, klaus and Everyone,


   Merry Christmas!!!



Many thanks for your advice!
I added the "-v" param in "SBD_OPTS", but didn't see any apparent change 
in the system message log,  am i looking at a wrong place?

Did you restart all cluster services, for example by "crm cluster stop"
and then "crm cluster start"? Basically sbd.service needs to be
restarted. Be aware "systemctl restart pacemaker" only restarts pacemaker.

SBD daemons log into syslog. When a sbd watcher receives a "test"
command, there should be a syslog like this showing up:

"servant: Received command test from ..."

sbd won't actually do anything about a "test" command but logging a message.

If you are not running a late version of sbd (maintenance update) yet, a
single "-v" will make sbd too verbose already. But of course you could
use grep.



By the way, we want to test when the disk access paths (multipath 
devices) lost, the sbd can fence the node automatically.

Be aware since pacemaker integration (-P) is enabled by default, which
means despite the sbd failure, if the node itself was clean and
"healthy" from pacemaker's point of view and if it's in the cluster
partition with the quorum, it wouldn't self-fence -- meaning a node just
being unable to fence doesn't necessarily need to be fenced.

As described in sbd man page, "this allows sbd to survive temporary
outages of the majority of devices. However, while the cluster is in
such a degraded state, it can neither successfully fence nor be shutdown
cleanly (as taking the cluster below the quorum threshold will
immediately cause all remaining nodes to self-fence). In short, it will
not tolerate any further faults.  Please repair the system before
continuing."

Regards,
    Yan



what's your recommendation for this scenario?







The "crm node fence"  did the work.













Regards
Fulong


*From:* Gao,Yan 
*Sent:* Friday, Decembe

Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue

2019-02-12 Thread Gao,Yan

On 2/12/19 3:38 AM, Fulong Wang wrote:

Klaus,

Thanks for the infor!
Did you mean I should compile sbd from the GitHub source myself to include 
the fixes you mentioned?


The corosync, pacemaker and sbd versions in my setup are as below:
corosync:   2.3.6-9.13.1
pacemaker:  1.1.16-6.5.1
sbd:        1.3.1+20180507
I'm pretty sure this sbd version has the fix from Klaus regarding 2-node 
clusters.


Regards,
  Yan





Regards
Fulong

*From:* Klaus Wenninger 
*Sent:* Monday, February 11, 2019 18:51
*To:* Cluster Labs - All topics related to open-source clustering 
welcomed; Fulong Wang; Gao,Yan

*Subject:* Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue
On 02/11/2019 09:49 AM, Fulong Wang wrote:

Thanks Yan,

You gave me more valuable hints on the SBD operation!
Now, i can see the verbose output after service restart.


>Be aware since pacemaker integration (-P) is enabled by default, which
>means despite the sbd failure, if the node itself was clean and
>"healthy" from pacemaker's point of view and if it's in the cluster
>partition with the quorum, it wouldn't self-fence -- meaning a node just
>being unable to fence doesn't necessarily need to be fenced.

>As described in sbd man page, "this allows sbd to survive temporary
>outages of the majority of devices. However, while the cluster is in
>such a degraded state, it can neither successfully fence nor be shutdown
>cleanly (as taking the cluster below the quorum threshold will
>immediately cause all remaining nodes to self-fence). In short, it will
>not tolerate any further faults.  Please repair the system before
>continuing."

Yes, I can see the "pacemaker integration" was enabled in my sbd 
config file by default.
So, you mean in some sbd failure cases, if the node was considered 
"healthy" from pacemaker's point of view, it still wouldn't self-fence.


Honestly speaking, I didn't quite get your point here. I have the 
"no-quorum-policy=ignore" setting in my setup and it's a two-node 
cluster.

Can you show me a sample situation for this?


When using sbd with 2-node clusters and pacemaker integration, you might
check whether
https://github.com/ClusterLabs/sbd/commit/4bd0a66da3ac9c9afaeb8a2468cdd3ed51ad3377
is included in your sbd version.
This is relevant when 2-node is configured in corosync.

Regards,
Klaus



Many Thanks!!!




Regards
Fulong



----
*From:* Gao,Yan  <mailto:y...@suse.com>
*Sent:* Thursday, January 3, 2019 20:43
*To:* Fulong Wang; Cluster Labs - All topics related to open-source 
clustering welcomed

*Subject:* Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue
On 12/24/18 7:10 AM, Fulong Wang wrote:
> Yan, klaus and Everyone,
> 
> 
>   Merry Christmas!!!
> 
> 
> 
> Many thanks for your advice!
> I added the "-v" param in "SBD_OPTS", but didn't see any apparent change 
> in the system message log,  am i looking at a wrong place?

Did you restart all cluster services, for example by "crm cluster stop"
and then "crm cluster start"? Basically sbd.service needs to be
restarted. Be aware "systemctl restart pacemaker" only restarts pacemaker.

SBD daemons log into syslog. When a sbd watcher receives a "test"
command, there should be a syslog like this showing up:

"servant: Received command test from ..."

sbd won't actually do anything about a "test" command but logging a 
message.


If you are not running a late version of sbd (maintenance update) yet, a
single "-v" will make sbd too verbose already. But of course you could
use grep.

> 
> By the way, we want to test when the disk access paths (multipath 
> devices) lost, the sbd can fence the node automatically.

Be aware since pacemaker integration (-P) is enabled by default, which
means despite the sbd failure, if the node itself was clean and
"healthy" from pacemaker's point of view and if it's in the cluster
partition with the quorum, it wouldn't self-fence -- meaning a node just
being unable to fence doesn't necessarily need to be fenced.

As described in sbd man page, "this allows sbd to survive temporary
outages of the majority of devices. However, while the cluster is in
such a degraded state, it can neither successfully fence nor be shutdown
cleanly (as taking the cluster below the quorum threshold will
immediately cause all remaining nodes to self-fence). In short, it will
not tolerate any further faults.  Please repair the system before
continuing."

Regards,
   Yan


> what's your recommendation for this scenario?
> 
> 
> 
> 
> 
> 
> 
> The "crm node fence"  did the work.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>

Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue

2019-01-03 Thread Gao,Yan

On 12/22/18 5:27 AM, Andrei Borzenkov wrote:

21.12.2018 12:09, Klaus Wenninger wrote:

On 12/21/2018 08:15 AM, Fulong Wang wrote:

Hello Experts,

I'm new to this mailing list.
Pls kindly forgive me if this mail has disturbed you!

Our company is recently evaluating the usage of SuSE HAE on the x86
platform.
When simulating the storage disaster fail-over, I finally found that
the SBD communication functioned normally on SuSE11 SP4 but abnormally on
SuSE12 SP3.


I have no experience with SBD on SLES but I know that handling of the
logging verbosity-levels has changed recently in the upstream-repo.
Given that it was done by Yan Gao iirc I'd assume it went into SLES.
So changing the verbosity of the sbd-daemon might get you back
these logs.


Do you mean

commit 2dbdee29736fcbf0fe1d41c306959b22d05f72b0
Author: Gao,Yan 
Date:   Mon Apr 30 18:02:04 2018 +0200

 Log: upgrade important messages and downgrade unimportant ones

?? This commit actually increased severity for message on target node:

@@ -1180,7 +1180,7 @@ int servant(const char *diskname, int mode, const
void* argp)
 }

 if (s_mbox->cmd > 0) {
-   cl_log(LOG_INFO,
+   cl_log(LOG_NOTICE,
"Received command %s from %s on disk %s",
char2cmd(s_mbox->cmd), s_mbox->from,
diskname);

and did not change severity for messages on source node (they are still
INFO).
True. Not sure any of them should be at notice level if everything 
works well... The sbd commands that send messages can be supplied with -v as 
well, of course.


Regards,
  Yan





And of course you can use the list command on the other node
to verify as well.

Klaus


The SBD device was added during the initialization of the first
cluster node.

I have requested help from the SuSE guys, but they haven't given me any
valuable feedback yet!


Below are some screenshots to explain what i have encountered.
~~~

On a SuSE11 SP4 HAE cluster, I ran the sbd test command as below:


Then some information showed up in the local system
message log:



On the second node, we can find that the communication is normal by:



But when I turned to a SuSE12 SP3 HAE cluster and ran the same command as
above:



I didn't get any response in the system message log.


"systemctl status sbd" also doesn't give me any clue on this.



~~

What could be the reason for this abnormal behavior? Are there any
problems with my setup?
Any suggestions are appreciated!

Thanks!


Regards
FuLong


___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org






Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue

2019-01-03 Thread Gao,Yan

On 12/24/18 7:10 AM, Fulong Wang wrote:

Yan, klaus and Everyone,


  Merry Christmas!!!



Many thanks for your advice!
I added the "-v" param in "SBD_OPTS", but didn't see any apparent change 
in the system message log,  am i looking at a wrong place?
Did you restart all cluster services, for example by "crm cluster stop" 
and then "crm cluster start"? Basically sbd.service needs to be 
restarted. Be aware "systemctl restart pacemaker" only restarts pacemaker.


SBD daemons log into syslog. When a sbd watcher receives a "test" 
command, there should be a syslog like this showing up:


"servant: Received command test from ..."

sbd won't actually do anything about a "test" command other than logging a message.

If you are not running a late version of sbd (maintenance update) yet, a 
single "-v" will make sbd too verbose already. But of course you could 
use grep.




By the way, we want to test that when the disk access paths (multipath 
devices) are lost, sbd can fence the node automatically.
Be aware that pacemaker integration (-P) is enabled by default, which 
means that despite the sbd failure, if the node itself is clean and 
"healthy" from pacemaker's point of view and it's in the cluster 
partition with quorum, it won't self-fence -- meaning a node that is merely 
unable to fence doesn't necessarily need to be fenced.


As described in sbd man page, "this allows sbd to survive temporary 
outages of the majority of devices. However, while the cluster is in 
such a degraded state, it can neither successfully fence nor be shutdown 
cleanly (as taking the cluster below the quorum threshold will 
immediately cause all remaining nodes to self-fence). In short, it will 
not tolerate any further faults.  Please repair the system before 
continuing."


Regards,
  Yan



what's your recommendation for this scenario?







The "crm node fence"  did the work.













Regards
Fulong


*From:* Gao,Yan 
*Sent:* Friday, December 21, 2018 20:43
*To:* kwenn...@redhat.com; Cluster Labs - All topics related to 
open-source clustering welcomed; Fulong Wang

*Subject:* Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue
First thanks for your reply, Klaus!

On 2018/12/21 10:09, Klaus Wenninger wrote:

On 12/21/2018 08:15 AM, Fulong Wang wrote:

Hello Experts,

I'm New to this mail lists.
Pls kindlyforgive me if this mail has disturb you!

Our Company recently is evaluating the usage of the SuSE HAE on x86 
platform.
Wen simulating the storage disaster fail-over, i finally found that 
the SBD communication functioned normal on SuSE11 SP4 but abnormal on 
SuSE12 SP3.


I have no experience with SBD on SLES but I know that handling of the
logging verbosity-levels has changed recently in the upstream-repo.
Given that it was done by Yan Gao iirc I'd assume it went into SLES.
So changing the verbosity of the sbd-daemon might get you back
these logs.

Yes, I think it's the issue. Could you please retrieve the latest
maintenance update for SLE12SP3 and try? Otherwise of course you could
temporarily enable verbose/debug logging by adding a couple of "-v" into
   "SBD_OPTS" in /etc/sysconfig/sbd.

But frankly, it makes more sense to manually trigger fencing for example
by "crm node fence" and see if it indeed works correctly.


And of course you can use the list command on the other node
to verify as well.

The "test" message in the slot might get overwritten soon by a "clear"
if the sbd daemon is running.

Regards,
    Yan




Klaus

The SBD device was added during the initialization of the first 
cluster node.


I have requested help from SuSE guys, but they didn't give me any 
valuable feedback yet now!



Below are some screenshots to explain what i have encountered.
~~~

on a SuSE11 SP4 HAE cluster,  i  run the sbd test command as below:


then there will be some information showed up in the local system 
message log




on the second node,  we can found that the communication is normal by



but when i turn to a SuSE12 SP3 HAE cluster,  ran the same command as 
above:




I didn't get any  response in the system message log.


"systemctl status sbd" also doesn't give me any clue on this.



~~

What could be the reason for this abnormal behavior?  Is there any 
problems with my setup?

Any suggestions are appreciated!

Thanks!


Regards
FuLong


___
Users mailing list:Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home:http://www.cluste

Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue

2018-12-21 Thread Gao,Yan

First thanks for your reply, Klaus!

On 2018/12/21 10:09, Klaus Wenninger wrote:

On 12/21/2018 08:15 AM, Fulong Wang wrote:

Hello Experts,

I'm new to this mailing list.
Please kindly forgive me if this mail has disturbed you!

Our company is currently evaluating SuSE HAE on the x86 platform.
When simulating a storage disaster fail-over, I found that the SBD 
communication functioned normally on SuSE11 SP4 but abnormally on 
SuSE12 SP3.


I have no experience with SBD on SLES but I know that handling of the
logging verbosity-levels has changed recently in the upstream-repo.
Given that it was done by Yan Gao iirc I'd assume it went into SLES.
So changing the verbosity of the sbd-daemon might get you back
these logs.
Yes, I think it's the issue. Could you please retrieve the latest 
maintenance update for SLE12SP3 and try? Otherwise of course you could 
temporarily enable verbose/debug logging by adding a couple of "-v" into 
 "SBD_OPTS" in /etc/sysconfig/sbd.


But frankly, it makes more sense to manually trigger fencing for example 
by "crm node fence" and see if it indeed works correctly.



And of course you can use the list command on the other node
to verify as well.
The "test" message in the slot might get overwritten soon by a "clear" 
if the sbd daemon is running.
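
As a concrete sequence (a sketch -- replace the device path and the node 
name with your own):

    # trigger a real fencing action through the cluster stack
    crm node fence <peer-node>

    # or exercise the disk path directly with a harmless test message
    sbd -d /dev/disk/by-id/<your-sbd-disk> message <peer-node> test
    sbd -d /dev/disk/by-id/<your-sbd-disk> list

If sbd is running on the peer, the test message is only logged there and the 
slot may already show "clear" again by the time you run "list".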


Regards,
  Yan




Klaus

The SBD device was added during the initialization of the first 
cluster node.


I have requested help from the SuSE guys, but they haven't given me any 
valuable feedback yet!



Below are some screenshots to explain what i have encountered.
~~~

On a SuSE11 SP4 HAE cluster, I ran the sbd test command as below:


Then some information showed up in the local system 
message log:




On the second node, we can see that the communication is normal by:



But when I turned to a SuSE12 SP3 HAE cluster and ran the same command as 
above:




I didn't get any  response in the system message log.


"systemctl status sbd" also doesn't give me any clue on this.



~~

What could be the reason for this abnormal behavior? Are there any 
problems with my setup?

Any suggestions are appreciated!

Thanks!


Regards
FuLong


___
Users mailing list:Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home:http://www.clusterlabs.org
Getting started:http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs:http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Wrong sbd.service dependencies

2017-12-17 Thread Gao,Yan

On 2017/12/16 16:59, Andrei Borzenkov wrote:

04.12.2017 21:55, Andrei Borzenkov wrote:
...


I tried it (on openSUSE Tumbleweed which is what I have at hand, it has
SBD 1.3.0) and with SBD_DELAY_START=yes sbd does not appear to watch
disk at all.

It simply waits that long on startup before starting the rest of the
cluster stack to make sure the fencing that targeted it has returned. It
intentionally doesn't watch anything during this period of time.



Unfortunately it waits too long.

ha1:~ # systemctl status sbd.service
● sbd.service - Shared-storage based fencing daemon
Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor
preset: disabled)
Active: failed (Result: timeout) since Mon 2017-12-04 21:47:03 MSK;
4min 16s ago
   Process: 1861 ExecStop=/usr/bin/kill -TERM $MAINPID (code=exited,
status=0/SUCCESS)
   Process: 2058 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid
watch (code=killed, signa
  Main PID: 1792 (code=exited, status=0/SUCCESS)

Dec 04 21:45:32 ha1 systemd[1]: Starting Shared-storage based fencing
daemon...
Dec 04 21:47:02 ha1 systemd[1]: sbd.service: Start operation timed out.
Terminating.
Dec 04 21:47:03 ha1 systemd[1]: Failed to start Shared-storage based
fencing daemon.
Dec 04 21:47:03 ha1 systemd[1]: sbd.service: Unit entered failed state.
Dec 04 21:47:03 ha1 systemd[1]: sbd.service: Failed with result 'timeout'.

But the real problem is - in spite of SBD failing to start, the whole
cluster stack continues to run; and because SBD blindly trusts in
well-behaving nodes, fencing appears to succeed after the timeout ... without
anyone taking any action on the poison pill ...



That's an sbd bug. It declares itself as RequiredBy=corosync.service but
puts itself Before=pacemaker.service. Due to systemd's design, service A
*MUST* have a Before dependency on service B if failure to start A should
cause failure to start B. *Or* use BindsTo ... but that sounds wrong
because it would cause B to start briefly and then be killed.

So the question is what is intended here. Should sbd.service be a
prerequisite for corosync or for pacemaker? 

It should be so only if it's enabled. Try this:
https://github.com/ClusterLabs/sbd/pull/39

Thanks to Klaus, btw.

Regards,
  Yan


Should failure to start SBD be
fatal for startup of dependent service? Finally does sbd need explicit
dependency on pacemaker.service at all (in addition to corosync.service)?

Adding Before dependency fixes startup logic for me.

ha1:~ # systemctl start pacemaker.service
A dependency job for pacemaker.service failed. See 'journalctl -xe' for
details.
ha1:~ # systemctl -l --no-pager status pacemaker.service
● pacemaker.service - Pacemaker High Availability Cluster Manager
Loaded: loaded (/etc/systemd/system/pacemaker.service; disabled;
vendor preset: disabled)
Active: inactive (dead)
  Docs: man:pacemakerd

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html

Dec 16 18:56:06 ha1 systemd[1]: Dependency failed for Pacemaker High
Availability Cluster Manager.
Dec 16 18:56:06 ha1 systemd[1]: pacemaker.service: Job
pacemaker.service/start failed with result 'dependency'.
ha1:~ # systemctl -l --no-pager status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; static;
vendor preset: disabled)
Active: inactive (dead)
  Docs: man:corosync
man:corosync.conf
man:corosync_overview

Dec 16 18:56:06 ha1 systemd[1]: Dependency failed for Corosync Cluster
Engine.
Dec 16 18:56:06 ha1 systemd[1]: corosync.service: Job
corosync.service/start failed with result 'dependency'.
ha1:~ # systemctl -l --no-pager status sbd.service
● sbd.service - Shared-storage based fencing daemon
Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor
preset: disabled)
   Drop-In: /etc/systemd/system/sbd.service.d
└─before-corosync.conf
Active: failed (Result: timeout) since Sat 2017-12-16 18:56:06 MSK;
50s ago
   Process: 3675 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid
watch (code=killed, signal=TERM)

Dec 16 18:54:36 ha1 systemd[1]: Starting Shared-storage based fencing
daemon...
Dec 16 18:56:06 ha1 systemd[1]: sbd.service: Start operation timed out.
Terminating.
Dec 16 18:56:06 ha1 systemd[1]: Failed to start Shared-storage based
fencing daemon.
Dec 16 18:56:06 ha1 systemd[1]: sbd.service: Unit entered failed state.
Dec 16 18:56:06 ha1 systemd[1]: sbd.service: Failed with result 'timeout'.
ha1:~ # cat /etc/systemd/system/sbd.service.d/before-corosync.conf
[Unit]
Before=corosync.service
ha1:~ #

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: 

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-12-05 Thread Gao,Yan

On 12/05/2017 03:11 PM, Ulrich Windl wrote:




"Gao,Yan" <y...@suse.com> schrieb am 05.12.2017 um 15:04 in Nachricht

<f3433dca-d654-0eac-80d6-2f92aeb3e...@suse.com>:

On 12/05/2017 12:41 PM, Ulrich Windl wrote:




"Gao,Yan" <y...@suse.com> schrieb am 01.12.2017 um 20:36 in Nachricht

<e49f3c0a-6981-3ab4-a0b0-1e5f49f34...@suse.com>:


[...]


I meant: There are three delays:
1) The delay until data is on the disk

It takes several IOs for the sender to do this -- read the device
header, look up the slot, write the message and verify the message is
written (timeout_io defaults to 3s).

As mentioned, the msgwait timer of the sender starts only after the message
has been verified to be written. We just need to make sure stonith-timeout
is configured to be sufficiently longer than the sum.
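
As a rough, purely illustrative example: with timeout_watchdog=60s,
msgwait=120s and timeout_io=3s, stonith-timeout just needs some headroom
above 120s plus a few seconds of IO, e.g.:

    crm configure property stonith-timeout=150s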


2) Delay until data is read from the disk

It's already taken into account with msgwait. Considering the recipient
keeps reading in a loop, we don't know when exactly it starts to read
for this specific message. But once it starts a read, it has to be
done within timeout_watchdog, otherwise the watchdog triggers. So even in a
bad case, the message should be read within 2 * timeout_watchdog. That's
the reason why the sender has to wait msgwait, which is 2 *
timeout_watchdog.


3) Delay until Host was killed

Kill is basically immediately triggered once poison pill is read.


Considering that the response time of a SAN disk system with cache is typically a very 
few microseconds, writing to disk may be even "more immediate" than killing the 
node via watchdog reset ;-)
Well, it's possible :) Timeout matters for "bad cases" though. Compared 
with a disk io facing difficulties like path failure and so on, 
triggering watchdog is trivial.



So you can't easily say one is immediate, while the other has to be waited for 
IMHO.
Of course an even longer msgwait, with all the factors that you can think 
of taken into account, will be even safer.


Regards,
  Yan



Regards,
Ulrich




A confirmation before 3) could shorten the total wait that includes 2) and 3), right?

As mentioned in another email, an alive node, even one that indeed came back
from death, cannot actually confirm, or even give any indication of, whether
it was ever dead. And a successful fencing means the node is dead.

Regards,
Yan




Regards,
Ulrich




Regards,
 Yan


[...]


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-12-05 Thread Gao,Yan

On 12/05/2017 12:41 PM, Ulrich Windl wrote:




"Gao,Yan" <y...@suse.com> schrieb am 01.12.2017 um 20:36 in Nachricht

<e49f3c0a-6981-3ab4-a0b0-1e5f49f34...@suse.com>:

On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:

30.11.2017 16:11, Klaus Wenninger wrote:

On 11/30/2017 01:41 PM, Ulrich Windl wrote:



"Gao,Yan" <y...@suse.com> schrieb am 30.11.2017 um 11:48 in Nachricht

<e71afccc-06e3-97dd-c66a-1b4bac550...@suse.com>:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down

Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

As msgwait was intended for the message to arrive, and not for the reboot
time (I guess), this just shows a fundamental problem in SBD design: Receipt
of the fencing command is not confirmed (other than by seeing the
consequences of its execution).


The 2 x msgwait is not for confirmations but for writing the poison-pill
and for
having it read by the target-side.


Yes, of course, but that's not what Ulrich likely intended to say.
msgwait must account for worst case storage path latency, while in
normal cases it happens much faster. If fenced node could acknowledge
having been killed after reboot, stonith agent could return success much
earlier.

How could an alive man be sure he died before? ;)


I meant: There are three delays:
1) The delay until data is on the disk
It takes several IOs for the sender to do this -- read the device 
header, look up the slot, write the message and verify the message is 
written (timeout_io defaults to 3s).


As mentioned, the msgwait timer of the sender starts only after the message 
has been verified to be written. We just need to make sure stonith-timeout 
is configured to be sufficiently longer than the sum.



2) Delay until data is read from the disk
It's already taken into account with msgwait. Considering the recipient 
keeps reading in a loop, we don't know when exactly it starts to read 
for this specific message. But once it starts a read, it has to be 
done within timeout_watchdog, otherwise the watchdog triggers. So even in a 
bad case, the message should be read within 2 * timeout_watchdog. That's 
the reason why the sender has to wait msgwait, which is 2 * 
timeout_watchdog.



3) Delay until Host was killed

Kill is basically immediately triggered once poison pill is read.


A confirmation before 3) could shorten the total wait that includes 2) and 3),
right?
As mentioned in another email, an alive node, even one that indeed came back 
from death, cannot actually confirm, or even give any indication of, whether 
it was ever dead. And a successful fencing means the node is dead.


Regards,
  Yan




Regards,
Ulrich




Regards,
Yan



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

2017-12-05 Thread Gao,Yan

On 12/05/2017 08:57 AM, Dejan Muhamedagic wrote:

On Mon, Dec 04, 2017 at 09:55:46PM +0300, Andrei Borzenkov wrote:

04.12.2017 14:48, Gao,Yan wrote:

On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:

30.11.2017 13:48, Gao,Yan wrote:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down
Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

I think I have seen similar report already. Is it something that can
be fixed by SBD/pacemaker tuning?

SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.



I tried it (on openSUSE Tumbleweed which is what I have at hand, it has
SBD 1.3.0) and with SBD_DELAY_START=yes sbd does not appear to watch
disk at all.

It simply waits that long on startup before starting the rest of the
cluster stack to make sure the fencing that targeted it has returned. It
intentionally doesn't watch anything during this period of time.



Unfortunately it waits too long.

ha1:~ # systemctl status sbd.service
● sbd.service - Shared-storage based fencing daemon
Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor
preset: disabled)
Active: failed (Result: timeout) since Mon 2017-12-04 21:47:03 MSK;
4min 16s ago
   Process: 1861 ExecStop=/usr/bin/kill -TERM $MAINPID (code=exited,
status=0/SUCCESS)
   Process: 2058 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid
watch (code=killed, signa
  Main PID: 1792 (code=exited, status=0/SUCCESS)

Dec 04 21:45:32 ha1 systemd[1]: Starting Shared-storage based fencing
daemon...
Dec 04 21:47:02 ha1 systemd[1]: sbd.service: Start operation timed out.
Terminating.
Dec 04 21:47:03 ha1 systemd[1]: Failed to start Shared-storage based
fencing daemon.
Dec 04 21:47:03 ha1 systemd[1]: sbd.service: Unit entered failed state.
Dec 04 21:47:03 ha1 systemd[1]: sbd.service: Failed with result 'timeout'.

But the real problem is - in spite of SBD failing to start, the whole
cluster stack continues to run; and because SBD blindly trusts in
well-behaving nodes, fencing appears to succeed after the timeout ... without
anyone taking any action on the poison pill ...


That's something I always wondered about: if a node is capable of
reading a poison pill then it could before shutdown also write an
"I'm leaving" message into its slot. Wouldn't that make sbd more
reliable? Any reason not to implement that?
Probably it's not considered necessary :) SBD is a fencing mechanism 
which only needs to ensure fencing works. SBD on the fencing target is 
either there eating the pill or getting reset by the watchdog; otherwise 
it's not there at all, which is supposed to imply that the whole cluster 
stack is not running, so the pill doesn't actually need to be eaten.


How systemd should handle the service dependencies is another topic...

Regards,
  Yan





Thanks,

Dejan

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

2017-12-05 Thread Gao,Yan

On 12/04/2017 07:55 PM, Andrei Borzenkov wrote:

04.12.2017 14:48, Gao,Yan wrote:

On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:

30.11.2017 13:48, Gao,Yan wrote:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down
Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

I think I have seen similar report already. Is it something that can
be fixed by SBD/pacemaker tuning?

SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.



I tried it (on openSUSE Tumbleweed which is what I have at hand, it has
SBD 1.3.0) and with SBD_DELAY_START=yes sbd does not appear to watch
disk at all.

It simply waits that long on startup before starting the rest of the
cluster stack to make sure the fencing that targeted it has returned. It
intentionally doesn't watch anything during this period of time.



Unfortunately it waits too long.

ha1:~ # systemctl status sbd.service
● sbd.service - Shared-storage based fencing daemon
Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor
preset: disabled)
Active: failed (Result: timeout) since Mon 2017-12-04 21:47:03 MSK;
4min 16s ago
   Process: 1861 ExecStop=/usr/bin/kill -TERM $MAINPID (code=exited,
status=0/SUCCESS)
   Process: 2058 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid
watch (code=killed, signa
  Main PID: 1792 (code=exited, status=0/SUCCESS)

Dec 04 21:45:32 ha1 systemd[1]: Starting Shared-storage based fencing
daemon...
Dec 04 21:47:02 ha1 systemd[1]: sbd.service: Start operation timed out.
Terminating.
Dec 04 21:47:03 ha1 systemd[1]: Failed to start Shared-storage based
fencing daemon.
Dec 04 21:47:03 ha1 systemd[1]: sbd.service: Unit entered failed state.
Dec 04 21:47:03 ha1 systemd[1]: sbd.service: Failed with result 'timeout'.

But the real problem is - in spite of SBD failing to start, the whole
cluster stack continues to run; and because SBD blindly trusts in
well-behaving nodes, fencing appears to succeed after the timeout ... without
anyone taking any action on the poison pill ...
The start of sbd hits systemd's timeout for starting units, and systemd 
simply proceeds...


TimeoutStartSec should be configured in sbd.service accordingly to be 
longer than msgwait.
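
For example, with msgwait=120s a drop-in along these lines would do (a 
sketch; the value only needs to exceed msgwait plus some margin):

    # /etc/systemd/system/sbd.service.d/timeout.conf
    [Service]
    TimeoutStartSec=180

    # then reload systemd
    systemctl daemon-reload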


Regards,
  Yan




ha1:~ # systemctl show sbd.service -p RequiredBy
RequiredBy=corosync.service

but

ha1:~ # systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; static;
vendor preset: disabled)
Active: active (running) since Mon 2017-12-04 21:45:33 MSK; 7min ago
  Docs: man:corosync
man:corosync.conf
man:corosync_overview
   Process: 1860 ExecStop=/usr/share/corosync/corosync stop (code=exited,
status=0/SUCCESS)
   Process: 2059 ExecStart=/usr/share/corosync/corosync start
(code=exited, status=0/SUCCESS)
  Main PID: 2073 (corosync)
 Tasks: 2 (limit: 4915)
CGroup: /system.slice/corosync.service
└─2073 corosync

and

ha1:~ # crm_mon -1r
Stack: corosync
Current DC: ha1 (version 1.1.17-3.3-36d2962a8) - partition with quorum
Last updated: Mon Dec  4 21:53:24 2017
Last change: Mon Dec  4 21:47:25 2017 by hacluster via crmd on ha1

2 nodes configured
1 resource configured

Online: [ ha1 ha2 ]

Full list of resources:

  stonith-sbd   (stonith:external/sbd): Started ha1

and if I now sever connection between two nodes I will get two single
node clusters each believing it won ...

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-12-04 Thread Gao,Yan

On 12/02/2017 08:30 AM, Andrei Borzenkov wrote:

01.12.2017 22:36, Gao,Yan wrote:

On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:

30.11.2017 16:11, Klaus Wenninger wrote:

On 11/30/2017 01:41 PM, Ulrich Windl wrote:



"Gao,Yan" <y...@suse.com> schrieb am 30.11.2017 um 11:48 in
Nachricht

<e71afccc-06e3-97dd-c66a-1b4bac550...@suse.com>:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down

Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

As msgwait was intended for the message to arrive, and not for the
reboot time (I guess), this just shows a fundamental problem in SBD
design: Receipt of the fencing command is not confirmed (other than
by seeing the consequences of its execution).


The 2 x msgwait is not for confirmations but for writing the poison-pill
and for
having it read by the target-side.


Yes, of course, but that's not what Ulrich likely intended to say.
msgwait must account for worst case storage path latency, while in
normal cases it happens much faster. If fenced node could acknowledge
having been killed after reboot, stonith agent could return success much
earlier.

How could an alive man be sure he died before? ;)



It does not need to. It simply needs to write something on startup to
indicate it is back.
It does that. The thing is the sender cannot just assume that the target 
was ever gone based on that.


And it doesn't make sense for a fencing operation to return success when the 
target appears to be alive. If the sender kept watching the slot, it would 
probably make more sense to let the fencing return failure and try it again.


Regards,
  Yan



Actually, the fenced side already does it - it clears the pending message
when sbd is started. It is the fencing side that simply unconditionally
sleeps for msgwait:

 if (mbox_write_verify(st, mbox, s_mbox) < -1) {
 rc = -1; goto out;
 }
 if (strcasecmp(cmd, "exit") != 0) {
 cl_log(LOG_INFO, "Messaging delay: %d",
 (int)timeout_msgwait);
 sleep(timeout_msgwait);
 }

What if we do not sleep but rather periodically check slot for
acknowledgement for msgwait timeout? Then we could return earlier.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

2017-12-04 Thread Gao,Yan

On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:

30.11.2017 13:48, Gao,Yan wrote:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down
Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

I think I have seen similar report already. Is it something that can
be fixed by SBD/pacemaker tuning?

SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.



I tried it (on openSUSE Tumbleweed which is what I have at hand, it has
SBD 1.3.0) and with SBD_DELAY_START=yes sbd does not appear to watch
disk at all. 
It simply waits that long on startup before starting the rest of the 
cluster stack to make sure the fencing that targeted it has returned. It 
intentionally doesn't watch anything during this period of time.
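
In sysconfig terms this is simply (a sketch):

    # /etc/sysconfig/sbd (excerpt)
    SBD_DELAY_START=yes    # delay the start of the cluster stack on boot so
                           # that a node fenced right before its reboot cannot
                           # rejoin before the fencer's msgwait has expired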


Regards,
  Yan



First, at startup no slot is allocated for a node at all
(confirmed with "sbd list"). I manually allocated slots for both nodes,
then I see that stonith agent does post "reboot" message (confirmed with
"sbd list" again) and sbd never reacts to it. Even after system reboot
message on disk is not cleared.

Removing SBD_DELAY_START and restarting pacemaker (with implicit SBD
restart) immediately cleared pending messages.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-12-01 Thread Gao,Yan

On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:

30.11.2017 16:11, Klaus Wenninger wrote:

On 11/30/2017 01:41 PM, Ulrich Windl wrote:



"Gao,Yan" <y...@suse.com> schrieb am 30.11.2017 um 11:48 in Nachricht

<e71afccc-06e3-97dd-c66a-1b4bac550...@suse.com>:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down

Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

As msgwait was intended for the message to arrive, and not for the reboot time 
(I guess), this just shows a fundamental problem in SBD design: Receipt of the 
fencing command is not confirmed (other than by seeing the consequences of 
its execution).


The 2 x msgwait is not for confirmations but for writing the poison-pill
and for
having it read by the target-side.


Yes, of course, but that's not what Ulrich likely intended to say.
msgwait must account for worst case storage path latency, while in
normal cases it happens much faster. If fenced node could acknowledge
having been killed after reboot, stonith agent could return success much
earlier.

How could an alive man be sure he died before? ;)

Regards,
  Yan



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

2017-12-01 Thread Gao,Yan

On 11/30/2017 01:41 PM, Ulrich Windl wrote:




"Gao,Yan" <y...@suse.com> schrieb am 30.11.2017 um 11:48 in Nachricht

<e71afccc-06e3-97dd-c66a-1b4bac550...@suse.com>:

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down

Pacemaker


SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.


As msgwait was intended for the message to arrive, and not for the reboot time (I guess), 
The msgwait timer on the sender starts only after a successful write. 
The recipient will either eat the pill or get killed by the watchdog within 
the watchdog timeout. As mentioned in the sbd man page, msgwait should be 
twice the watchdog timeout, so that the sender can safely assume the target 
is dead when the msgwait timer pops.
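
For example, the two values can be set explicitly when initializing the 
device (a sketch; the device path is a placeholder and the numbers are only 
illustrative):

    sbd -d /dev/disk/by-id/<your-sbd-disk> -1 60 -4 120 create   # -1 watchdog, -4 msgwait
    sbd -d /dev/disk/by-id/<your-sbd-disk> dump                  # verify the stored timeouts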


Regards,
  Yan



this just shows a fundamental problem in SBD design: Receipt of the fencing 
command is not confirmed (other than by seeing the consequences of its 
execution).

So the fencing node will see the other host is down (on the network), but it 
won't believe it until SBD msgwait is over. OTOH if your msgwait is very low, 
and the storage has a problem (exceeding msgwait), the node will assume a 
successful fencing when in fact it didn't complete.

So maybe there should be two timeouts: One for the command to be delivered 
(without needing a confirmation, but the confirmation could shorten the wait), 
and another for executing the command (how long will it take from receipt of 
the command until the host is definitely down). Again a confirmation could stop 
waiting before the timeout is reached.

Regards,
Ulrich




I think I have seen similar report already. Is it something that can
be fixed by SBD/pacemaker tuning?

SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.

Regards,
Yan



I can provide full logs tomorrow if needed.

TIA

-andrei

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

2017-11-30 Thread Gao,Yan

On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:

SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot

Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
just fenced by sapprod01p for sapprod01p
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
process (3151) can no longer be respawned,
Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down Pacemaker

SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
stonith with SBD always takes msgwait (at least, visually host is not
declared as OFFLINE until 120s passed). But VM rebots lightning fast
and is up and running long before timeout expires.

I think I have seen similar report already. Is it something that can
be fixed by SBD/pacemaker tuning?

SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.

Regards,
  Yan



I can provide full logs tomorrow if needed.

TIA

-andrei

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] questions about startup fencing

2017-11-30 Thread Gao,Yan

On 11/30/2017 09:14 AM, Andrei Borzenkov wrote:

On Wed, Nov 29, 2017 at 6:54 PM, Ken Gaillot  wrote:


The same scenario is why a single node can't have quorum at start-up in
a cluster with "two_node" set. Both nodes have to see each other at
least once before they can assume it's safe to do anything.



Unless we set no-quorum-policy=ignore, in which case it will proceed
after fencing another node. As far as I understand, this is the only
way to get the number of active cluster nodes below quorum, right?
To be safe, "two_node: 1" automatically enables "wait_for_all". Of 
course one can explicitly disable "wait_for_all" if they know what they 
are doing.
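
In corosync.conf this corresponds to something like (a sketch):

    quorum {
        provider: corosync_votequorum
        two_node: 1
        # wait_for_all: 1 is implied by two_node; only set it to 0
        # if you know what you are doing
    }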


Regards,
  Yan




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] questions about startup fencing

2017-11-29 Thread Gao,Yan

On 11/29/2017 04:54 PM, Ken Gaillot wrote:

On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote:

The same questions apply if this troublesome node was actually a
remote node running pacemaker_remoted, rather than the 5th node in
the
cluster.


Remote nodes don't join at the crmd level as cluster nodes do, so they
don't "start up" in the same sense, and start-up fencing doesn't apply
to them. Instead, the cluster initiates the connection when called for
(I don't remember for sure whether it fences the remote node if the
connection fails, but that would make sense).
According to link_rsc2remotenode() and handle_startup_fencing(), similar 
"startup fencing" applies to remote nodes too. So if a remote resource 
fails to start, the remote node will be fenced. The global setting 
startup-fencing=false will change the behavior for remote nodes too.
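
If someone really wants that, it is the usual cluster property (use with 
care, since it weakens the guarantees discussed above):

    crm configure property startup-fencing=false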


Regards,
  Yan

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org