Hi Ulrich,
On 2022/6/1 7:59, Ulrich Windl wrote:
Hi!
I'm wondering what the number in parentheses is for these messages:
sbd[6809]: warning: inquisitor_child: pcmk health check: UNHEALTHY
sbd[6809]: warning: inquisitor_child: Servant pcmk is outdated (age: 682915)
As we know, each sbd watch
Hi Ulrich,
On 2022/3/31 11:18, Gao,Yan via Users wrote:
On 2022/3/31 9:03, Ulrich Windl wrote:
Hi!
I just wanted to point out one thing that hit us with SLES15 SP3:
A failed live VM migration that caused node fencing resulted in a
fencing loop, for two reasons:
1) Pacemaker thinks
On 2022/5/25 8:10, Ulrich Windl wrote:
Hi!
We are still suffering from kernel RAM corruption on the Xen hypervisor when a
VM or the hypervisor is doing I/O (three months since the bug report at SUSE,
but no fix or workaround, meaning the whole Xen cluster project was canceled
after 20 years, b
Hi Ulrich,
On 2022/4/27 11:13, Ulrich Windl wrote:
Update for the Update:
I had installed SLES updates in one VM and rebooted it via the cluster. While
installing the updates in the VM, the Xen host got RAM corruption (it seems any
disk I/O on the host, either locally or via a VM image causes RAM co
On 2022/4/4 8:58, Ulrich Windl wrote:
Andrei Borzenkov wrote on 04.04.2022 at 06:39 in message:
On 31.03.2022 14:02, Ulrich Windl wrote:
"Gao,Yan" schrieb am 31.03.2022 um 11:18 in Nachricht
<67785c2f‑f875‑cb16‑608b‑77d63d9b0...@suse.com>:
On 2022/3/31 9:03, Ulrich
On 2022/3/31 9:03, Ulrich Windl wrote:
Hi!
I just wanted to point out one thing that hit us with SLES15 SP3:
A failed live VM migration that caused node fencing resulted in a fencing loop,
for two reasons:
1) Pacemaker thinks that even _after_ fencing there is some migration to "clean
u
Hi,
On 2021/2/12 11:05, Lentes, Bernd wrote:
Hi,
I have problems with a configured alert which does not alert anymore.
I played around with it a bit and changed the configuration several times with
cibadmin.
Sometimes I had trouble with the admin_epoch, sometimes with the schema.
When I invoke
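(For reference, a minimal sketch of this kind of inspection and re-configuration; the alert name, script path and log file below are examples only, not the configuration from this cluster:)

  # Show the <cib> root element; its admin_epoch/epoch/num_updates counters
  # change whenever the configuration is modified, e.g. via cibadmin
  cibadmin --query | head -n 1

  # Re-create an alert with the crm shell, using the sample alert script
  crm configure alert example_alert /usr/share/pacemaker/alerts/alert_file.sh.sample \
      to /var/log/pacemaker_alerts.log

  # Check what actually ended up in the CIB
  crm configure show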
On 1/13/21 9:14 AM, Ulrich Windl wrote:
Hi!
I made a test: I configured RAM requirements for some test VMs together
with node RAM capacities. Things were running fine.
Then as a test I reduced the RAM capacity of all nodes, and test VMs were
stopped due to not enough RAM.
Now I wonder:
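(For context, a minimal sketch of this kind of utilization-based placement; node name, resource name and values are made up, and it assumes the utilization placement strategy is in use:)

  # Let the scheduler place resources according to capacity/utilization
  crm configure property placement-strategy=utilization

  # Declare how much RAM a node offers (example value)
  crm node utilization node1 set memory 16384

  # Declare how much RAM a test VM needs
  crm configure primitive vm-test1 ocf:heartbeat:VirtualDomain \
      params config=/etc/libvirt/qemu/test1.xml \
      utilization memory=4096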
On 11/26/20 8:31 AM, Ulrich Windl wrote:
Hi!
Using SBD, I got this message from crm's top-level "verify":
crm(live/h16)# verify
Current cluster status:
Online: [ h16 h18 h19 ]
prm_stonith_sbd(stonith:external/sbd): Started h18
(unpack_config) notice: Watchdog will be used via
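(For reference, a minimal sketch of the watchdog-related pieces of an SBD setup that such a notice refers to; the device path and values are examples only:)

  # /etc/sysconfig/sbd (excerpt)
  SBD_DEVICE="/dev/disk/by-id/example-sbd-disk"
  SBD_WATCHDOG_DEV="/dev/watchdog"

  # Cluster properties: fencing enabled, and a watchdog timeout so Pacemaker
  # may fall back to watchdog-based self-fencing via sbd
  crm configure property stonith-enabled=true
  crm configure property stonith-watchdog-timeout=10s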
On 2020/3/23 14:04, Gao,Yan wrote:
On 2020/3/23 8:00, Ulrich Windl wrote:
Andrei Borzenkov wrote on 21.03.2020 at 18:22 in message
<14318_1584811393_5E764D80_14318_174_1_6ab730d7-8cf0-2c7d-7ae5-8d0ea8402758@gmail.com>:
On 21.03.2020 20:07, Ken Gaillot wrote:
Hi all,
I am ha
On 2020/3/23 8:00, Ulrich Windl wrote:
Andrei Borzenkov wrote on 21.03.2020 at 18:22 in message
<14318_1584811393_5E764D80_14318_174_1_6ab730d7-8cf0-2c7d-7ae5-8d0ea8402758@gmail.com>:
On 21.03.2020 20:07, Ken Gaillot wrote:
Hi all,
I am happy to announce a feature that was discussed on thi
ply (pcmk_delay_base, etc.).
I'd like to recognize the primary authors of the 2.0.4 features
announced so far:
- shutdown locks: myself
- switch to clock_gettime() for monotonic clock: Jan Pokorný
- crm_mon --include/--exclude: Chris Lumens
- priority-fencing-delay: Gao,Yan
--
Ken Gaillot
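(Purely as illustration of the priority-fencing-delay feature credited above: a minimal sketch with example names and values. Fencing that targets the node running the higher total resource priority is delayed, so that node tends to win a fencing race:)

  # Give an important resource a priority (example resource)
  crm configure primitive vip ocf:heartbeat:IPaddr2 \
      params ip=192.168.100.10 meta priority=10

  # Delay fencing against the node hosting the highest total priority
  crm configure property priority-fencing-delay=15s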
roperties apply (pcmk_delay_base, etc.).
I'd like to recognize the primary authors of the 2.0.4 features
announced so far:
- shutdown locks: myself
- switch to clock_gettime() for monotonic clock: Jan Pokorný
- crm_mon --include/--exclude: Chris Lumens
- priority-fe
ing "quorate". This is the safest approach for good in
case it's split-brain. This already works correctly with the fix in
regard of 2-node cluster from Klaus.
Regards,
Yan
Many Thanks!!!
Regards
Fulong
*Subject:* Re: [ClusterLabs] SuSE12SP3 HAE SBD Communication Issue
On 02/11/2019 09:49 AM, Fulong Wang wrote:
Thanks Yan,
You gave me more valuable hints on the SBD operation!
Now, I can see the verbose output
-repo.
Given that it was done by Yan Gao iirc I'd assume it went into SLES.
So changing the verbosity of the sbd-daemon might get you back
these logs.
Do you mean
commit 2dbdee29736fcbf0fe1d41c306959b22d05f72b0
Author: Gao,Yan
Date: Mon Apr 30 18:02:04 2018 +0200
Log: upgrade importan
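(On the point above about raising the sbd daemon's verbosity: a minimal sketch, assuming the sysconfig-based packaging used on SLES:)

  # /etc/sysconfig/sbd (excerpt)
  SBD_OPTS="-v"
  # sbd is started together with the cluster stack, so restart it on the node
  # for the option to take effect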
des to self-fence). In short, it will
not tolerate any further faults. Please repair the system before
continuing."
Regards,
Yan
what's your recommendation for this scenario?
The "crm node fence" did the work.
Regards
Fulong
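(For reference, the command mentioned above takes the node to be fenced as its argument; the node name here is hypothetical:)

  crm node fence node2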
---
First thanks for your reply, Klaus!
On 2018/12/21 10:09, Klaus Wenninger wrote:
On 12/21/2018 08:15 AM, Fulong Wang wrote:
Hello Experts,
I'm new to this mailing list.
Please kindly forgive me if this mail has disturbed you!
Our company has recently been evaluating the usage of the SuSE HAE on the x86
platfor
On 2017/12/16 16:59, Andrei Borzenkov wrote:
On 04.12.2017 21:55, Andrei Borzenkov wrote:
...
I tried it (on openSUSE Tumbleweed which is what I have at hand, it has
SBD 1.3.0) and with SBD_DELAY_START=yes sbd does not appear to watch
the disk at all.
It simply waits that long on startup before start
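(For reference, the option under discussion lives in the sbd sysconfig file; a minimal sketch with an example device path:)

  # /etc/sysconfig/sbd (excerpt)
  SBD_DEVICE="/dev/disk/by-id/example-sbd-disk"
  # Delay sbd start-up after boot: "yes" derives the delay from the configured
  # timeouts, while a plain number is taken as seconds
  SBD_DELAY_START=yes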
On 12/05/2017 03:11 PM, Ulrich Windl wrote:
"Gao,Yan" schrieb am 05.12.2017 um 15:04 in Nachricht
:
On 12/05/2017 12:41 PM, Ulrich Windl wrote:
"Gao,Yan" schrieb am 01.12.2017 um 20:36 in Nachricht
:
[...]
I meant: There are three delays:
1) The delay until
On 12/05/2017 12:41 PM, Ulrich Windl wrote:
"Gao,Yan" schrieb am 01.12.2017 um 20:36 in Nachricht
:
On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:
On 30.11.2017 16:11, Klaus Wenninger wrote:
On 11/30/2017 01:41 PM, Ulrich Windl wrote:
"Gao,Yan" wrote on 30
On 12/05/2017 08:57 AM, Dejan Muhamedagic wrote:
On Mon, Dec 04, 2017 at 09:55:46PM +0300, Andrei Borzenkov wrote:
On 04.12.2017 14:48, Gao,Yan wrote:
On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:
On 30.11.2017 13:48, Gao,Yan wrote:
On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
SLES12 SP2
On 12/04/2017 07:55 PM, Andrei Borzenkov wrote:
On 04.12.2017 14:48, Gao,Yan wrote:
On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:
On 30.11.2017 13:48, Gao,Yan wrote:
On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere
On 12/02/2017 08:30 AM, Andrei Borzenkov wrote:
On 01.12.2017 22:36, Gao,Yan wrote:
On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:
On 30.11.2017 16:11, Klaus Wenninger wrote:
On 11/30/2017 01:41 PM, Ulrich Windl wrote:
"Gao,Yan" wrote on 30.11.2017 at 11:48 in message
:
On 11/
On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:
On 30.11.2017 13:48, Gao,Yan wrote:
On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing
On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:
On 30.11.2017 16:11, Klaus Wenninger wrote:
On 11/30/2017 01:41 PM, Ulrich Windl wrote:
"Gao,Yan" schrieb am 30.11.2017 um 11:48 in Nachricht
:
On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
SLES12 SP2 with pacemaker 1.1.15-21.1-e1
On 11/30/2017 01:41 PM, Ulrich Windl wrote:
"Gao,Yan" schrieb am 30.11.2017 um 11:48 in Nachricht
:
On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests
On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
VM on VSphere using shared VMDK as SBD. During basic tests by killing
corosync and forcing STONITH pacemaker was not started after reboot.
In logs I see during boot
Nov 22 16:04:5
On 11/30/2017 09:14 AM, Andrei Borzenkov wrote:
On Wed, Nov 29, 2017 at 6:54 PM, Ken Gaillot wrote:
The same scenario is why a single node can't have quorum at start-up in
a cluster with "two_node" set. Both nodes have to see each other at
least once before they can assume it's safe to do anyt
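(For reference, a minimal sketch of the corosync.conf quorum section being referred to; two_node implies wait_for_all, which is what requires both nodes to have seen each other once after start-up:)

  quorum {
      provider: corosync_votequorum
      two_node: 1
      # wait_for_all: 1   (implied by two_node)
  }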
On 11/29/2017 04:54 PM, Ken Gaillot wrote:
On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote:
The same questions apply if this troublesome node was actually a
remote node running pacemaker_remoted, rather than the 5th node in
the
cluster.
Remote nodes don't join at the crmd level as cluster
now similar problem?
> Know the cause of the problem?
Sounds weird. I've never encountered the issue before. Actually I
haven't run it with heartbeat for years ;-) We'd probably have to find
the pattern and reproduce it.
Regards,
Yan
--
Gao,Yan
Senior Software Engineer