Re: [ClusterLabs] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-29 Ken Gaillot
I suspect this is fixed in newer versions. It's not a join timing issue but some sort of peer state bug, and there's been a good bit of change in that area since this code. A few comments inline ...

On Wed, 2022-09-14 at 12:40 +0200, Lars Ellenberg wrote:
> On Thu, Sep 08, 2022 at 10:11:46AM -050

Re: [ClusterLabs] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-14 Lars Ellenberg
On Thu, Sep 08, 2022 at 10:11:46AM -0500, Ken Gaillot wrote:
> On Thu, 2022-09-08 at 15:01 +0200, Lars Ellenberg wrote:
> > Scenario:
> > three nodes, no fencing (I know)
> > break network, isolating nodes
> > unbreak network, see how cluster partitions rejoin and resume service
>
> I'm guessing t

Re: [ClusterLabs] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-08 Ken Gaillot
On Thu, 2022-09-08 at 15:01 +0200, Lars Ellenberg wrote:
> Scenario:
> three nodes, no fencing (I know)
> break network, isolating nodes
> unbreak network, see how cluster partitions rejoin and resume service

I'm guessing the CIB changed during the break, with more changes in one of the other part

[ClusterLabs] DC marks itself as OFFLINE, continues orchestrating the other nodes

2022-09-08 Lars Ellenberg
Scenario:
three nodes, no fencing (I know)
break network, isolating nodes
unbreak network, see how cluster partitions rejoin and resume service

Funny outcome:

/usr/sbin/crm_mon -x pe-input-689.bz2
Cluster Summary:
* Stack: corosync
* Current DC: mqhavm24 (version 1.1.24.linbit-2.0.el7-8f22