Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-12 Thread Klaus Wenninger
On 1/12/21 3:01 PM, renayama19661...@ybb.ne.jp wrote:
> Hi Steffen,
>
> As Klaus says, if you stop the node, the problem display disappears.
It is worth stressing once more that all nodes have to be in the stopped
state at the same time, at least for a short moment, for the history to
go away, both the good and the bad.

Klaus
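
A minimal sketch of what "all nodes stopped at the same time" could look like in practice, using the `pcs cluster stop --all` command mentioned in this thread together with `pcs cluster start --all`. The function name `full_cluster_restart` and the `RUN` dry-run variable are made up for this illustration. Stopping the nodes one after another would not clear the history, because the nodes that are still up sync it back.

```shell
# Hypothetical helper: stop pacemaker on every node, then start it again.
# WARNING: this takes all cluster resources down for the duration.
# Set RUN=echo to only print the commands instead of running them.
full_cluster_restart() {
    ${RUN:-} pcs cluster stop --all || return 1   # all nodes down at once
    ${RUN:-} pcs cluster start --all              # fencer history starts out empty
}
```

A dry run with `RUN=echo full_cluster_restart` prints the two commands without touching the cluster.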
> However, the cause of the problem is currently unknown.
> To find out the cause, it needs to be reproduced, but the procedure is 
> unknown at this time.
>
>
> The problem is different from the Bugzilla fix I have addressed, as the 
> topology is not set.
>
>
> Best Regards,
> Hideo Yamauchi.
>
>
> - Original Message -
>> From: Klaus Wenninger 
>> To: Steffen Vinther Sørensen  
>> Cc: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
>> open-source clustering welcomed 
>> Date: 2021/1/12, Tue 21:29
>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>
>>
>> On 1/12/21 12:46 PM, Steffen Vinther Sørensen wrote:
>>> Yes.
>>>
>>> 'pcs cluster stop --all' + reboot all nodes
>> Thanks! That is the ultimate action ;-)
>> Just starting the cluster via pcs would probably already
>> have the effect of making the pending fence actions
>> go away.
>> But still we should try to somehow reproduce the issue
>> as it shouldn't happen.
>>
>> Klaus
>>
>>
>>> /Steffen
>>>
>>> On Tue, Jan 12, 2021 at 11:43 AM Klaus Wenninger  
>>> wrote:
>>>
>>>> On 1/12/21 11:23 AM, Steffen Vinther Sørensen wrote:
>>>>> Hello Hideo.
>>>>>
>>>>> I am overwhelmed by how seriously this group takes care of issues.
>>>>>
>>>>>
>>>>> For your information, the 'pending fencing action' status disappeared
>>>>> after bringing the nodes offline, and during that I found some gfs2
>>>>> errors that were fixed by fsck.gfs2, and since then my cluster has been
>>>>> acting very stable.
>>>> By bringing offline you mean shutting down pacemaker?
>>>> That would be expected as fence-history is kept solely in RAM.
>>>> The history-knowledge is synced between the nodes, so the
>>>> history is only lost if all nodes are down at the same time.
>>>> Unfortunately that mechanism keeps unwanted leftovers
>>>> around as well.
>>>>
>>>> Regards,
>>>> Klaus
>>>>
>>>>
>>>>> If I can provide more info let me know. 
>>>>>
>>>>>
>>>>> /Steffen
>>>>>
>>>>> On Tue, Jan 12, 2021 at 3:45 AM  wrote:
>>>>>
>>>>>> Hi Steffen,
>>>>>> I've been experimenting with it since last weekend, but I haven't
>>>>>> been able to reproduce the same situation.
>>>>>> The reason seems to be that the reproduction procedure cannot be
>>>>>> pinned down.
>>>>>> Can I attach a problem log?
>>>>>>
>>>>>> Best Regards,
>>>>>> Hideo Yamauchi.
>>>>>>
>>>>>>
>>>>>> - Original Message -
>>>>>>> From: Klaus Wenninger 
>>>>>>> To: Steffen Vinther Sørensen ; Cluster Labs - All 
>>>>>>> topics related to open-source clustering welcomed 
>>>>>>> 
>>>>>>> Cc: 
>>>>>>> Date: 2021/1/7, Thu 21:42
>>>>>>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>>>>>> On 1/7/21 1:13 PM, Steffen Vinther Sørensen wrote:
>>>>>>>>   Hi Klaus,
>>>>>>>>
>>>>>>>>   Yes, then the status does sync to the other nodes. Also, it looks
>>>>>>>>   like there are some hostname-resolution problems in play here,
>>>>>>>>   maybe causing problems. Here are my notes from restarting
>>>>>>>>   pacemaker etc.
>>>>>>> Don't think there are hostname resolving problems.
>>>>>>> The messages you are seeing, which look like that, are caused
>>>>>>> by using -EHOSTUNREACH as error-code to fail a pending
>>>>>>> fence action when a node that is just coming up

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-12 Thread Steffen Vinther Sørensen
Yes.

'pcs cluster stop --all' + reboot all nodes

/Steffen

On Tue, Jan 12, 2021 at 11:43 AM Klaus Wenninger 
wrote:

> On 1/12/21 11:23 AM, Steffen Vinther Sørensen wrote:
>
>> Hello Hideo.
>>
>> I am overwhelmed by how seriously this group takes care of issues.
>>
>> For your information, the 'pending fencing action' status
>> disappeared after bringing the nodes offline, and during that I found some
>> gfs2 errors that were fixed by fsck.gfs2, and since then my cluster has
>> been acting very stable.
>
> By bringing offline you mean shutting down pacemaker?
> That would be expected as fence-history is kept solely in RAM.
> The history-knowledge is synced between the nodes, so the
> history is only lost if all nodes are down at the same time.
> Unfortunately that mechanism keeps unwanted leftovers
> around as well.
>
> Regards,
> Klaus
>
>> If I can provide more info let me know.
>>
>> /Steffen
>>
>> On Tue, Jan 12, 2021 at 3:45 AM  wrote:
>
>> Hi Steffen,
>>
>> I've been experimenting with it since last weekend, but I haven't been
>> able to reproduce the same situation.
>> The reason seems to be that the reproduction procedure cannot be pinned down.
>>
>> Can I attach a problem log?
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>> - Original Message -
>> > From: Klaus Wenninger 
>> > To: Steffen Vinther Sørensen ; Cluster Labs - All
>> topics related to open-source clustering welcomed 
>> > Cc:
>> > Date: 2021/1/7, Thu 21:42
>> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>> >
>> > On 1/7/21 1:13 PM, Steffen Vinther Sørensen wrote:
>> >>  Hi Klaus,
>> >>
>> >>  Yes then the status does sync to the other nodes. Also it looks like
>> >>  there are some hostname resolving problems in play here, maybe causing
>> >>  problems,  here is my notes from restarting pacemaker etc.
>> > Don't think there are hostname resolving problems.
>> > The messages you are seeing, which look like that, are caused
>> > by using -EHOSTUNREACH as error-code to fail a pending
>> > fence action when a node that is just coming up sees
>> > a pending action that is claimed to be handled by itself.
>> > Back then I chose that error-code as there was none
>> > that really matched available right away, and it was
>> > urgent for some reason, so introducing something
>> > new was too risky at that stage.
>> > It would probably make sense to introduce something that
>> > is more descriptive.
>> > Back then the issue was triggered by fenced crashing and
>> > being restarted - so not a node-restart but just fenced
>> > restarting.
>> > And it looks as if building the failed-message failed somehow.
>> > So that could be the reason why the pending action persists.
>> > That would be something other than what we solved with Bug 5401.
>> > But what triggers the logs below might as well just be a
>> > follow-up issue after the Bug 5401 thing.
>> > Will try to find time for a deeper look later today.
>> >
>> > Klaus
>> >>
>> >>  pcs cluster standby kvm03-node02.avigol-gcs.dk
>> >>  pcs cluster stop kvm03-node02.avigol-gcs.dk
>> >>  pcs status
>> >>
>> >>  Pending Fencing Actions:
>> >>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
>> >>  origin=kvm03-node03.avigol-gcs.dk
>> >>
>> >>  # From logs on all 3 nodes:
>> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: received
>> >>  pending action we are supposed to be the owner but it's not in our
>> >>  records -> fail it
>> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error: Operation
>> >>  'reboot' targeting kvm03-node02.avigol-gcs.dk on  for
>> >>  crmd.37...@kvm03-node03.avigol-gcs.dk.56a3018c: No route to host
>> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error:
>> >>  stonith_construct_reply: Triggered assert at commands.c:2406 : request
>> >>  != NULL
>> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: Can't create
>> >>  a sane reply
>> >>  Jan 07 12:48:18 kvm03-node03 crmd[37819]:   notice: Peer
>> >>  kvm03-node02.avigol-gcs.dk was not terminated (reboot) by  on
>> >>  behalf of crmd.37819: No route to host
>> >>
>> >>  pcs cluster start kvm03-node02.avigol-gcs.dk
>> >>  pcs status (now ou

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-12 Thread Klaus Wenninger
On 1/12/21 11:23 AM, Steffen Vinther Sørensen wrote:
> Hello Hideo.
>
> I am overwhelmed by how seriously this group takes care of issues.
>
> For your information, the 'pending fencing action' status
> disappeared after bringing the nodes offline, and during that I found
> some gfs2 errors that were fixed by fsck.gfs2, and since then my
> cluster has been acting very stable.
By bringing offline you mean shutting down pacemaker?
That would be expected as fence-history is kept solely in RAM.
The history-knowledge is synced between the nodes, so the
history is only lost if all nodes are down at the same time.
Unfortunately that mechanism keeps unwanted leftovers
around as well.

Regards,
Klaus
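
To see what a node's fencer currently holds in memory, the history can be queried with `stonith_admin --history`. The following is only a sketch: it assumes a Pacemaker build whose `stonith_admin` accepts `--verbose` together with `--history` to include pending operations (older builds may behave differently), and the names `show_fence_history` and `RUN` are invented for this illustration.

```shell
# Hypothetical helper: show the fencing history recorded for one target node.
# Set RUN=echo to only print the command instead of running it.
show_fence_history() {
    # "$1" is the node whose fencing history should be shown
    ${RUN:-} stonith_admin --history "$1" --verbose
}
```

Running it on each cluster node and comparing the output would show whether the leftover entry has been synced everywhere.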
>
> If I can provide more info let me know. 
>
> /Steffen
>
> On Tue, Jan 12, 2021 at 3:45 AM <renayama19661...@ybb.ne.jp> wrote:
>
> Hi Steffen,
>
> I've been experimenting with it since last weekend, but I haven't
> been able to reproduce the same situation.
> The reason seems to be that the reproduction procedure cannot be
> pinned down.
>
> Can I attach a problem log?
>
> Best Regards,
> Hideo Yamauchi.
>
>
> - Original Message -
> > From: Klaus Wenninger <kwenn...@redhat.com>
> > To: Steffen Vinther Sørensen <svint...@gmail.com>; Cluster Labs - All
> > topics related to open-source clustering welcomed <users@clusterlabs.org>
> > Cc:
> > Date: 2021/1/7, Thu 21:42
> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >
> > On 1/7/21 1:13 PM, Steffen Vinther Sørensen wrote:
> >>  Hi Klaus,
> >>
> >>  Yes, then the status does sync to the other nodes. Also, it looks like
> >>  there are some hostname-resolution problems in play here, maybe causing
> >>  problems. Here are my notes from restarting pacemaker etc.
> > Don't think there are hostname resolving problems.
> > The messages you are seeing, which look like that, are caused
> > by using -EHOSTUNREACH as error-code to fail a pending
> > fence action when a node that is just coming up sees
> > a pending action that is claimed to be handled by itself.
> > Back then I chose that error-code as there was none
> > that really matched available right away, and it was
> > urgent for some reason, so introducing something
> > new was too risky at that stage.
> > It would probably make sense to introduce something that
> > is more descriptive.
> > Back then the issue was triggered by fenced crashing and
> > being restarted - so not a node-restart but just fenced
> > restarting.
> > And it looks as if building the failed-message failed somehow.
> > So that could be the reason why the pending action persists.
> > That would be something other than what we solved with Bug 5401.
> > But what triggers the logs below might as well just be a
> > follow-up issue after the Bug 5401 thing.
> > Will try to find time for a deeper look later today.
> >
> > Klaus
> >>
> >>  pcs cluster standby kvm03-node02.avigol-gcs.dk
> >>  pcs cluster stop kvm03-node02.avigol-gcs.dk
> >>  pcs status
> >>
> >>  Pending Fencing Actions:
> >>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
> >>  origin=kvm03-node03.avigol-gcs.dk
> >>
> >>  # From logs on all 3 nodes:
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: received
> >>  pending action we are supposed to be the owner but it's not in our
> >>  records -> fail it
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error: Operation
> >>  'reboot' targeting kvm03-node02.avigol-gcs.dk on  for
> >>  crmd.37...@kvm03-node03.avigol-gcs.dk.56a3018c: No route to host
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error:
> >>  stonith_construct_reply: Triggered assert at commands.c:2406 : request
> >>  != NULL
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: Can't create
> >>  a sane reply
> >>  Jan 07 12:48:18 kvm03-nod

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-12 Thread Steffen Vinther Sørensen
Hello Hideo.

I am overwhelmed by how seriously this group takes care of issues.

For your information, the 'pending fencing action' status disappeared after
bringing the nodes offline, and during that I found some gfs2 errors that
were fixed by fsck.gfs2, and since then my cluster has been acting very
stable.

If I can provide more info let me know.

/Steffen

On Tue, Jan 12, 2021 at 3:45 AM  wrote:

> Hi Steffen,
>
> I've been experimenting with it since last weekend, but I haven't been
> able to reproduce the same situation.
> The reason seems to be that the reproduction procedure cannot be pinned down.
>
> Can I attach a problem log?
>
> Best Regards,
> Hideo Yamauchi.
>
>
> - Original Message -
> > From: Klaus Wenninger 
> > To: Steffen Vinther Sørensen ; Cluster Labs - All
> topics related to open-source clustering welcomed 
> > Cc:
> > Date: 2021/1/7, Thu 21:42
> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >
> > On 1/7/21 1:13 PM, Steffen Vinther Sørensen wrote:
> >>  Hi Klaus,
> >>
> >>  Yes then the status does sync to the other nodes. Also it looks like
> >>  there are some hostname resolving problems in play here, maybe causing
> >>  problems,  here is my notes from restarting pacemaker etc.
> > Don't think there are hostname resolving problems.
> > The messages you are seeing, which look like that, are caused
> > by using -EHOSTUNREACH as error-code to fail a pending
> > fence action when a node that is just coming up sees
> > a pending action that is claimed to be handled by itself.
> > Back then I chose that error-code as there was none
> > that really matched available right away, and it was
> > urgent for some reason, so introducing something
> > new was too risky at that stage.
> > It would probably make sense to introduce something that
> > is more descriptive.
> > Back then the issue was triggered by fenced crashing and
> > being restarted - so not a node-restart but just fenced
> > restarting.
> > And it looks as if building the failed-message failed somehow.
> > So that could be the reason why the pending action persists.
> > That would be something other than what we solved with Bug 5401.
> > But what triggers the logs below might as well just be a
> > follow-up issue after the Bug 5401 thing.
> > Will try to find time for a deeper look later today.
> >
> > Klaus
> >>
> >>  pcs cluster standby kvm03-node02.avigol-gcs.dk
> >>  pcs cluster stop kvm03-node02.avigol-gcs.dk
> >>  pcs status
> >>
> >>  Pending Fencing Actions:
> >>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
> >>  origin=kvm03-node03.avigol-gcs.dk
> >>
> >>  # From logs on all 3 nodes:
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: received
> >>  pending action we are supposed to be the owner but it's not in our
> >>  records -> fail it
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error: Operation
> >>  'reboot' targeting kvm03-node02.avigol-gcs.dk on  for
> >>  crmd.37...@kvm03-node03.avigol-gcs.dk.56a3018c: No route to host
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error:
> >>  stonith_construct_reply: Triggered assert at commands.c:2406 : request
> >>  != NULL
> >>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: Can't create
> >>  a sane reply
> >>  Jan 07 12:48:18 kvm03-node03 crmd[37819]:   notice: Peer
> >>  kvm03-node02.avigol-gcs.dk was not terminated (reboot) by  on
> >>  behalf of crmd.37819: No route to host
> >>
> >>  pcs cluster start kvm03-node02.avigol-gcs.dk
> >>  pcs status (now outputs the same on all 3 nodes)
> >>
> >>  Failed Fencing Actions:
> >>  * reboot of kvm03-node02.avigol-gcs.dk failed: delegate=,
> >>  client=crmd.37819, origin=kvm03-node03.avigol-gcs.dk,
> >>  last-failed='Thu Jan  7 12:48:18 2021'
> >>
> >>
> >>  pcs cluster unstandby kvm03-node02.avigol-gcs.dk
> >>
> >>  # Now libvirtd refuses to start
> >>
> >>  Jan 07 12:51:44 kvm03-node02 dnsmasq[20884]: read /etc/hosts - 8
> addresses
> >>  Jan 07 12:51:44 kvm03-node02 dnsmasq[20884]: read
> >>  /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
> >>  Jan 07 12:51:44 kvm03-node02 dnsmasq-dhcp[20884]: read
> >>  /var/lib/libvirt/dnsmasq/default.hostsfile
> >>  Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
> >>  11:51:44.729+: 24160: info : l

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-11 Thread renayama19661014
Hi Steffen,

I've been experimenting with it since last weekend, but I haven't been able to 
reproduce the same situation.
The reason seems to be that the reproduction procedure cannot be pinned down.

Can I attach a problem log?

Best Regards,
Hideo Yamauchi.


- Original Message -
> From: Klaus Wenninger 
> To: Steffen Vinther Sørensen ; Cluster Labs - All topics 
> related to open-source clustering welcomed 
> Cc: 
> Date: 2021/1/7, Thu 21:42
> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
> On 1/7/21 1:13 PM, Steffen Vinther Sørensen wrote:
>>  Hi Klaus,
>> 
>>  Yes then the status does sync to the other nodes. Also it looks like
>>  there are some hostname resolving problems in play here, maybe causing
>>  problems,  here is my notes from restarting pacemaker etc.
> Don't think there are hostname resolving problems.
> The messages you are seeing, which look like that, are caused
> by using -EHOSTUNREACH as error-code to fail a pending
> fence action when a node that is just coming up sees
> a pending action that is claimed to be handled by itself.
> Back then I chose that error-code as there was none
> that really matched available right away, and it was
> urgent for some reason, so introducing something
> new was too risky at that stage.
> It would probably make sense to introduce something that
> is more descriptive.
> Back then the issue was triggered by fenced crashing and
> being restarted - so not a node-restart but just fenced
> restarting.
> And it looks as if building the failed-message failed somehow.
> So that could be the reason why the pending action persists.
> That would be something other than what we solved with Bug 5401.
> But what triggers the logs below might as well just be a
> follow-up issue after the Bug 5401 thing.
> Will try to find time for a deeper look later today.
> 
> Klaus
>> 
>>  pcs cluster standby kvm03-node02.avigol-gcs.dk
>>  pcs cluster stop kvm03-node02.avigol-gcs.dk
>>  pcs status
>> 
>>  Pending Fencing Actions:
>>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
>>  origin=kvm03-node03.avigol-gcs.dk
>> 
>>  # From logs on all 3 nodes:
>>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: received
>>  pending action we are supposed to be the owner but it's not in our
>>  records -> fail it
>>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error: Operation
>>  'reboot' targeting kvm03-node02.avigol-gcs.dk on  for
>>  crmd.37...@kvm03-node03.avigol-gcs.dk.56a3018c: No route to host
>>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error:
>>  stonith_construct_reply: Triggered assert at commands.c:2406 : request
>>  != NULL
>>  Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: Can't create
>>  a sane reply
>>  Jan 07 12:48:18 kvm03-node03 crmd[37819]:   notice: Peer
>>  kvm03-node02.avigol-gcs.dk was not terminated (reboot) by  on
>>  behalf of crmd.37819: No route to host
>> 
>>  pcs cluster start kvm03-node02.avigol-gcs.dk
>>  pcs status (now outputs the same on all 3 nodes)
>> 
>>  Failed Fencing Actions:
>>  * reboot of kvm03-node02.avigol-gcs.dk failed: delegate=,
>>  client=crmd.37819, origin=kvm03-node03.avigol-gcs.dk,
>>      last-failed='Thu Jan  7 12:48:18 2021'
>> 
>> 
>>  pcs cluster unstandby kvm03-node02.avigol-gcs.dk
>> 
>>  # Now libvirtd refuses to start
>> 
>>  Jan 07 12:51:44 kvm03-node02 dnsmasq[20884]: read /etc/hosts - 8 addresses
>>  Jan 07 12:51:44 kvm03-node02 dnsmasq[20884]: read
>>  /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
>>  Jan 07 12:51:44 kvm03-node02 dnsmasq-dhcp[20884]: read
>>  /var/lib/libvirt/dnsmasq/default.hostsfile
>>  Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
>>  11:51:44.729+: 24160: info : libvirt version: 4.5.0, package:
>>  36.el7_9.3 (CentOS BuildSystem <http://bugs.centos.org >,
>>  2020-11-16-16:25:20, x86-01.bsys.centos.org)
>>  Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
>>  11:51:44.729+: 24160: info : hostname: kvm03-node02
>>  Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
>>  11:51:44.729+: 24160: error : qemuMonitorOpenUnix:392 : failed to
>>  connect to monitor socket: Connection refused
>>  Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
>>  11:51:44.729+: 24159: error : qemuMonitorOpenUnix:392 : failed to
>>  connect to monitor socket: Connection refused
>>  Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
>>  11:51:44.730+: 24161: error : qemuMonitorOpenUnix:392 : failed to
>> 

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread Klaus Wenninger
er  wrote:
>> Hi Steffen,
>>
>> If you just see the leftover pending-action on one node
>> it would be interesting if restarting of pacemaker on
>> one of the other nodes does sync it to all of the
>> nodes.
>>
>> Regards,
>> Klaus
>>
>> On 1/7/21 9:54 AM, renayama19661...@ybb.ne.jp wrote:
>>> Hi Steffen,
>>>
>>>> Unfortunately not sure about the exact scenario. But I have been doing
>>>> some recent experiments with node standby/unstandby stop/start. This
>>>> to get procedures right for updating node rpms etc.
>>>>
>>>> Later I noticed the uncomforting "pending fencing actions" status msg.
>>> Okay!
>>>
>>> I will repeat the standby and unstandby steps in the same way to check.
>>> We will start checking after tomorrow, so I think it will take until
>>> next week.
>>>
>>>
>>> Many thanks,
>>> Hideo Yamauchi.
>>>
>>>
>>>
>>> - Original Message -
>>>> From: "renayama19661...@ybb.ne.jp" 
>>>> To: Reid Wahl ; Cluster Labs - All topics related to 
>>>> open-source clustering welcomed 
>>>> Cc:
>>>> Date: 2021/1/7, Thu 17:51
>>>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>>>
>>>> Hi Steffen,
>>>> Hi Reid,
>>>>
>>>> The fencing history is kept inside stonith-ng and is not written to the CIB.
>>>> However, obtaining the entire CIB and sending it will help to reproduce
>>>> the problem.
>>>>
>>>> Best Regards,
>>>> Hideo Yamauchi.
>>>>
>>>>
>>>> - Original Message -
>>>>> From: Reid Wahl 
>>>>> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to
>>>> open-source clustering welcomed 
>>>>> Date: 2021/1/7, Thu 17:39
>>>>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>>>>
>>>>>
>>>>> Hi, Steffen. Those attachments don't contain the CIB. They contain the
>>>>> `pcs config` output. You can get the CIB with
>>>>> `pcs cluster cib > $(hostname).cib.xml`.
>>>>>
>>>>> Granted, it's possible that this fence action information wouldn't
>>>>> be in the CIB at all. It might be stored in fencer memory.
>>>>>
>>>>> On Thu, Jan 7, 2021 at 12:26 AM  wrote:
>>>>>
>>>>> Hi Steffen,
>>>>>>>  Here CIB settings attached (pcs config show) for all 3 of my nodes
>>>>>>>  (all 3 seems 100% identical), node03 is the DC.
>>>>>> Thank you for the attachment.
>>>>>>
>>>>>> What is the scenario when this situation occurs?
>>>>>> In what steps did the problem appear when fencing was performed (or
>>>> failed)?
>>>>>> Best Regards,
>>>>>> Hideo Yamauchi.
>>>>>>
>>>>>>
>>>>>> - Original Message -
>>>>>>>  From: Steffen Vinther Sørensen 
>>>>>>>  To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related
>>>> to open-source clustering welcomed 
>>>>>>>  Cc:
>>>>>>>  Date: 2021/1/7, Thu 17:05
>>>>>>>  Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs
>>>> status
>>>>>>>  Hi Hideo,
>>>>>>>
>>>>>>>  Here CIB settings attached (pcs config show) for all 3 of my nodes
>>>>>>>  (all 3 seems 100% identical), node03 is the DC.
>>>>>>>
>>>>>>>  Regards
>>>>>>>  Steffen
>>>>>>>
>>>>>>>  On Thu, Jan 7, 2021 at 8:06 AM 
>>>> wrote:
>>>>>>>>   Hi Steffen,
>>>>>>>>   Hi Reid,
>>>>>>>>
>>>>>>>>   I also checked the CentOS source rpm and it seems to include a fix
>>>>>>>>   for the problem.
>>>>>>>>   As Steffen suggested, if you share your CIB settings, I might know
>>>>>>>>   something.
>>>>>>>>   If this issue is the same as the one that was fixed, the pending
>>>>>>>>   action will only be displayed on the DC node and will not affect the

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread Steffen Vinther Sørensen
Hi Klaus,

Yes, then the status does sync to the other nodes. Also, it looks like
there are some hostname-resolution problems in play here, maybe causing
problems. Here are my notes from restarting pacemaker etc.


pcs cluster standby kvm03-node02.avigol-gcs.dk
pcs cluster stop kvm03-node02.avigol-gcs.dk
pcs status

Pending Fencing Actions:
* reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
origin=kvm03-node03.avigol-gcs.dk

# From logs on all 3 nodes:
Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: received
pending action we are supposed to be the owner but it's not in our
records -> fail it
Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error: Operation
'reboot' targeting kvm03-node02.avigol-gcs.dk on  for
crmd.37...@kvm03-node03.avigol-gcs.dk.56a3018c: No route to host
Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:    error:
stonith_construct_reply: Triggered assert at commands.c:2406 : request
!= NULL
Jan 07 12:48:18 kvm03-node03 stonith-ng[37815]:  warning: Can't create
a sane reply
Jan 07 12:48:18 kvm03-node03 crmd[37819]:   notice: Peer
kvm03-node02.avigol-gcs.dk was not terminated (reboot) by  on
behalf of crmd.37819: No route to host

pcs cluster start kvm03-node02.avigol-gcs.dk
pcs status (now outputs the same on all 3 nodes)

Failed Fencing Actions:
* reboot of kvm03-node02.avigol-gcs.dk failed: delegate=,
client=crmd.37819, origin=kvm03-node03.avigol-gcs.dk,
last-failed='Thu Jan  7 12:48:18 2021'


pcs cluster unstandby kvm03-node02.avigol-gcs.dk

# Now libvirtd refuses to start

Jan 07 12:51:44 kvm03-node02 dnsmasq[20884]: read /etc/hosts - 8 addresses
Jan 07 12:51:44 kvm03-node02 dnsmasq[20884]: read
/var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Jan 07 12:51:44 kvm03-node02 dnsmasq-dhcp[20884]: read
/var/lib/libvirt/dnsmasq/default.hostsfile
Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
11:51:44.729+: 24160: info : libvirt version: 4.5.0, package:
36.el7_9.3 (CentOS BuildSystem <http://bugs.centos.org>,
2020-11-16-16:25:20, x86-01.bsys.centos.org)
Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
11:51:44.729+: 24160: info : hostname: kvm03-node02
Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
11:51:44.729+: 24160: error : qemuMonitorOpenUnix:392 : failed to
connect to monitor socket: Connection refused
Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
11:51:44.729+: 24159: error : qemuMonitorOpenUnix:392 : failed to
connect to monitor socket: Connection refused
Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
11:51:44.730+: 24161: error : qemuMonitorOpenUnix:392 : failed to
connect to monitor socket: Connection refused
Jan 07 12:51:44 kvm03-node02 libvirtd[24091]: 2021-01-07
11:51:44.730+: 24162: error : qemuMonitorOpenUnix:392 : failed to
connect to monitor socket: Connection refused

pcs status

Failed Resource Actions:
* libvirtd_start_0 on kvm03-node02.avigol-gcs.dk 'unknown error' (1):
call=142, status=complete, exitreason='',
last-rc-change='Thu Jan  7 12:51:44 2021', queued=0ms, exec=2157ms

Failed Fencing Actions:
* reboot of kvm03-node02.avigol-gcs.dk failed: delegate=,
client=crmd.37819, origin=kvm03-node03.avigol-gcs.dk,
last-failed='Thu Jan  7 12:48:18 2021'


# from /etc/hosts on all 3 nodes:

172.31.0.31 kvm03-node01 kvm03-node01.avigol-gcs.dk
172.31.0.32 kvm03-node02 kvm03-node02.avigol-gcs.dk
172.31.0.33 kvm03-node03 kvm03-node03.avigol-gcs.dk
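
A minimal sketch for spotting the leftover entry while repeating the steps above: the shell function below pulls the lines under the `Pending Fencing Actions:` heading out of `pcs status` text read from stdin. The function name `pending_fencing_actions` is made up for this illustration, and it assumes the section ends at the next blank line, as in the output pasted above.

```shell
# Hypothetical helper (not part of pcs): print the entries listed under
# "Pending Fencing Actions:" in `pcs status` output read from stdin.
# Assumes the section is terminated by a blank line or by end of input.
pending_fencing_actions() {
    awk '
        /^[[:space:]]*Pending Fencing Actions:/ { sect = 1; next }  # section starts
        sect && /^[[:space:]]*$/               { sect = 0 }        # blank line ends it
        sect                                   { print }           # entry line
    '
}
```

On a live node this could be used as `pcs status | pending_fencing_actions`; empty output would mean that no pending fencing action is shown there.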

On Thu, Jan 7, 2021 at 11:15 AM Klaus Wenninger  wrote:
>
> Hi Steffen,
>
> If you just see the leftover pending-action on one node
> it would be interesting if restarting of pacemaker on
> one of the other nodes does sync it to all of the
> nodes.
>
> Regards,
> Klaus
>
> On 1/7/21 9:54 AM, renayama19661...@ybb.ne.jp wrote:
> > Hi Steffen,
> >
> >> Unfortunately not sure about the exact scenario. But I have been doing
> >> some recent experiments with node standby/unstandby stop/start. This
> >> to get procedures right for updating node rpms etc.
> >>
> >> Later I noticed the uncomforting "pending fencing actions" status msg.
> > Okay!
> >
> > I will repeat the standby and unstandby steps in the same way to check.
> > We will start checking after tomorrow, so I think it will take until
> > next week.
> >
> >
> > Many thanks,
> > Hideo Yamauchi.
> >
> >
> >
> > - Original Message -
> >> From: "renayama19661...@ybb.ne.jp" 
> >> To: Reid Wahl ; Cluster Labs - All topics related to 
> >> open-source clustering welcomed 
> >> Cc:
> >> Date: 2021/1/7, Thu 17:51
> >> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >>
> >> Hi Steffen,
> >> Hi Reid,
> >>
> >> The fencing histo

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread Klaus Wenninger
Hi Steffen,

If you see the leftover pending action on just one node,
it would be interesting to know whether restarting pacemaker
on one of the other nodes syncs it to all of the nodes.

Regards,
Klaus

On 1/7/21 9:54 AM, renayama19661...@ybb.ne.jp wrote:
> Hi Steffen,
>
>> Unfortunately not sure about the exact scenario. But I have been doing
>> some recent experiments with node standby/unstandby stop/start. This
>> to get procedures right for updating node rpms etc.
>>  
>> Later I noticed the uncomforting "pending fencing actions" status msg.
> Okay!
>
> I will repeat the standby and unstandby steps in the same way to check.
> We will start checking after tomorrow, so I think it will take until
> next week.
>
>
> Many thanks,
> Hideo Yamauchi.
>
>
>
> - Original Message -
>> From: "renayama19661...@ybb.ne.jp" 
>> To: Reid Wahl ; Cluster Labs - All topics related to 
>> open-source clustering welcomed 
>> Cc: 
>> Date: 2021/1/7, Thu 17:51
>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>
>> Hi Steffen,
>> Hi Reid,
>>
>> The fencing history is kept inside stonith-ng and is not written to the CIB.
>> However, obtaining the entire CIB and sending it will help to reproduce
>> the problem.
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>> - Original Message -
>>> From: Reid Wahl 
>>> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
>> open-source clustering welcomed  
>>> Date: 2021/1/7, Thu 17:39
>>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>>
>>>
>>> Hi, Steffen. Those attachments don't contain the CIB. They contain the
>>> `pcs config` output. You can get the CIB with
>>> `pcs cluster cib > $(hostname).cib.xml`.
>>>
>>> Granted, it's possible that this fence action information wouldn't
>>> be in the CIB at all. It might be stored in fencer memory.
>>>
>>> On Thu, Jan 7, 2021 at 12:26 AM  wrote:
>>>
>>> Hi Steffen,
>>>>>  Here CIB settings attached (pcs config show) for all 3 of my nodes
>>>>>  (all 3 seems 100% identical), node03 is the DC.
>>>>
>>>> Thank you for the attachment.
>>>>
>>>> What is the scenario when this situation occurs?
>>>> In what steps did the problem appear when fencing was performed (or 
>> failed)?
>>>>
>>>> Best Regards,
>>>> Hideo Yamauchi.
>>>>
>>>>
>>>> - Original Message -
>>>>>  From: Steffen Vinther Sørensen 
>>>>>  To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related 
>> to open-source clustering welcomed 
>>>>>  Cc: 
>>>>>  Date: 2021/1/7, Thu 17:05
>>>>>  Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs 
>> status
>>>>>  Hi Hideo,
>>>>>
>>>>>  Here CIB settings attached (pcs config show) for all 3 of my nodes
>>>>>  (all 3 seems 100% identical), node03 is the DC.
>>>>>
>>>>>  Regards
>>>>>  Steffen
>>>>>
>>>>>  On Thu, Jan 7, 2021 at 8:06 AM  
>> wrote:
>>>>>>   Hi Steffen,
>>>>>>   Hi Reid,
>>>>>>
>>>>>>   I also checked the Centos source rpm and it seems to include a 
>> fix for the 
>>>>>  problem.
>>>>>>   As Steffen suggested, if you share your CIB settings, I might 
>> know 
>>>>>  something.
>>>>>>   If this issue is the same as the fix, the display will only be 
>> displayed on 
>>>>>  the DC node and will not affect the operation.
>>>>>>   The pending actions shown will remain for a long time, but 
>> will not have a 
>>>>>  negative impact on the cluster.
>>>>>>   Best Regards,
>>>>>>   Hideo Yamauchi.
>>>>>>
>>>>>>
>>>>>>   - Original Message -
>>>>>>   > From: Reid Wahl 
>>>>>>   > To: Cluster Labs - All topics related to open-source 
>> clustering 
>>>>>  welcomed 
>>>>>>   > Cc:
>>>>>>   > Date: 2021/1/7, Thu 15:58
>>>>>>   > Subject: Re: [ClusterLabs] Pending Fencing Actions shown 
>> in pcs status
>>>>>>   >
>>>>>>

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread renayama19661014
Hi Steffen,

> Unfortunately not sure about the exact scenario. But I have been doing
> some recent experiments with node standby/unstandby stop/start. This
> to get procedures right for updating node rpms etc.
> 
> Later I noticed the uncomforting "pending fencing actions" status msg.

Okay!

We will repeat the standby and unstandby steps in the same way to check.
We will start checking after tomorrow, so I think it will take until
next week.
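
A minimal sketch of how such a standby/unstandby cycle could be scripted
for a reproduction attempt. The `run` and `cycle_standby` helpers are
hypothetical names, the `pcs node standby` / `pcs node unstandby`
subcommands are assumed to exist in the installed pcs version (older
releases use `pcs cluster standby`), and by default the commands are only
printed, not executed:

```shell
# DRY_RUN=1 (the default) only prints each command.
# Set DRY_RUN=0 to really execute them on a cluster node.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Put one node into standby, bring it back, then show the status so a
# "Pending Fencing Actions" section can be spotted.
cycle_standby() {
    node=$1
    run pcs node standby "$node"
    run pcs node unstandby "$node"
    run pcs status
}
```

In a real attempt one would probably wait for resources to settle between
the steps; the sketch leaves that out.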


Many thanks,
Hideo Yamauchi.



- Original Message -
> From: "renayama19661...@ybb.ne.jp" 
> To: Reid Wahl ; Cluster Labs - All topics related to 
> open-source clustering welcomed 
> Cc: 
> Date: 2021/1/7, Thu 17:51
> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
> Hi Steffen,
> Hi Reid,
> 
> The fencing history is kept inside stonith-ng and is not written to cib.
> However, getting the entire cib and getting it sent will help you to 
> reproduce 
> the problem.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> - Original Message -
>> From: Reid Wahl 
>> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
> open-source clustering welcomed  
>> Date: 2021/1/7, Thu 17:39
>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>> 
>> 
>> Hi, Steffen. Those attachments don't contain the CIB. They contain the
>> `pcs config` output. You can get the cib with
>> `pcs cluster cib > $(hostname).cib.xml`.
>> 
>> 
>> Granted, it's possible that this fence action information wouldn't
>> be in the CIB at all. It might be stored in fencer memory.
>> 
>> 
>> On Thu, Jan 7, 2021 at 12:26 AM  wrote:
>> 
>> Hi Steffen,
>>> 
>>>>  Here CIB settings attached (pcs config show) for all 3 of my nodes
>>>>  (all 3 seems 100% identical), node03 is the DC.
>>> 
>>> 
>>> Thank you for the attachment.
>>> 
>>> What is the scenario when this situation occurs?
>>> In what steps did the problem appear when fencing was performed (or 
> failed)?
>>> 
>>> 
>>> Best Regards,
>>> Hideo Yamauchi.
>>> 
>>> 
>>> - Original Message -
>>>>  From: Steffen Vinther Sørensen 
>>>>  To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related 
> to open-source clustering welcomed 
>>>>  Cc: 
>>>>  Date: 2021/1/7, Thu 17:05
>>>>  Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs 
> status
>>>> 
>>>>  Hi Hideo,
>>>> 
>>>>  Here CIB settings attached (pcs config show) for all 3 of my nodes
>>>>  (all 3 seems 100% identical), node03 is the DC.
>>>> 
>>>>  Regards
>>>>  Steffen
>>>> 
>>>>  On Thu, Jan 7, 2021 at 8:06 AM  
> wrote:
>>>>> 
>>>>>   Hi Steffen,
>>>>>   Hi Reid,
>>>>> 
>>>>>   I also checked the Centos source rpm and it seems to include a 
> fix for the 
>>>>  problem.
>>>>> 
>>>>>   As Steffen suggested, if you share your CIB settings, I might 
> know 
>>>>  something.
>>>>> 
>>>>>   If this issue is the same as the fix, the display will only be 
> displayed on 
>>>>  the DC node and will not affect the operation.
>>>>>   The pending actions shown will remain for a long time, but 
> will not have a 
>>>>  negative impact on the cluster.
>>>>> 
>>>>>   Best Regards,
>>>>>   Hideo Yamauchi.
>>>>> 
>>>>> 
>>>>>   - Original Message -
>>>>>   > From: Reid Wahl 
>>>>>   > To: Cluster Labs - All topics related to open-source 
> clustering 
>>>>  welcomed 
>>>>>   > Cc:
>>>>>   > Date: 2021/1/7, Thu 15:58
>>>>>   > Subject: Re: [ClusterLabs] Pending Fencing Actions shown 
> in pcs status
>>>>>   >
>>>>>   > It's supposedly fixed in that version.
>>>>>   >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749 
>>>>>   >   - https://access.redhat.com/solutions/4713471 
>>>>>   >
>>>>>   > So you may be hitting a different issue (unless 
> there's a bug in 
>>>>  the
>>>>>   > pcmk 1.1 backport of the fix).
>>>>>   >
>>>>>   > I may be a little bit out of my area of knowledge here, 

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread renayama19661014
Hi Steffen,
Hi Reid,

The fencing history is kept inside stonith-ng and is not written to the CIB.
However, obtaining the entire CIB and sending it to us will help to reproduce
the problem.
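
Because the history lives in the fencer rather than in the CIB, it has to
be looked at on each node separately. A minimal sketch, assuming the
status line format quoted in this thread; `pending_fence_lines` is a
hypothetical helper, and other Pacemaker versions may print the entry
differently:

```shell
# Hypothetical helper: keep only pending fencing entries from status
# text on stdin. The pattern follows the "pcs status" line quoted in
# this thread.
pending_fence_lines() {
    grep -E '^[[:space:]]*\* .* pending: ' || true
}

# Intended use on each cluster node (not run here):
#   pcs status | pending_fence_lines
#   stonith_admin --history '*'    # ask the local fencer for its history
```

Comparing the output from all three nodes shows which fencer still holds
the pending entry.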

Best Regards,
Hideo Yamauchi.


- Original Message -
>From: Reid Wahl 
>To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
>open-source clustering welcomed  
>Date: 2021/1/7, Thu 17:39
>Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
>
>Hi, Steffen. Those attachments don't contain the CIB. They contain the
>`pcs config` output. You can get the cib with
>`pcs cluster cib > $(hostname).cib.xml`.
>
>
>Granted, it's possible that this fence action information wouldn't be in the 
>CIB at all. It might be stored in fencer memory.
>
>
>On Thu, Jan 7, 2021 at 12:26 AM  wrote:
>
>Hi Steffen,
>>
>>> Here CIB settings attached (pcs config show) for all 3 of my nodes
>>> (all 3 seems 100% identical), node03 is the DC.
>>
>>
>>Thank you for the attachment.
>>
>>What is the scenario when this situation occurs?
>>In what steps did the problem appear when fencing was performed (or failed)?
>>
>>
>>Best Regards,
>>Hideo Yamauchi.
>>
>>
>>- Original Message -
>>> From: Steffen Vinther Sørensen 
>>> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
>>> open-source clustering welcomed 
>>> Cc: 
>>> Date: 2021/1/7, Thu 17:05
>>> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>> 
>>> Hi Hideo,
>>> 
>>> Here CIB settings attached (pcs config show) for all 3 of my nodes
>>> (all 3 seems 100% identical), node03 is the DC.
>>> 
>>> Regards
>>> Steffen
>>> 
>>> On Thu, Jan 7, 2021 at 8:06 AM  wrote:
>>>> 
>>>>  Hi Steffen,
>>>>  Hi Reid,
>>>> 
>>>>  I also checked the Centos source rpm and it seems to include a fix for
>>>>  the problem.
>>>> 
>>>>  As Steffen suggested, if you share your CIB settings, I might know
>>>>  something.
>>>> 
>>>>  If this issue is the same as the fix, the display will only be displayed
>>>>  on the DC node and will not affect the operation.
>>>>  The pending actions shown will remain for a long time, but will not have
>>>>  a negative impact on the cluster.
>>>> 
>>>>  Best Regards,
>>>>  Hideo Yamauchi.
>>>> 
>>>> 
>>>>  - Original Message -
>>>>  > From: Reid Wahl 
>>>>  > To: Cluster Labs - All topics related to open-source clustering 
>>> welcomed 
>>>>  > Cc:
>>>>  > Date: 2021/1/7, Thu 15:58
>>>>  > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>>>  >
>>>>  > It's supposedly fixed in that version.
>>>>  >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749
>>>>  >   - https://access.redhat.com/solutions/4713471
>>>>  >
>>>>  > So you may be hitting a different issue (unless there's a bug in 
>>> the
>>>>  > pcmk 1.1 backport of the fix).
>>>>  >
>>>>  > I may be a little bit out of my area of knowledge here, but can you
>>>>  > share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
>>>>  > insight.
>>>>  >
>>>>  > On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
>>>>  >  wrote:
>>>>  >>
>>>>  >>  Hi Hideo,
>>>>  >>
>>>>  >>  If the fix is not going to make it into the CentOS7 pacemaker 
>>> version,
>>>>  >>  I guess the stable approach to take advantage of it is to build 
>>> the
>>>>  >>  cluster on another OS than CentOS7 ? A little late for that in 
>>> this
>>>>  >>  case though :)
>>>>  >>
>>>>  >>  Regards
>>>>  >>  Steffen
>>>>  >>
>>>>  >>
>>>>  >>
>>>>  >>
>>>>  >>  On Thu, Jan 7, 2021 at 7:27 AM  
>>> wrote:
>>>>  >>  >
>>>>  >>  > Hi Steffen,
>>>>  >>  >
>>>>  >>  > The fix pointed out by Reid is affecting it.
>>>>  >>  >
>>>>  >

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread Reid Wahl
Hi, Steffen. Those attachments don't contain the CIB. They contain the
`pcs config` output. You can get the CIB with
`pcs cluster cib > $(hostname).cib.xml`.

Granted, it's possible that this fence action information wouldn't be in
the CIB at all. It might be stored in fencer memory.
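
A minimal sketch of collecting the CIB on a node with the command above.
`cib_file_name` and `save_cib` are hypothetical helper names; only
`pcs cluster cib` and `hostname` come from the original suggestion:

```shell
# Hypothetical helper: build the output file name for a given host.
cib_file_name() {
    printf '%s.cib.xml\n' "$1"
}

# Dump the live CIB of the local node into <hostname>.cib.xml.
# `pcs cluster cib` prints the raw CIB XML to stdout.
save_cib() {
    out=$(cib_file_name "$(hostname)")
    pcs cluster cib > "$out" && printf 'CIB written to %s\n' "$out"
}

# Intended use on each cluster node (not run here):
#   save_cib
```

Running `save_cib` on every node gives one file per node that can be
attached to a reply.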

On Thu, Jan 7, 2021 at 12:26 AM  wrote:

> Hi Steffen,
>
> > Here CIB settings attached (pcs config show) for all 3 of my nodes
> > (all 3 seems 100% identical), node03 is the DC.
>
>
> Thank you for the attachment.
>
> What is the scenario when this situation occurs?
> In what steps did the problem appear when fencing was performed (or
> failed)?
>
>
> Best Regards,
> Hideo Yamauchi.
>
>
> - Original Message -
> > From: Steffen Vinther Sørensen 
> > To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to
> open-source clustering welcomed 
> > Cc:
> > Date: 2021/1/7, Thu 17:05
> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >
> > Hi Hideo,
> >
> > Here CIB settings attached (pcs config show) for all 3 of my nodes
> > (all 3 seems 100% identical), node03 is the DC.
> >
> > Regards
> > Steffen
> >
> > On Thu, Jan 7, 2021 at 8:06 AM  wrote:
> >>
> >>  Hi Steffen,
> >>  Hi Reid,
> >>
> >>  I also checked the Centos source rpm and it seems to include a fix for
> the
> > problem.
> >>
> >>  As Steffen suggested, if you share your CIB settings, I might know
> > something.
> >>
> >>  If this issue is the same as the fix, the display will only be
> displayed on
> > the DC node and will not affect the operation.
> >>  The pending actions shown will remain for a long time, but will not
> have a
> > negative impact on the cluster.
> >>
> >>  Best Regards,
> >>  Hideo Yamauchi.
> >>
> >>
> >>  - Original Message -
> >>  > From: Reid Wahl 
> >>  > To: Cluster Labs - All topics related to open-source clustering
> > welcomed 
> >>  > Cc:
> >>  > Date: 2021/1/7, Thu 15:58
> >>  > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs
> status
> >>  >
> >>  > It's supposedly fixed in that version.
> >>  >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749
> >>  >   - https://access.redhat.com/solutions/4713471
> >>  >
> >>  > So you may be hitting a different issue (unless there's a bug in
> > the
> >>  > pcmk 1.1 backport of the fix).
> >>  >
> >>  > I may be a little bit out of my area of knowledge here, but can you
> >>  > share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has
> some
> >>  > insight.
> >>  >
> >>  > On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
> >>  >  wrote:
> >>  >>
> >>  >>  Hi Hideo,
> >>  >>
> >>  >>  If the fix is not going to make it into the CentOS7 pacemaker
> > version,
> >>  >>  I guess the stable approach to take advantage of it is to build
> > the
> >>  >>  cluster on another OS than CentOS7 ? A little late for that in
> > this
> >>  >>  case though :)
> >>  >>
> >>  >>  Regards
> >>  >>  Steffen
> >>  >>
> >>  >>
> >>  >>
> >>  >>
> >>  >>  On Thu, Jan 7, 2021 at 7:27 AM 
> > wrote:
> >>  >>  >
> >>  >>  > Hi Steffen,
> >>  >>  >
> >>  >>  > The fix pointed out by Reid is affecting it.
> >>  >>  >
> >>  >>  > Since the fencing action requested by the DC node exists
> > only in the
> >>  > DC node, such an event occurs.
> >>  >>  > You will need to take advantage of the modified pacemaker to
> > resolve
> >>  > the issue.
> >>  >>  >
> >>  >>  > Best Regards,
> >>  >>  > Hideo Yamauchi.
> >>  >>  >
> >>  >>  >
> >>  >>  >
> >>  >>  > - Original Message -
> >>  >>  > > From: Reid Wahl 
> >>  >>  > > To: Cluster Labs - All topics related to open-source
> > clustering
> >>  > welcomed 
> >>  >>  > > Cc:
> >>  >>  > > Date: 2021/1/7, Thu 15:07
> >>  >>  > > Su

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread renayama19661014
Hi Steffen,

> Here CIB settings attached (pcs config show) for all 3 of my nodes
> (all 3 seems 100% identical), node03 is the DC.


Thank you for the attachment.

In what scenario does this situation occur?
During which steps did the problem appear, and was fencing performed (or did
it fail) at that point?


Best Regards,
Hideo Yamauchi.


- Original Message -
> From: Steffen Vinther Sørensen 
> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
> open-source clustering welcomed 
> Cc: 
> Date: 2021/1/7, Thu 17:05
> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
> Hi Hideo,
> 
> Here CIB settings attached (pcs config show) for all 3 of my nodes
> (all 3 seems 100% identical), node03 is the DC.
> 
> Regards
> Steffen
> 
> On Thu, Jan 7, 2021 at 8:06 AM  wrote:
>> 
>>  Hi Steffen,
>>  Hi Reid,
>> 
>>  I also checked the Centos source rpm and it seems to include a fix for the 
> problem.
>> 
>>  As Steffen suggested, if you share your CIB settings, I might know 
> something.
>> 
>>  If this issue is the same as the fix, the display will only be displayed on 
> the DC node and will not affect the operation.
>>  The pending actions shown will remain for a long time, but will not have a 
> negative impact on the cluster.
>> 
>>  Best Regards,
>>  Hideo Yamauchi.
>> 
>> 
>>  - Original Message -
>>  > From: Reid Wahl 
>>  > To: Cluster Labs - All topics related to open-source clustering 
> welcomed 
>>  > Cc:
>>  > Date: 2021/1/7, Thu 15:58
>>  > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>  >
>>  > It's supposedly fixed in that version.
>>  >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749 
>>  >   - https://access.redhat.com/solutions/4713471 
>>  >
>>  > So you may be hitting a different issue (unless there's a bug in 
> the
>>  > pcmk 1.1 backport of the fix).
>>  >
>>  > I may be a little bit out of my area of knowledge here, but can you
>>  > share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
>>  > insight.
>>  >
>>  > On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
>>  >  wrote:
>>  >>
>>  >>  Hi Hideo,
>>  >>
>>  >>  If the fix is not going to make it into the CentOS7 pacemaker 
> version,
>>  >>  I guess the stable approach to take advantage of it is to build 
> the
>>  >>  cluster on another OS than CentOS7 ? A little late for that in 
> this
>>  >>  case though :)
>>  >>
>>  >>  Regards
>>  >>  Steffen
>>  >>
>>  >>
>>  >>
>>  >>
>>  >>  On Thu, Jan 7, 2021 at 7:27 AM  
> wrote:
>>  >>  >
>>  >>  > Hi Steffen,
>>  >>  >
>>  >>  > The fix pointed out by Reid is affecting it.
>>  >>  >
>>  >>  > Since the fencing action requested by the DC node exists 
> only in the
>>  > DC node, such an event occurs.
>>  >>  > You will need to take advantage of the modified pacemaker to 
> resolve
>>  > the issue.
>>  >>  >
>>  >>  > Best Regards,
>>  >>  > Hideo Yamauchi.
>>  >>  >
>>  >>  >
>>  >>  >
>>  >>  > - Original Message -
>>  >>  > > From: Reid Wahl 
>>  >>  > > To: Cluster Labs - All topics related to open-source 
> clustering
>>  > welcomed 
>>  >>  > > Cc:
>>  >>  > > Date: 2021/1/7, Thu 15:07
>>  >>  > > Subject: Re: [ClusterLabs] Pending Fencing Actions 
> shown in pcs
>>  > status
>>  >>  > >
>>  >>  > > Hi, Steffen. Are your cluster nodes all running the 
> same
>>  > Pacemaker
>>  >>  > > versions? This looks like Bug 5401[1], which is fixed 
> by upstream
>>  >>  > > commit df71a07[2]. I'm a little bit confused about 
> why it
>>  > only shows
>>  >>  > > up on one out of three nodes though.
>>  >>  > >
>>  >>  > > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401 
>>  >>  > > [2] 
> https://github.com/ClusterLabs/pacemaker/commit/df71a07 
>>  >>  > >
>>  >>  > > On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
>>  >>  > >  wrote:
>>  >>  > >>
>>  >

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-07 Thread Steffen Vinther Sørensen
Hi Hideo,

Attached are the CIB settings (pcs config show) for all 3 of my nodes
(all 3 seem 100% identical); node03 is the DC.

Regards
Steffen

On Thu, Jan 7, 2021 at 8:06 AM  wrote:
>
> Hi Steffen,
> Hi Reid,
>
> I also checked the Centos source rpm and it seems to include a fix for the 
> problem.
>
> As Steffen suggested, if you share your CIB settings, I might know something.
>
> If this issue is the same as the fix, the display will only be displayed on 
> the DC node and will not affect the operation.
> The pending actions shown will remain for a long time, but will not have a 
> negative impact on the cluster.
>
> Best Regards,
> Hideo Yamauchi.
>
>
> - Original Message -
> > From: Reid Wahl 
> > To: Cluster Labs - All topics related to open-source clustering welcomed 
> > 
> > Cc:
> > Date: 2021/1/7, Thu 15:58
> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >
> > It's supposedly fixed in that version.
> >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749
> >   - https://access.redhat.com/solutions/4713471
> >
> > So you may be hitting a different issue (unless there's a bug in the
> > pcmk 1.1 backport of the fix).
> >
> > I may be a little bit out of my area of knowledge here, but can you
> > share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
> > insight.
> >
> > On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
> >  wrote:
> >>
> >>  Hi Hideo,
> >>
> >>  If the fix is not going to make it into the CentOS7 pacemaker version,
> >>  I guess the stable approach to take advantage of it is to build the
> >>  cluster on another OS than CentOS7 ? A little late for that in this
> >>  case though :)
> >>
> >>  Regards
> >>  Steffen
> >>
> >>
> >>
> >>
> >>  On Thu, Jan 7, 2021 at 7:27 AM  wrote:
> >>  >
> >>  > Hi Steffen,
> >>  >
> >>  > The fix pointed out by Reid is affecting it.
> >>  >
> >>  > Since the fencing action requested by the DC node exists only in the
> > DC node, such an event occurs.
> >>  > You will need to take advantage of the modified pacemaker to resolve
> > the issue.
> >>  >
> >>  > Best Regards,
> >>  > Hideo Yamauchi.
> >>  >
> >>  >
> >>  >
> >>  > - Original Message -
> >>  > > From: Reid Wahl 
> >>  > > To: Cluster Labs - All topics related to open-source clustering
> > welcomed 
> >>  > > Cc:
> >>  > > Date: 2021/1/7, Thu 15:07
> >>  > > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs
> > status
> >>  > >
> >>  > > Hi, Steffen. Are your cluster nodes all running the same
> > Pacemaker
> >>  > > versions? This looks like Bug 5401[1], which is fixed by upstream
> >>  > > commit df71a07[2]. I'm a little bit confused about why it
> > only shows
> >>  > > up on one out of three nodes though.
> >>  > >
> >>  > > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401
> >>  > > [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07
> >>  > >
> >>  > > On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
> >>  > >  wrote:
> >>  > >>
> >>  > >>  Hello
> >>  > >>
> >>  > >>  node 1 is showing this in 'pcs status'
> >>  > >>
> >>  > >>  Pending Fencing Actions:
> >>  > >>  * reboot of kvm03-node02.avigol-gcs.dk pending:
> > client=crmd.37819,
> >>  > >>  origin=kvm03-node03.avigol-gcs.dk
> >>  > >>
> >>  > >>  node 2 and node 3 outputs no such thing (node 3 is DC)
> >>  > >>
> >>  > >>  Google is not much help, how to investigate this further and
> > get rid
> >>  > >>  of such terrifying status message ?
> >>  > >>
> >>  > >>  Regards
> >>  > >>  Steffen
> >>  > >>  ___
> >>  > >>  Manage your subscription:
> >>  > >>  https://lists.clusterlabs.org/mailman/listinfo/users
> >>  > >>
> >>  > >>  ClusterLabs home: https://www.clusterlabs.org/
> >>  > >>
> >>  > >
> >>  > >
>

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread renayama19661014
Hi Reid,
Hi Steffen,



> According to Steffen's description, the "pending" is displayed 
> only on
> node 1, while the DC is node 3. That's another thing that makes me
> wonder if this is a distinct issue.


The problem may not be the same.
I think it would be a good idea to provide a crm_report etc. via Bugzilla or
the ML so that the problem can be investigated.
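
A minimal sketch of gathering such a report. `report_dest` and
`collect_report` are hypothetical helper names, the destination under
/tmp is an assumption, and the start time should be set to shortly before
the pending action was first noticed:

```shell
# Hypothetical helper: destination name for the report archive.
report_dest() {
    printf '/tmp/pending-fencing-%s\n' "$1"
}

# $1: start of the time window, quoted, e.g. "2021-01-05 00:00:00"
# $2: short label used in the destination name
collect_report() {
    crm_report --from "$1" "$(report_dest "$2")"
}

# Intended use on a cluster node (not run here):
#   collect_report "2021-01-05 00:00:00" node01
```

The resulting archive can then be attached to a Bugzilla entry or sent to
the list.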


Best Regards,
Hideo Yamauchi.


- Original Message -
> From: Reid Wahl 
> To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
> open-source clustering welcomed 
> Cc: 
> Date: 2021/1/7, Thu 16:16
> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
> On Wed, Jan 6, 2021 at 11:07 PM  wrote:
>> 
>>  Hi Steffen,
>>  Hi Reid,
>> 
>>  I also checked the Centos source rpm and it seems to include a fix for the 
> problem.
>> 
>>  As Steffen suggested, if you share your CIB settings, I might know 
> something.
>> 
>>  If this issue is the same as the fix, the display will only be displayed on 
> the DC node and will not affect the operation.
> 
> According to Steffen's description, the "pending" is displayed 
> only on
> node 1, while the DC is node 3. That's another thing that makes me
> wonder if this is a distinct issue.
> 
>>  The pending actions shown will remain for a long time, but will not have a 
> negative impact on the cluster.
>> 
>>  Best Regards,
>>  Hideo Yamauchi.
>> 
>> 
>>  - Original Message -
>>  > From: Reid Wahl 
>>  > To: Cluster Labs - All topics related to open-source clustering 
> welcomed 
>>  > Cc:
>>  > Date: 2021/1/7, Thu 15:58
>>  > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
>>  >
>>  > It's supposedly fixed in that version.
>>  >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749 
>>  >   - https://access.redhat.com/solutions/4713471 
>>  >
>>  > So you may be hitting a different issue (unless there's a bug in 
> the
>>  > pcmk 1.1 backport of the fix).
>>  >
>>  > I may be a little bit out of my area of knowledge here, but can you
>>  > share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
>>  > insight.
>>  >
>>  > On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
>>  >  wrote:
>>  >>
>>  >>  Hi Hideo,
>>  >>
>>  >>  If the fix is not going to make it into the CentOS7 pacemaker 
> version,
>>  >>  I guess the stable approach to take advantage of it is to build 
> the
>>  >>  cluster on another OS than CentOS7 ? A little late for that in 
> this
>>  >>  case though :)
>>  >>
>>  >>  Regards
>>  >>  Steffen
>>  >>
>>  >>
>>  >>
>>  >>
>>  >>  On Thu, Jan 7, 2021 at 7:27 AM  
> wrote:
>>  >>  >
>>  >>  > Hi Steffen,
>>  >>  >
>>  >>  > The fix pointed out by Reid is affecting it.
>>  >>  >
>>  >>  > Since the fencing action requested by the DC node exists 
> only in the
>>  > DC node, such an event occurs.
>>  >>  > You will need to take advantage of the modified pacemaker to 
> resolve
>>  > the issue.
>>  >>  >
>>  >>  > Best Regards,
>>  >>  > Hideo Yamauchi.
>>  >>  >
>>  >>  >
>>  >>  >
>>  >>  > - Original Message -
>>  >>  > > From: Reid Wahl 
>>  >>  > > To: Cluster Labs - All topics related to open-source 
> clustering
>>  > welcomed 
>>  >>  > > Cc:
>>  >>  > > Date: 2021/1/7, Thu 15:07
>>  >>  > > Subject: Re: [ClusterLabs] Pending Fencing Actions 
> shown in pcs
>>  > status
>>  >>  > >
>>  >>  > > Hi, Steffen. Are your cluster nodes all running the 
> same
>>  > Pacemaker
>>  >>  > > versions? This looks like Bug 5401[1], which is fixed 
> by upstream
>>  >>  > > commit df71a07[2]. I'm a little bit confused about 
> why it
>>  > only shows
>>  >>  > > up on one out of three nodes though.
>>  >>  > >
>>  >>  > > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401 
>>  >>  > > [2] 
> https://github.com/ClusterLabs/pacemaker/commit/df71a07 
>>  >>  > >
>>  >>  > > On Tue, Jan 5, 2021 at 8

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread Reid Wahl
On Wed, Jan 6, 2021 at 11:07 PM  wrote:
>
> Hi Steffen,
> Hi Reid,
>
> I also checked the Centos source rpm and it seems to include a fix for the 
> problem.
>
> As Steffen suggested, if you share your CIB settings, I might know something.
>
> If this issue is the same as the fix, the display will only be displayed on 
> the DC node and will not affect the operation.

According to Steffen's description, the "pending" is displayed only on
node 1, while the DC is node 3. That's another thing that makes me
wonder if this is a distinct issue.

> The pending actions shown will remain for a long time, but will not have a 
> negative impact on the cluster.
>
> Best Regards,
> Hideo Yamauchi.
>
>
> - Original Message -
> > From: Reid Wahl 
> > To: Cluster Labs - All topics related to open-source clustering welcomed 
> > 
> > Cc:
> > Date: 2021/1/7, Thu 15:58
> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >
> > It's supposedly fixed in that version.
> >   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749
> >   - https://access.redhat.com/solutions/4713471
> >
> > So you may be hitting a different issue (unless there's a bug in the
> > pcmk 1.1 backport of the fix).
> >
> > I may be a little bit out of my area of knowledge here, but can you
> > share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
> > insight.
> >
> > On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
> >  wrote:
> >>
> >>  Hi Hideo,
> >>
> >>  If the fix is not going to make it into the CentOS7 pacemaker version,
> >>  I guess the stable approach to take advantage of it is to build the
> >>  cluster on another OS than CentOS7 ? A little late for that in this
> >>  case though :)
> >>
> >>  Regards
> >>  Steffen
> >>
> >>
> >>
> >>
> >>  On Thu, Jan 7, 2021 at 7:27 AM  wrote:
> >>  >
> >>  > Hi Steffen,
> >>  >
> >>  > The fix pointed out by Reid is affecting it.
> >>  >
> >>  > Since the fencing action requested by the DC node exists only in the
> > DC node, such an event occurs.
> >>  > You will need to take advantage of the modified pacemaker to resolve
> > the issue.
> >>  >
> >>  > Best Regards,
> >>  > Hideo Yamauchi.
> >>  >
> >>  >
> >>  >
> >>  > - Original Message -
> >>  > > From: Reid Wahl 
> >>  > > To: Cluster Labs - All topics related to open-source clustering
> > welcomed 
> >>  > > Cc:
> >>  > > Date: 2021/1/7, Thu 15:07
> >>  > > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs
> > status
> >>  > >
> >>  > > Hi, Steffen. Are your cluster nodes all running the same
> > Pacemaker
> >>  > > versions? This looks like Bug 5401[1], which is fixed by upstream
> >>  > > commit df71a07[2]. I'm a little bit confused about why it
> > only shows
> >>  > > up on one out of three nodes though.
> >>  > >
> >>  > > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401
> >>  > > [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07
> >>  > >
> >>  > > On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
> >>  > >  wrote:
> >>  > >>
> >>  > >>  Hello
> >>  > >>
> >>  > >>  node 1 is showing this in 'pcs status'
> >>  > >>
> >>  > >>  Pending Fencing Actions:
> >>  > >>  * reboot of kvm03-node02.avigol-gcs.dk pending:
> > client=crmd.37819,
> >>  > >>  origin=kvm03-node03.avigol-gcs.dk
> >>  > >>
> >>  > >>  node 2 and node 3 outputs no such thing (node 3 is DC)
> >>  > >>
> >>  > >>  Google is not much help, how to investigate this further and
> > get rid
> >>  > >>  of such terrifying status message ?
> >>  > >>
> >>  > >>  Regards
> >>  > >>  Steffen
> >>  > >>
> >

Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread renayama19661014
Hi Steffen,
Hi Reid,

I also checked the CentOS source rpm, and it seems to include the fix for the
problem.

As Steffen suggested, if you share your CIB settings, I might be able to tell
something.

If this issue is the same one the fix addresses, the pending action is
displayed only on the DC node and does not affect operation.
The pending actions shown will remain for a long time, but will not have a
negative impact on the cluster.

Best Regards,
Hideo Yamauchi.


- Original Message -
> From: Reid Wahl 
> To: Cluster Labs - All topics related to open-source clustering welcomed 
> 
> Cc: 
> Date: 2021/1/7, Thu 15:58
> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
> It's supposedly fixed in that version.
>   - https://bugzilla.redhat.com/show_bug.cgi?id=1787749 
>   - https://access.redhat.com/solutions/4713471 
> 
> So you may be hitting a different issue (unless there's a bug in the
> pcmk 1.1 backport of the fix).
> 
> I may be a little bit out of my area of knowledge here, but can you
> share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
> insight.
> 
> On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
>  wrote:
>> 
>>  Hi Hideo,
>> 
>>  If the fix is not going to make it into the CentOS7 pacemaker version,
>>  I guess the stable approach to take advantage of it is to build the
>>  cluster on another OS than CentOS7 ? A little late for that in this
>>  case though :)
>> 
>>  Regards
>>  Steffen
>> 
>> 
>> 
>> 
>>  On Thu, Jan 7, 2021 at 7:27 AM  wrote:
>>  >
>>  > Hi Steffen,
>>  >
>>  > The fix pointed out by Reid is affecting it.
>>  >
>>  > Since the fencing action requested by the DC node exists only in the 
> DC node, such an event occurs.
>>  > You will need to take advantage of the modified pacemaker to resolve 
> the issue.
>>  >
>>  > Best Regards,
>>  > Hideo Yamauchi.
>>  >
>>  >
>>  >
>>  > - Original Message -
>>  > > From: Reid Wahl 
>>  > > To: Cluster Labs - All topics related to open-source clustering 
> welcomed 
>>  > > Cc:
>>  > > Date: 2021/1/7, Thu 15:07
>>  > > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs 
> status
>>  > >
>>  > > Hi, Steffen. Are your cluster nodes all running the same 
> Pacemaker
>>  > > versions? This looks like Bug 5401[1], which is fixed by upstream
>>  > > commit df71a07[2]. I'm a little bit confused about why it 
> only shows
>>  > > up on one out of three nodes though.
>>  > >
>>  > > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401 
>>  > > [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07 
>>  > >
>>  > > On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
>>  > >  wrote:
>>  > >>
>>  > >>  Hello
>>  > >>
>>  > >>  node 1 is showing this in 'pcs status'
>>  > >>
>>  > >>  Pending Fencing Actions:
>>  > >>  * reboot of kvm03-node02.avigol-gcs.dk pending: 
> client=crmd.37819,
>>  > >>  origin=kvm03-node03.avigol-gcs.dk
>>  > >>
>>  > >>  node 2 and node 3 outputs no such thing (node 3 is DC)
>>  > >>
>>  > >>  Google is not much help, how to investigate this further and 
> get rid
>>  > >>  of such terrifying status message ?
>>  > >>
>>  > >>  Regards
>>  > >>  Steffen
>>  > >
>>  > >
>>  > > --
>>  > > Regards,
>>  > >
>>  > > Reid Wahl, RHCA
>>  > > Senior Software Maintenance Engineer, Red Hat
>>  > > CEE - Platform Support Delivery - ClusterHA
>>  > >
>>  >
> 
> 
> 
> -- 
> Regards,
> 
> Reid Wahl, RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
> 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread Reid Wahl
It's supposedly fixed in that version.
  - https://bugzilla.redhat.com/show_bug.cgi?id=1787749
  - https://access.redhat.com/solutions/4713471

So you may be hitting a different issue (unless there's a bug in the
pcmk 1.1 backport of the fix).

I may be a little bit out of my area of knowledge here, but can you
share the CIBs from nodes 1 and 3? Maybe Hideo, Klaus, or Ken has some
insight.
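
A minimal sketch of how the CIBs could be collected and compared, assuming each node's CIB is first saved with `cibadmin --query`. The helper name `compare_cibs` and the file paths are hypothetical and only serve as illustration.

```shell
#!/bin/sh
# On each node, save the CIB to a file (the path is a placeholder):
#   cibadmin --query > /tmp/cib-$(hostname).xml
# Then copy the files to one machine and compare them.

# Hypothetical helper: report whether two saved CIB dumps differ.
compare_cibs() {
    if diff -u "$1" "$2" >/dev/null; then
        echo "identical"
    else
        echo "different"
    fi
}

# Example (file names are placeholders):
#   compare_cibs /tmp/cib-node01.xml /tmp/cib-node03.xml
```

If the result is "different", running `diff -u` on the two files directly shows which sections diverge.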

On Wed, Jan 6, 2021 at 10:53 PM Steffen Vinther Sørensen
 wrote:
>
> Hi Hideo,
>
> If the fix is not going to make it into the CentOS7 pacemaker version,
> I guess the stable approach to take advantage of it is to build the
> cluster on another OS than CentOS7 ? A little late for that in this
> case though :)
>
> Regards
> Steffen
>
>
>
>
> On Thu, Jan 7, 2021 at 7:27 AM  wrote:
> >
> > Hi Steffen,
> >
> > The fix pointed out by Reid is affecting it.
> >
> > Since the fencing action requested by the DC node exists only in the DC 
> > node, such an event occurs.
> > You will need to take advantage of the modified pacemaker to resolve the 
> > issue.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> >
> > - Original Message -
> > > From: Reid Wahl 
> > > To: Cluster Labs - All topics related to open-source clustering welcomed 
> > > 
> > > Cc:
> > > Date: 2021/1/7, Thu 15:07
> > > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> > >
> > > Hi, Steffen. Are your cluster nodes all running the same Pacemaker
> > > versions? This looks like Bug 5401[1], which is fixed by upstream
> > > commit df71a07[2]. I'm a little bit confused about why it only shows
> > > up on one out of three nodes though.
> > >
> > > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401
> > > [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07
> > >
> > > On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
> > >  wrote:
> > >>
> > >>  Hello
> > >>
> > >>  node 1 is showing this in 'pcs status'
> > >>
> > >>  Pending Fencing Actions:
> > >>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
> > >>  origin=kvm03-node03.avigol-gcs.dk
> > >>
> > >>  node 2 and node 3 outputs no such thing (node 3 is DC)
> > >>
> > >>  Google is not much help, how to investigate this further and get rid
> > >>  of such terrifying status message ?
> > >>
> > >>  Regards
> > >>  Steffen
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Reid Wahl, RHCA
> > > Senior Software Maintenance Engineer, Red Hat
> > > CEE - Platform Support Delivery - ClusterHA
> > >
> >



-- 
Regards,

Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA



Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread Steffen Vinther Sørensen
Hi Hideo,

If the fix is not going to make it into the CentOS 7 Pacemaker version,
I guess the stable way to take advantage of it is to build the
cluster on an OS other than CentOS 7? A little late for that in this
case though :)

Regards
Steffen




On Thu, Jan 7, 2021 at 7:27 AM  wrote:
>
> Hi Steffen,
>
> The fix pointed out by Reid is affecting it.
>
> Since the fencing action requested by the DC node exists only in the DC node, 
> such an event occurs.
> You will need to take advantage of the modified pacemaker to resolve the 
> issue.
>
> Best Regards,
> Hideo Yamauchi.
>
>
>
> - Original Message -
> > From: Reid Wahl 
> > To: Cluster Labs - All topics related to open-source clustering welcomed 
> > 
> > Cc:
> > Date: 2021/1/7, Thu 15:07
> > Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> >
> > Hi, Steffen. Are your cluster nodes all running the same Pacemaker
> > versions? This looks like Bug 5401[1], which is fixed by upstream
> > commit df71a07[2]. I'm a little bit confused about why it only shows
> > up on one out of three nodes though.
> >
> > [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401
> > [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07
> >
> > On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
> >  wrote:
> >>
> >>  Hello
> >>
> >>  node 1 is showing this in 'pcs status'
> >>
> >>  Pending Fencing Actions:
> >>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
> >>  origin=kvm03-node03.avigol-gcs.dk
> >>
> >>  node 2 and node 3 outputs no such thing (node 3 is DC)
> >>
> >>  Google is not much help, how to investigate this further and get rid
> >>  of such terrifying status message ?
> >>
> >>  Regards
> >>  Steffen
> >
> >
> > --
> > Regards,
> >
> > Reid Wahl, RHCA
> > Senior Software Maintenance Engineer, Red Hat
> > CEE - Platform Support Delivery - ClusterHA
> >
>


Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread Steffen Vinther Sørensen
Hi Reid,

The Pacemaker version is the same on all 3 nodes (CentOS 7, most recent update):

# pacemakerd --version
Pacemaker 1.1.23-1.el7_9.1
Written by Andrew Beekhof

# rpm -qa | grep pacemaker
pacemaker-cluster-libs-1.1.23-1.el7_9.1.x86_64
pacemaker-cli-1.1.23-1.el7_9.1.x86_64
pacemaker-1.1.23-1.el7_9.1.x86_64
pacemaker-libs-1.1.23-1.el7_9.1.x86_64
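
A minimal sketch of how the same check could be automated across nodes. It assumes the versions have been gathered into "node version" pairs; the function name `same_version`, the node names, and the use of ssh are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "node version" pairs on stdin and report
# whether every node runs the same Pacemaker version.
# Exit status: 0 = all equal, 1 = mismatch found, 2 = no input.
same_version() {
    awk '
        NR == 1     { first = $2 }
        $2 != first { bad = 1; printf "MISMATCH: %s has %s, expected %s\n", $1, $2, first }
        END {
            if (NR == 0) { print "no input"; exit 2 }
            if (!bad)    { print "OK: all nodes run " first }
            exit (bad ? 1 : 0)
        }
    '
}

# Example collection loop (node names are placeholders, ssh access assumed):
#   for n in node01 node02 node03; do
#       printf '%s %s\n' "$n" "$(ssh "$n" rpm -q --qf '%{VERSION}-%{RELEASE}' pacemaker)"
#   done | same_version
```

The first node listed is used as the reference, so a mismatch is always reported relative to that node.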

Maybe it was only fixed in Pacemaker 2.x?



Also, the output of 'pcs status' and 'pcs status --full' differs.

Output of pcs status on node01:

Current DC: kvm03-node03.avigol-gcs.dk
...
Pending Fencing Actions:
* reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
origin=kvm03-node03.avigol-gcs.dk

Output of pcs status --full on node01:

Current DC: kvm03-node03.avigol-gcs.dk
...
Fencing History:
* reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
origin=kvm03-node03.avigol-gcs.dk
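
A minimal sketch for comparing what each node reports, assuming the entries keep the "* ACTION of NODE pending: ..." form shown above. The function name `pending_fencing` is hypothetical; it only filters text and does not query the cluster itself.

```shell
#!/bin/sh
# Hypothetical helper: read 'pcs status' output on stdin and print one
# "ACTION TARGET" line per pending fencing entry, for example:
#   * reboot of NODE pending: client=..., origin=...
# becomes "reboot NODE". Lines that do not match are ignored.
pending_fencing() {
    sed -n 's/^[[:space:]]*\* \([a-z]*\) of \([^[:space:]]*\) pending:.*/\1 \2/p'
}

# Example, run on each node and compare the results:
#   pcs status | pending_fencing
#   pcs status --full | pending_fencing
```

An empty result on a node means that node's status output lists no pending fencing entries.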



No changes after doing:

stonith_admin --cleanup --history '*'

Regards
Steffen


On Thu, Jan 7, 2021 at 7:06 AM Reid Wahl  wrote:
>
> Hi, Steffen. Are your cluster nodes all running the same Pacemaker
> versions? This looks like Bug 5401[1], which is fixed by upstream
> commit df71a07[2]. I'm a little bit confused about why it only shows
> up on one out of three nodes though.
>
> [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401
> [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07
>
> On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
>  wrote:
> >
> > Hello
> >
> > node 1 is showing this in 'pcs status'
> >
> > Pending Fencing Actions:
> > * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
> > origin=kvm03-node03.avigol-gcs.dk
> >
> > node 2 and node 3 outputs no such thing (node 3 is DC)
> >
> > Google is not much help, how to investigate this further and get rid
> > of such terrifying status message ?
> >
> > Regards
> > Steffen
>
>
> --
> Regards,
>
> Reid Wahl, RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
>


Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread renayama19661014
Hi Steffen,

The problem addressed by the fix Reid pointed out is what you are seeing.

It occurs because the fencing action requested by the DC node exists only on
the DC node.
You will need a Pacemaker build that includes the fix to resolve the issue.

Best Regards,
Hideo Yamauchi.



- Original Message -
> From: Reid Wahl 
> To: Cluster Labs - All topics related to open-source clustering welcomed 
> 
> Cc: 
> Date: 2021/1/7, Thu 15:07
> Subject: Re: [ClusterLabs] Pending Fencing Actions shown in pcs status
> 
> Hi, Steffen. Are your cluster nodes all running the same Pacemaker
> versions? This looks like Bug 5401[1], which is fixed by upstream
> commit df71a07[2]. I'm a little bit confused about why it only shows
> up on one out of three nodes though.
> 
> [1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401 
> [2] https://github.com/ClusterLabs/pacemaker/commit/df71a07 
> 
> On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
>  wrote:
>> 
>>  Hello
>> 
>>  node 1 is showing this in 'pcs status'
>> 
>>  Pending Fencing Actions:
>>  * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
>>  origin=kvm03-node03.avigol-gcs.dk
>> 
>>  node 2 and node 3 outputs no such thing (node 3 is DC)
>> 
>>  Google is not much help, how to investigate this further and get rid
>>  of such terrifying status message ?
>> 
>>  Regards
>>  Steffen
> 
> 
> -- 
> Regards,
> 
> Reid Wahl, RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
> 



Re: [ClusterLabs] Pending Fencing Actions shown in pcs status

2021-01-06 Thread Reid Wahl
Hi, Steffen. Are your cluster nodes all running the same Pacemaker
versions? This looks like Bug 5401[1], which is fixed by upstream
commit df71a07[2]. I'm a little bit confused about why it only shows
up on one out of three nodes though.

[1] https://bugs.clusterlabs.org/show_bug.cgi?id=5401
[2] https://github.com/ClusterLabs/pacemaker/commit/df71a07

On Tue, Jan 5, 2021 at 8:31 AM Steffen Vinther Sørensen
 wrote:
>
> Hello
>
> node 1 is showing this in 'pcs status'
>
> Pending Fencing Actions:
> * reboot of kvm03-node02.avigol-gcs.dk pending: client=crmd.37819,
> origin=kvm03-node03.avigol-gcs.dk
>
> node 2 and node 3 outputs no such thing (node 3 is DC)
>
> Google is not much help, how to investigate this further and get rid
> of such terrifying status message ?
>
> Regards
> Steffen


-- 
Regards,

Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA
