[ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-04 Thread Klechomir
Hi list, My issue is the following: I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8 (observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3). Bumped into this issue when I started playing with VirtualDomain resources, but it seems to be unrelated to the RA.
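
For reference, target-role is a per-resource meta attribute; the commands below (hypothetical resource name VM_VM1, crm_resource/crm shell syntax) are one way to inspect and clear it when tracking down what keeps setting it, a sketch rather than anything taken from the thread itself:

  # Show and clear the target-role meta attribute on a hypothetical resource
  crm_resource --resource VM_VM1 --get-parameter target-role --meta
  crm_resource --resource VM_VM1 --delete-parameter target-role --meta
  # "crm configure show VM_VM1" would also reveal it as meta target-role="Stopped"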

Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-07 Thread Klechomir
Hi Ken, My comments are inline in the text. On 4.12.2015 19:06, Ken Gaillot wrote: On 12/04/2015 10:22 AM, Klechomir wrote: Hi list, My issue is the following: I have a very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8 (observed the same problem with Corosync 2.3.5 & Pacemaker 1.

Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-07 Thread Klechomir
using a failover resource Filesystem_CDrive1, because Pacemaker monitors the resource on both nodes to check whether it is running on multiple nodes. 2015-12-04 18:06 GMT+01:00 Ken Gaillot <kgail...@redhat.com>: On 12/04/2015 10:22 AM, Klechomir wrote: Hi list, My issue is the following: I hav

Re: [ClusterLabs] Early VM resource migration

2015-12-17 Thread Klechomir
On 16.12.2015 11:08:35 Ken Gaillot wrote: > On 12/16/2015 10:30 AM, Klechomir wrote: > > On 16.12.2015 17:52, Ken Gaillot wrote: > >> On 12/16/2015 02:09 AM, Klechomir wrote: > >>> Hi list, > >>> I have a cluster with VM resources on a cloned active-

Re: [ClusterLabs] Antw: Re: Early VM resource migration

2015-12-17 Thread Klechomir
Hi Ulrich, This is only a part of the config, which concerns the problem. Even with dummy resources, the behaviour will be identical, so I don't think that the dlm/clvmd resource config will help solve the problem. Regards, Klecho On 17.12.2015 08:19:43 Ulrich Windl wrote: > >>> Kl

Re: [ClusterLabs] Antw: Re: Antw: Re: Early VM resource migration

2015-12-17 Thread Klechomir
VM_VM1 gets stopped immediately as soon as node1 reappears and stays down until its "order/colocation AA resource" comes up on node1. The curious part is that in the opposite case (node2 coming back from standby) the failback is OK. Regards, On 17.12.2015 14:51:21 Ulrich Windl wrote: >
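
The kind of constraints being described might look like the sketch below (crm shell syntax, placeholder resource names, not taken from the thread's actual CIB); the symmetry note is the detail relevant to the behaviour above:

  # VM_VM1 must run where AA_res runs, and only after AA_res has started
  colocation col_VM1_with_AA inf: VM_VM1 AA_res
  order ord_AA_before_VM1 inf: AA_res VM_VM1
  # Order constraints are symmetrical by default, so the reverse stop order is
  # also enforced; when AA_res is rescheduled onto a returning node, that can
  # pull VM_VM1 down until the dependency is up again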

Re: [ClusterLabs] Antw: Re: Antw: Re: Early VM resource migration

2016-01-08 Thread Klechomir
indl wrote: >>> Klechomir <kle...@gmail.com> wrote on 17.12.2015 at 14:16 in message <2102747.TPh6pTdk8c@bobo>: > Hi Ulrich, > This is only a part of the config, which concerns the problem. > Even with dummy resources, the behaviour will be identical, so don't thin

[ClusterLabs] Sudden stop of pacemaker functions

2016-02-17 Thread Klechomir
Hi List, Having a strange issue lately. I have a two-node cluster with some cloned resources on it. One of my nodes suddenly starts reporting all its resources down (some of them are actually running), stops logging and remains in this state forever, while still responding to crm commands.

Re: [ClusterLabs] Sudden stop of pacemaker functions

2016-02-17 Thread Klechomir
/pacemaker gave me a very annoying delay when committing configuration changes (maybe it's a known problem?). Best regards, Klecho On 17.02.2016 14:59, Jan Pokorný wrote: On 17/02/16 14:10 +0200, Klechomir wrote: Having a strange issue lately. I have a two-node cluster with some cloned resources

Re: [ClusterLabs] Stonith ignores resource stop errors

2016-03-10 Thread Klechomir
Thanks, That was it! On 10.03.2016 17:08, Ken Gaillot wrote: On 03/10/2016 04:42 AM, Klechomir wrote: Hi List, I'm testing stonith now (pacemaker 1.1.8), and noticed that it properly kills a node with stopped pacemaker, but ignores resource stop errors. I'm pretty sure that the same version
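
The snippet doesn't show which setting turned out to be the culprit, so the following is only an assumption about the knobs usually involved when a failed stop should escalate to fencing (crm shell syntax, placeholder resource):

  # Fencing must be enabled cluster-wide for stop failures to escalate
  property stonith-enabled=true
  # With stonith enabled, on-fail for a stop operation already defaults to
  # "fence"; with stonith disabled it falls back to "block"
  primitive p_example ocf:heartbeat:Dummy \
    op monitor interval="30s" \
    op stop interval="0" timeout="60s" on-fail="fence"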

Re: [ClusterLabs] Antw: Re: kvm live migration, resource moving

2016-04-27 Thread Klechomir
Hi List, I have two major problems related to VirtualDomain live migration and failover in general. I'm using Pacemaker 1.1.8, which is very stable and used to do everything right for me. 1. Live migration during failover (node standby or shutdown) is OK, but during failback, the VMs
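
For context, a live-migratable VirtualDomain resource is typically configured along these lines (crm shell syntax; names, paths and timeouts are placeholders, not from the poster's config). allow-migrate=true plus the migrate_to/migrate_from operations are what let Pacemaker attempt live migration instead of a stop/start cycle:

  primitive VM_test ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/test.xml" hypervisor="qemu:///system" \
      migration_transport="ssh" \
    meta allow-migrate="true" \
    op migrate_to interval="0" timeout="300s" \
    op migrate_from interval="0" timeout="300s" \
    op monitor interval="30s" timeout="60s"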

Re: [ClusterLabs] Instant service restart during failback

2017-04-20 Thread Klechomir
Hi Klaus, It would have been too easy if it were interleave. All my cloned resources have interleave=true, of course. What bothers me more is that the behaviour is asymmetrical. Regards, Klecho On 20.4.2017 10:43:29 Klaus Wenninger wrote: > On 04/20/2017 10:30 AM, Klechomir wrote: > >
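
For readers following along, the meta attribute under discussion looks like this in crm shell syntax (dummy resource, placeholder names, just a sketch):

  primitive p_dummy ocf:pacemaker:Dummy \
    op monitor interval="30s"
  clone cl_dummy p_dummy \
    meta interleave="true"
  # With interleave=true a dependent clone instance only waits for the copy of
  # the other clone on its own node, not for every instance cluster-wide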

[ClusterLabs] Instant service restart during failback

2017-04-20 Thread Klechomir
Hi List, Been investigating the following problem recently: Have a two-node cluster with 4 cloned (2 on top of 2) + 1 master/slave services on it (corosync+pacemaker 1.1.15). The failover works properly for both nodes, i.e. when one node is restarted/put in standby, the other properly takes

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Klechomir
Hi Ken, Are there any plans for "lost monitoring request" handling (an added syntax) in 2.0? Regards, Klecho On 18.9.2017 12:48:38 Ken Gaillot wrote: > As discussed at the recent ClusterLabs Summit, I plan to start the > release cycle for Pacemaker 1.1.18 soon. > > There will be the usual

Re: [ClusterLabs] Antw: Re: Antw: Is there a way to ignore a single monitoring timeout

2017-09-01 Thread Klechomir
On 1 Sep 2017, at 13:15, Klechomir <kle...@gmail.com> wrote: What I observe is that a single monitoring request for different resources with different resource agents times out. For example the LVM resource (the LVM RA) does this sometimes. Setting ridic

[ClusterLabs] Is there a way to ignore a single monitoring timeout

2017-08-31 Thread Klechomir
Hi List, I've been fighting one problem with many different kinds of resources. While the resource is OK, without apparent reason, once in a while there is no response to a monitoring request, which leads to a monitoring timeout, a resource restart, etc. Is there any way to ignore one
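
As the rest of the thread suggests, there is no "tolerate a single missed monitor" option in this Pacemaker series, so the usual partial mitigations are larger monitor timeouts plus the failure-handling meta attributes. A sketch in crm shell syntax (placeholder resource and values; the LVM example mirrors the RA mentioned later in the thread):

  primitive p_lvm ocf:heartbeat:LVM \
    params volgrpname="vg_data" \
    meta migration-threshold="3" failure-timeout="600s" \
    op monitor interval="30s" timeout="120s"
  # A monitor timeout still triggers the default on-fail=restart, but
  # migration-threshold keeps one failure from banning the resource off the
  # node, and failure-timeout expires old failures so single glitches don't
  # accumulate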

Re: [ClusterLabs] Is there a way to ignore a single monitoring timeout

2017-08-31 Thread Klechomir
Will be looking forward to this feature, it will save a lot of trouble. Regards, Klecho On 31.8.2017 10:31:21 Ken Gaillot wrote: > On Thu, 2017-08-31 at 18:18 +0300, Klechomir wrote: > > Hi List, > > I've been fighting with one problem with many different kinds of > > reso

Re: [ClusterLabs] Antw: Is there a way to ignore a single monitoring timeout

2017-09-01 Thread Klechomir
Hi Ulrich, I have to disagree here. I have cases where, for an unknown reason, a single monitoring request never returns a result, so having bigger timeouts doesn't resolve this problem. Best regards, Klecho On 1.09.2017 09:11, Ulrich Windl wrote: Klechomir <kle...@gmail.com> schr

Re: [ClusterLabs] Antw: Re: Antw: Is there a way to ignore a single monitoring timeout

2017-09-01 Thread Klechomir
there. Same for other I/O-related resources/RAs. Regards, Klecho One of the typical cases is LVM (LVM RA) monitoring. On 1.09.2017 11:07, Jehan-Guillaume de Rorthais wrote: On Fri, 01 Sep 2017 09:07:16 +0200 "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de> wrote: Klechomir &

Re: [ClusterLabs] Antw: VirtualDomain & parallel shutdown

2018-11-20 Thread Klechomir
Hi Ulrich, The stop timeout needs to be quite big for an obvious reason called "System is updating, do not turn it off". The main question here is why this slow shutdown would prevent the other VMs from being shut down at all. Regards, On 20.11.2018 11:54:58 Ulrich Windl wrote: > >>&
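
The long stop timeout itself should only bound how long Pacemaker waits for that one stop operation. A sketch of what is meant (crm shell syntax, placeholder names, illustrative value):

  primitive VM_guest ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/guest.xml" \
    op stop interval="0" timeout="600s" \
    op monitor interval="30s" timeout="60s"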

[ClusterLabs] VirtualDomain & parallel shutdown

2018-11-20 Thread Klechomir
Hi list, Bumped onto the following issue lately: when multiple VMs are told to shut down one after another and the shutdown of the first VM takes long, the others aren't shut down at all until the first one stops. "batch-limit" doesn't seem to affect this. Any suggestions why this
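
batch-limit is a cluster-wide property; as a sketch (crm shell syntax, values purely illustrative), it and the related migration-limit property are the usual knobs for how many actions run in parallel:

  # Cap on how many actions the transition engine executes in parallel
  # (the default differs between Pacemaker releases)
  property batch-limit=10
  # Cap on parallel live migrations per node (-1 means unlimited)
  property migration-limit=2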