Hi all... Just a quick follow-up.
Thought I should come clean and share with you that the incorrect "migrate-to" operation name defined in my VirtualDomain resource was my mistake. It was mis-coded in the virtual guest provisioning script. I have since changed it to "migrate_to", and of course the specified live migration timeout value is working effectively now. (For some reason, I assumed we were letting that operation meta value default.)

I was wondering if someone could refer me to the definitive online link for pacemaker resource man pages? I don't see any resource man pages installed on my system anywhere. I found this one online:

https://www.mankier.com/7/ocf_heartbeat_VirtualDomain

but is there a more 'official' page I should refer our Linux KVM on System z customers to?

Thanks again for your assistance.

Scott Greenlese ... IBM KVM on System Z Solution Test
Poughkeepsie, N.Y.
INTERNET: swgre...@us.ibm.com


From: "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de>
To: <users@clusterlabs.org>, Scott Greenlese/Poughkeepsie/IBM@IBMUS
Cc: "Si Bo Niu" <nius...@cn.ibm.com>, Michael Tebolt/Poughkeepsie/IBM@IBMUS
Date: 01/27/2017 02:32 AM
Subject: Antw: Re: [ClusterLabs] Antw: Re: Live Guest Migration timeouts for VirtualDomain resources


>>> "Scott Greenlese" <swgre...@us.ibm.com> wrote on 27.01.2017 at 02:47 in message
<of63cd0e10.d58c4c3d-on002580b5.0005c410-852580b5.0009d...@notes.na.collabserv.com>:
> Hi guys..
>
> Well, today I confirmed that what Ulrich said is correct. If I update the
> VirtualDomain resource with the operation name "migrate_to" instead of
> "migrate-to", it effectively overrides and enforces the 1200ms default
> value to the new value.
>
> I am wondering how I would have known that I was using the wrong operation
> name, when the initial operation name is already incorrect
> when the resource is created?

For SLES 11, I made a quick (non-portable, unstable) attempt to print the operations known to an RA:

# crm ra info VirtualDomain |sed -n -e "/Operations' defaults/,\$p"
Operations' defaults (advisory minimum):

    start         timeout=90
    stop          timeout=90
    status        timeout=30 interval=10
    monitor       timeout=30 interval=10
    migrate_from  timeout=60
    migrate_to    timeout=120

Regards,
Ulrich

>
> This is what the meta data for my resource looked like after making the
> update:
>
> [root@zs95kj VD]# date;pcs resource update zs95kjg110065_res op migrate_to timeout="360s"
> Thu Jan 26 16:43:11 EST 2017
> You have new mail in /var/spool/mail/root
>
> [root@zs95kj VD]# date;pcs resource show zs95kjg110065_res
> Thu Jan 26 16:43:46 EST 2017
> Resource: zs95kjg110065_res (class=ocf provider=heartbeat type=VirtualDomain)
>  Attributes: config=/guestxml/nfs1/zs95kjg110065.xml hypervisor=qemu:///system migration_transport=ssh
>  Meta Attrs: allow-migrate=true
>  Operations: start interval=0s timeout=120 (zs95kjg110065_res-start-interval-0s)
>              stop interval=0s timeout=120 (zs95kjg110065_res-stop-interval-0s)
>              monitor interval=30s (zs95kjg110065_res-monitor-interval-30s)
>              migrate-from interval=0s timeout=1200 (zs95kjg110065_res-migrate-from-interval-0s)
>              migrate-to interval=0s timeout=1200 (zs95kjg110065_res-migrate-to-interval-0s)   <<< Original op name / value
>              migrate_to interval=0s timeout=360s (zs95kjg110065_res-migrate_to-interval-0s)   <<< New op name / value
>
>
> Where does that original op name come from in the VirtualDomain resource
> definition? How can we get the initial meta value changed and shipped with
> a valid operation name (i.e. migrate_to), and
> maybe a more reasonable migrate_to timeout value... something significantly
> higher than 1200ms, i.e. 1.2 seconds? Can I report this request as a
> bugzilla on the RHEL side, or should this go to my internal IBM bugzilla
> for KVM on System Z development?
>
> Anyway, thanks so much for identifying my issue.
> I can reconfigure my
> resources to make them tolerate longer migration execution times.
>
>
> Scott Greenlese ... IBM KVM on System Z Solution Test
> INTERNET: swgre...@us.ibm.com
>
>
>
>
> From: Ken Gaillot <kgail...@redhat.com>
> To: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>,
> users@clusterlabs.org
> Date: 01/19/2017 10:26 AM
> Subject: Re: [ClusterLabs] Antw: Re: Live Guest Migration timeouts for
> VirtualDomain resources
>
>
>
> On 01/19/2017 01:36 AM, Ulrich Windl wrote:
>>>>> Ken Gaillot <kgail...@redhat.com> wrote on 18.01.2017 at 16:32 in message
>> <4b02d3fa-4693-473b-8bed-dc98f9e3f...@redhat.com>:
>>> On 01/17/2017 04:45 PM, Scott Greenlese wrote:
>>>> Ken and Co,
>>>>
>>>> Thanks for the useful information.
>>>>
>>
>> [...]
>>>>
>>>> Is this internally coded within the class=ocf provider=heartbeat
>>>> type=VirtualDomain resource agent?
>>>
>>> Aha, I just realized what the issue is: the operation name is
>>> migrate_to, not migrate-to.
>>>
>>> For technical reasons, pacemaker can't validate operation names (at the
>>> time that the configuration is edited, it does not necessarily have
>>> access to the agent metadata).
>>
>> BUT the set of operations is finite, right? So if those were in some XML
>> schema, the names could be verified at least (not meaning that the
>> operation is actually supported).
>> BTW: Would a "crm configure verify" detect this kind of problem?
>>
>> [...]
>>
>> Ulrich
>
> Yes, it's in the resource agent meta-data. While pacemaker itself uses a
> small set of well-defined actions, the agent may define any arbitrarily
> named actions it desires, and the user could configure one of these as a
> recurring action in pacemaker.
>
> Pacemaker itself has to be liberal about where its configuration comes
> from -- the configuration can be edited on a separate machine, which
> doesn't have resource agents, and then uploaded to the cluster. So
> Pacemaker can't do that validation at configuration time.
> (It could
> theoretically do some checking after the fact when the configuration is
> loaded, but this could be a lot of overhead, and there are
> implementation issues at the moment.)
>
> Higher-level tools like crmsh and pcs, on the other hand, can make
> simplifying assumptions. They can require access to the resource agents
> so that they can do extra validation.
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
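
For anyone wanting to catch this kind of typo before it reaches the cluster, the check Ken describes for higher-level tools (compare the configured operation names against what the agent advertises in its meta-data) could be sketched roughly as below. This is only a minimal Python sketch under stated assumptions, not how crmsh or pcs actually implement validation: SAMPLE_METADATA is a hand-written, trimmed stand-in for the agent's OCF meta-data, with action names and timeouts copied from the "crm ra info VirtualDomain" listing earlier in this thread, and the function names advertised_actions and check_op_names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: hand-written, trimmed stand-in for the agent's OCF meta-data.
# A real agent prints its full meta-data XML when invoked with the
# "meta-data" action; the values here mirror the crm ra info listing.
SAMPLE_METADATA = """<resource-agent name="VirtualDomain">
  <actions>
    <action name="start" timeout="90"/>
    <action name="stop" timeout="90"/>
    <action name="status" timeout="30" interval="10"/>
    <action name="monitor" timeout="30" interval="10"/>
    <action name="migrate_from" timeout="60"/>
    <action name="migrate_to" timeout="120"/>
  </actions>
</resource-agent>
"""


def advertised_actions(metadata_xml):
    """Return the set of action names listed in the meta-data XML."""
    root = ET.fromstring(metadata_xml)
    return {a.get("name") for a in root.iter("action") if a.get("name")}


def check_op_names(configured_ops, metadata_xml):
    """Map each unknown op name to a suggested spelling, or None.

    The only suggestion attempted is swapping hyphens for underscores,
    which is the mistake discussed in this thread.
    """
    known = advertised_actions(metadata_xml)
    problems = {}
    for op in configured_ops:
        if op in known:
            continue
        candidate = op.replace("-", "_")
        problems[op] = candidate if candidate in known else None
    return problems


if __name__ == "__main__":
    # Operation names as they appeared in the original resource definition.
    ops = ["start", "stop", "monitor", "migrate-from", "migrate-to"]
    for name, suggestion in check_op_names(ops, SAMPLE_METADATA).items():
        hint = f" (did you mean '{suggestion}'?)" if suggestion else ""
        print(f"unknown operation '{name}'{hint}")
```

Run against the operation names from the original resource definition, this flags migrate-from and migrate-to and suggests the underscore spellings. Because an agent may define arbitrarily named actions, a name that is missing from the meta-data is better treated as a warning than as a hard error.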