On 02/01/2017 09:15 AM, Scott Greenlese wrote:
> Hi all...
>
> Just a quick follow-up.
>
> Thought I should come clean and share with you that the incorrect
> "migrate-to" operation name defined in my VirtualDomain
> resource was my mistake. It was mis-coded in the virtual guest
> provisioning script. I have since changed it to "migrate_to"
> and of course, the specified live migration timeout value is working
> effectively now. (For some reason, I assumed we were letting that
> operation meta value default).
>
> I was wondering if someone could refer me to the definitive online link
> for pacemaker resource man pages? I don't see any resource man pages
> installed on my system anywhere. I found this one online:
> https://www.mankier.com/7/ocf_heartbeat_VirtualDomain but is there a
> more 'official' page I should refer our
> Linux KVM on System z customers to?

All distributions that I know of include the man pages with the packages
they distribute. Are you building from source? They are named like
"man ocf_heartbeat_IPaddr2".

FYI after following this thread, the pcs developers are making a change
so that pcs refuses to add an unrecognized operation unless the user
uses --force. Thanks for being involved in the community; this is how we
learn to improve!

> Thanks again for your assistance.
>
> Scott Greenlese ...IBM KVM on System Z Solution Test Poughkeepsie, N.Y.
> INTERNET: swgre...@us.ibm.com
>
>
> From: "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de>
> To: <users@clusterlabs.org>, Scott Greenlese/Poughkeepsie/IBM@IBMUS
> Cc: "Si Bo Niu" <nius...@cn.ibm.com>, Michael Tebolt/Poughkeepsie/IBM@IBMUS
> Date: 01/27/2017 02:32 AM
> Subject: Antw: Re: [ClusterLabs] Antw: Re: Live Guest Migration timeouts
> for VirtualDomain resources
>
> ------------------------------------------------------------------------
>
>
>>>> "Scott Greenlese" <swgre...@us.ibm.com> wrote on 27.01.2017 at
> 02:47 in message
> <of63cd0e10.d58c4c3d-on002580b5.0005c410-852580b5.0009d...@notes.na.collabserv.c
> m>:
>
>> Hi guys..
>>
>> Well, today I confirmed that what Ulrich said is correct. If I update the
>> VirtualDomain resource with the operation name "migrate_to" instead of
>> "migrate-to", it effectively overrides and enforces the 1200ms default
>> value to the new value.
>>
>> I am wondering how I would have known that I was using the wrong operation
>> name, when the initial operation name is already incorrect
>> when the resource is created?
>
> For SLES 11, I made a quick (portable non-portable unstable) try (print
> the operations known to an RA):
> # crm ra info VirtualDomain |sed -n -e "/Operations' defaults/,\$p"
> Operations' defaults (advisory minimum):
>
>     start         timeout=90
>     stop          timeout=90
>     status        timeout=30 interval=10
>     monitor       timeout=30 interval=10
>     migrate_from  timeout=60
>     migrate_to    timeout=120
>
> Regards,
> Ulrich
>
>>
>> This is what the meta data for my resource looked like after making the
>> update:
>>
>> [root@zs95kj VD]# date;pcs resource update zs95kjg110065_res op migrate_to
>> timeout="360s"
>> Thu Jan 26 16:43:11 EST 2017
>> You have new mail in /var/spool/mail/root
>>
>> [root@zs95kj VD]# date;pcs resource show zs95kjg110065_res
>> Thu Jan 26 16:43:46 EST 2017
>> Resource: zs95kjg110065_res (class=ocf provider=heartbeat
>> type=VirtualDomain)
>> Attributes: config=/guestxml/nfs1/zs95kjg110065.xml
>> hypervisor=qemu:///system migration_transport=ssh
>> Meta Attrs: allow-migrate=true
>> Operations: start interval=0s timeout=120
>> (zs95kjg110065_res-start-interval-0s)
>> stop interval=0s timeout=120
>> (zs95kjg110065_res-stop-interval-0s)
>> monitor interval=30s
>> (zs95kjg110065_res-monitor-interval-30s)
>> migrate-from interval=0s timeout=1200
>> (zs95kjg110065_res-migrate-from-interval-0s)
>> migrate-to interval=0s timeout=1200
>> (zs95kjg110065_res-migrate-to-interval-0s) <<< Original op name / value
>> migrate_to interval=0s timeout=360s
>> (zs95kjg110065_res-migrate_to-interval-0s) <<< New op name / value
>>
>>
>> Where does that original op name come from in the VirtualDomain resource
>> definition? How can we get the initial meta value changed and shipped with
>> a valid operation name (i.e. migrate_to), and
>> maybe a more reasonable migrate_to timeout value... something significantly
>> higher than 1200ms, i.e. 1.2 seconds?
>> Can I report this request as a
>> bugzilla on the RHEL side, or should this go to my internal IBM bugzilla
>> for KVM on System Z development?
>>
>> Anyway, thanks so much for identifying my issue. I can reconfigure my
>> resources to make them tolerate longer migration execution times.
>>
>>
>> Scott Greenlese ... IBM KVM on System Z Solution Test
>> INTERNET: swgre...@us.ibm.com
>>
>>
>>
>>
>> From: Ken Gaillot <kgail...@redhat.com>
>> To: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>,
>> users@clusterlabs.org
>> Date: 01/19/2017 10:26 AM
>> Subject: Re: [ClusterLabs] Antw: Re: Live Guest Migration timeouts for
>> VirtualDomain resources
>>
>>
>>
>> On 01/19/2017 01:36 AM, Ulrich Windl wrote:
>>>>>> Ken Gaillot <kgail...@redhat.com> wrote on 18.01.2017 at 16:32 in message
>>> <4b02d3fa-4693-473b-8bed-dc98f9e3f...@redhat.com>:
>>>> On 01/17/2017 04:45 PM, Scott Greenlese wrote:
>>>>> Ken and Co,
>>>>>
>>>>> Thanks for the useful information.
>>>>>
>>>
>>> [...]
>>>>>
>>>>> Is this internally coded within the class=ocf provider=heartbeat
>>>>> type=VirtualDomain resource agent?
>>>>
>>>> Aha, I just realized what the issue is: the operation name is
>>>> migrate_to, not migrate-to.
>>>>
>>>> For technical reasons, pacemaker can't validate operation names (at the
>>>> time that the configuration is edited, it does not necessarily have
>>>> access to the agent metadata).
>>>
>>> BUT the set of operations is finite, right? So if those were in some XML
>>> schema, the names could be verified at least (not meaning that the
>>> operation is actually supported).
>>> BTW: Would a "crm configure verify" detect this kind of problem?
>>>
>>> [...]
>>>
>>> Ulrich
>>
>> Yes, it's in the resource agent meta-data. While pacemaker itself uses a
>> small set of well-defined actions, the agent may define any arbitrarily
>> named actions it desires, and the user could configure one of these as a
>> recurring action in pacemaker.
>>
>> Pacemaker itself has to be liberal about where its configuration comes
>> from -- the configuration can be edited on a separate machine, which
>> doesn't have resource agents, and then uploaded to the cluster. So
>> Pacemaker can't do that validation at configuration time. (It could
>> theoretically do some checking after the fact when the configuration is
>> loaded, but this could be a lot of overhead, and there are
>> implementation issues at the moment.)
>>
>> Higher-level tools like crmsh and pcs, on the other hand, can make
>> simplifying assumptions. They can require access to the resource agents
>> so that they can do extra validation.
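
The kind of check discussed above, where a higher-level tool compares the
configured operation names against the actions listed in the agent
meta-data, might look roughly like the sketch below. It is only an
illustration in Python and makes no claim about how pcs or crmsh actually
implement it. The function names and the trimmed SAMPLE_METADATA string are
hypothetical; the action names and timeouts are copied from the
"crm ra info VirtualDomain" output quoted earlier in the thread. On a real
system the XML would come from the agent's meta-data action.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed OCF meta-data for illustration only.
# Action names and timeouts are taken from the "crm ra info" output
# quoted in this thread; a real agent prints far more than this.
SAMPLE_METADATA = """<resource-agent name="VirtualDomain">
  <actions>
    <action name="start" timeout="90"/>
    <action name="stop" timeout="90"/>
    <action name="status" timeout="30" interval="10"/>
    <action name="monitor" timeout="30" interval="10"/>
    <action name="migrate_from" timeout="60"/>
    <action name="migrate_to" timeout="120"/>
  </actions>
</resource-agent>
"""


def advertised_actions(metadata_xml):
    """Return the set of action names listed in the agent meta-data."""
    root = ET.fromstring(metadata_xml)
    return {action.get("name") for action in root.findall("./actions/action")}


def check_operations(metadata_xml, configured_ops):
    """Map each unrecognized operation name to a list of close matches.

    Operations that the agent advertises are not included in the result.
    """
    known = advertised_actions(metadata_xml)
    problems = {}
    for op in configured_ops:
        if op not in known:
            # Suggest the most similar advertised name, if any.
            problems[op] = difflib.get_close_matches(op, sorted(known), n=1)
    return problems


if __name__ == "__main__":
    # Operation names as they appear in the "pcs resource show" output.
    configured = ["start", "stop", "monitor",
                  "migrate-from", "migrate-to", "migrate_to"]
    for op, hints in check_operations(SAMPLE_METADATA, configured).items():
        hint = f" (did you mean {hints[0]!r}?)" if hints else ""
        print(f"unrecognized operation {op!r}{hint}")
```

Run against the operation names from the resource shown in this thread, the
sketch flags "migrate-from" and "migrate-to" and leaves "migrate_to" alone.
Because agents may define arbitrarily named actions, a check like this can
only warn about names the agent does not advertise, not decide which names
are valid in general.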