Re: [Pacemaker] crm resource restart fails to restart the service
Hi,

On Thu, Nov 18, 2010 at 01:35:24PM -0500, Vadym Chepkov wrote:
> On Wed, Nov 17, 2010 at 1:03 PM, Dejan Muhamedagic wrote:
> >> > Funny, it worked here for me every time for apache, Dummy,
> >> > Delay, stonith resources. With both pacemaker 1.0 and 1.1.
> >> >
> >> >> To test it right now I issued a command
> >> >> # crm resource restart xen_vbuild
> >> >
> >> > Can you try to insert a sleep and see if that helps. It's in
> >> > /usr/lib64/python2.6/site-packages/crm/ui.py:
> >> >
> >> > 802     def restart(self,cmd,rsc):
> >> > 803         "usage: restart "
> >> > 804         if not is_name_sane(rsc):
> >> > 805             return False
> >> > 806         if not self.stop("stop",rsc):
> >> > 807             return False
> >> > 808         time.sleep(1)
> >> > 809         return self.start("start",rsc)
> >> >
> >> > Thanks,
> >> >
> >> > Dejan
> >>
> >> Yep, that did the trick
> >
> > OK. These nodes are faster than what I have (or the other way
> > around), i.e. this seems to be a timing issue.
> >
> > Thanks,
> >
> > Dejan
>
> well, I would say it's not normal, right?

I guess not, but what do you really mean? :)

> Are you going to include
> this "sleep" in the stable-1.0 branch ?

The sleep is currently included in the 1.1 branch, but it's not a
proper fix. If there are dependencies which take time to stop,
then the restart will fail. In that case we'd need to wait for
the transition to finish. Right now, the shell doesn't have such
a facility, but should get one.

> or maybe some op_defaults
> reset_delay ?

That's still not general enough.

Thanks,

Dejan

> Thanks,
> Vadym

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
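To illustrate the difference between a fixed sleep and waiting for the stop to actually complete, here is a minimal sketch. It is not the crm shell's code: restart_when_stopped and the stop, start and is_stopped callables are hypothetical stand-ins supplied by the caller, and polling is only one possible way to "wait for the transition to finish".

```python
import time


def restart_when_stopped(stop, start, is_stopped, timeout=60.0,
                         poll_interval=0.5, clock=time.monotonic,
                         sleep=time.sleep):
    """Stop, wait until is_stopped() reports True, then start.

    stop, start and is_stopped are caller-supplied callables
    (hypothetical; they are not part of the crm shell).
    Returns False if the stop request fails or the resource is still
    running when the timeout expires, otherwise the result of start().
    """
    if not stop():
        return False
    deadline = clock() + timeout
    while not is_stopped():
        if clock() >= deadline:
            return False  # gave up waiting; do not issue the start
        sleep(poll_interval)
    return start()
```

Unlike time.sleep(1), the wait here scales with however long the stop takes (for example a Xen domain that needs 20 seconds to shut down), and the start is never issued while the resource is still running.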
Re: [Pacemaker] crm resource restart fails to restart the service
On Wed, Nov 17, 2010 at 1:03 PM, Dejan Muhamedagic wrote:
>> > Funny, it worked here for me every time for apache, Dummy,
>> > Delay, stonith resources. With both pacemaker 1.0 and 1.1.
>> >
>> >> To test it right now I issued a command
>> >> # crm resource restart xen_vbuild
>> >
>> > Can you try to insert a sleep and see if that helps. It's in
>> > /usr/lib64/python2.6/site-packages/crm/ui.py:
>> >
>> > 802     def restart(self,cmd,rsc):
>> > 803         "usage: restart "
>> > 804         if not is_name_sane(rsc):
>> > 805             return False
>> > 806         if not self.stop("stop",rsc):
>> > 807             return False
>> > 808         time.sleep(1)
>> > 809         return self.start("start",rsc)
>> >
>> > Thanks,
>> >
>> > Dejan
>>
>> Yep, that did the trick
>
> OK. These nodes are faster than what I have (or the other way
> around), i.e. this seems to be a timing issue.
>
> Thanks,
>
> Dejan

Well, I would say it's not normal, right?
Are you going to include this "sleep" in the stable-1.0 branch?
Or maybe some op_defaults reset_delay?

Thanks,
Vadym
Re: [Pacemaker] crm resource restart fails to restart the service
On Wed, Nov 17, 2010 at 09:56:25AM -0500, Vadym Chepkov wrote:
>
> On Nov 17, 2010, at 9:46 AM, Dejan Muhamedagic wrote:
>
> > On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
> >> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic wrote:
> >>>> RA doesn't support restart action? Most likely you get
> >>>> OCF_ERR_UNIMPLEMENTED in the log
> >>>
> >>> It's actually a resource stop followed by start. It says so in
> >>> the help too. Perhaps the start precludes the stop action. The
> >>> logs should give a hint. We need a sleep in between.
> >>
> >> In this case this command is not working at all, because I tried in
> >> the past for many resources and it never worked, so I just assumed it
> >> has to be implemented by RA.
> >
> > Funny, it worked here for me every time for apache, Dummy,
> > Delay, stonith resources. With both pacemaker 1.0 and 1.1.
> >
> >> To test it right now I issued a command
> >> # crm resource restart xen_vbuild
> >
> > Can you try to insert a sleep and see if that helps. It's in
> > /usr/lib64/python2.6/site-packages/crm/ui.py:
> >
> > 802     def restart(self,cmd,rsc):
> > 803         "usage: restart "
> > 804         if not is_name_sane(rsc):
> > 805             return False
> > 806         if not self.stop("stop",rsc):
> > 807             return False
> > 808         time.sleep(1)
> > 809         return self.start("start",rsc)
> >
> > Thanks,
> >
> > Dejan
>
> Yep, that did the trick

OK. These nodes are faster than what I have (or the other way
around), i.e. this seems to be a timing issue.

Thanks,

Dejan

> Now I see this:
>
> Nov 17 14:52:39 xen-11 Xen[1]: INFO: Xen domain vbuild will be stopped (timeout: 220s)
> Nov 17 14:52:40 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:44 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:45 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:47 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:48 xen-11 Xen[1]: DEBUG: vbuild still not stopped.
Waiting... > Nov 17 14:52:50 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting... > Nov 17 14:52:54 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting... > Nov 17 14:52:55 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting... > Nov 17 14:53:00 xen-11 Xen[1]: INFO: Xen domain vbuild stopped. > > [r...@xen-11 ~]# xm list|grep build > vbuild18 511 2 -b 12.0 > > > > > > >> where xen_vbuild is a Xen VM and no results whatsoever. > >> > >> Here is output of the log > >> > >> Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > >> Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > >> Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > >> Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> + > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> + >> __crm_diff_marker__="added:top" > > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> +>> name="target-role" value="Stopped" /> > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> - > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> - > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> ->> id="xen_vbuild-meta_attributes-target-role" /> > >> Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: > >> xen_vbuild: Overwriting calculated next role Unknown with requested > >> next role Stopped > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> + > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> + > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > >> +>> id="xen_vbuild-meta_attributes-target-role" /> > >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: > >> xen_vbuild: Overwriting calculated next 
role Unknown with requested > >> next role Stopped > >> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print: > >> xen_vbuild (ocf::heartbeat:Xen): Started xen-11 > >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: > >> All nodes for resource xen_vbuild are unavailable, unclean or shutting > >> down (xen-11: 1, -100) > >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: > >> Could not allocate a node for xen_vbuild > >> Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource > >> xen_vbuild cannot run anywhere > >> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop > >> resource xen_vbuild(xen-11) > >> Nov 17 13:07:46 xen-11
Re: [Pacemaker] crm resource restart fails to restart the service
Hi, Vadym Chepkov wrote: On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic wrote: RA doesn't support restart action? Most luckily you get OCF_ERR_UNIMPLEMENTED in the log It's actually a resource stop followed by start. It says so in the help too. Perhaps the start precludes the stop action. The logs should give a hint. We need a sleep in between. In this case this command is not working at all, because I tried in the past for many resources and it never worked, so I just assumed it has to be implemented by RA. To test it right now I issued a command # crm resource restart xen_vbuild where xen_vbuild is a Xen VM and no results whatsoever. Here is output of the log Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: - Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: - Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: - Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: xen_vbuild: Overwriting calculated next role Unknown with requested next role Stopped Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: xen_vbuild: Overwriting calculated next role Unknown with requested next role Stopped Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print: xen_vbuild 
(ocf::heartbeat:Xen): Started xen-11 Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: All nodes for resource xen_vbuild are unavailable, unclean or shutting down (xen-11: 1, -100) Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: Could not allocate a node for xen_vbuild Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource xen_vbuild cannot run anywhere Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop resource xen_vbuild (xen-11) Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print: xen_vbuild (ocf::heartbeat:Xen): Started xen-11 Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node: Assigning xen-11 to xen_vbuild Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave resource xen_vbuild (Started xen-11) Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:09:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:10:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:11:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:12:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:13:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:14:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:15:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:16:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:17:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:17:47 xen-11 pengine: [22958]: notice: native_print: xen_vbuild (ocf::heartbeat:Xen): Started xen-11 Nov 17 13:17:47 xen-11 pengine: [22958]: debug: native_assign_node: Assigning xen-11 to xen_vbuild Nov 17 13:17:47 xen-11 pengine: [22958]: notice: LogActions: Leave resource xen_vbuild (Started xen-11) Nov 17 13:18:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:19:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 
17 13:20:20 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor

but VM never stopped:

[r...@xen-11 ~]# xm list|grep vbuild
vbuild 3 511 2 -b352.4

still ID 3 as it was before

In my case the custom OCF RA works after some tweaks. Now I'm stuck with the mysql RA; I think this is the issue:

/usr/lib/ocf/resource.d/heartbeat# ./mysql stop
./mysql: line 523: (/1000)-5: syntax error: operand expected (error token is "/1000)-5")

At first I thought it was because I had set the monitor, start and stop timeouts to values other than the defaults, but even after setting the defaults I get the same thing.

primitive mysqld ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" enable_creation="0" datadir="/mysql/database" user="root" test_user="monitor" test_passwd=
Re: [Pacemaker] crm resource restart fails to restart the service
On Nov 17, 2010, at 9:46 AM, Dejan Muhamedagic wrote:

> On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
>> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic wrote:
>>>> RA doesn't support restart action? Most likely you get
>>>> OCF_ERR_UNIMPLEMENTED in the log
>>>
>>> It's actually a resource stop followed by start. It says so in
>>> the help too. Perhaps the start precludes the stop action. The
>>> logs should give a hint. We need a sleep in between.
>>
>> In this case this command is not working at all, because I tried in
>> the past for many resources and it never worked, so I just assumed it
>> has to be implemented by RA.
>
> Funny, it worked here for me every time for apache, Dummy,
> Delay, stonith resources. With both pacemaker 1.0 and 1.1.
>
>> To test it right now I issued a command
>> # crm resource restart xen_vbuild
>
> Can you try to insert a sleep and see if that helps. It's in
> /usr/lib64/python2.6/site-packages/crm/ui.py:
>
> 802     def restart(self,cmd,rsc):
> 803         "usage: restart "
> 804         if not is_name_sane(rsc):
> 805             return False
> 806         if not self.stop("stop",rsc):
> 807             return False
> 808         time.sleep(1)
> 809         return self.start("start",rsc)
>
> Thanks,
>
> Dejan

Yep, that did the trick.

Now I see this:

Nov 17 14:52:39 xen-11 Xen[1]: INFO: Xen domain vbuild will be stopped (timeout: 220s)
Nov 17 14:52:40 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:44 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:45 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:47 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:48 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:50 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:54 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:55 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:53:00 xen-11 Xen[1]: INFO: Xen domain vbuild stopped. [r...@xen-11 ~]# xm list|grep build vbuild18 511 2 -b 12.0 > >> where xen_vbuild is a Xen VM and no results whatsoever. >> >> Here is output of the log >> >> Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor >> Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor >> Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor >> Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> + >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> + > __crm_diff_marker__="added:top" > >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> + > name="target-role" value="Stopped" /> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> - >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> - >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> - > id="xen_vbuild-meta_attributes-target-role" /> >> Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: >> xen_vbuild: Overwriting calculated next role Unknown with requested >> next role Stopped >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> + >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> + >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: >> + > id="xen_vbuild-meta_attributes-target-role" /> >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: >> xen_vbuild: Overwriting calculated next role Unknown with requested >> next role Stopped >> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print: >> xen_vbuild (ocf::heartbeat:Xen): Started xen-11 >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: >> All nodes for resource xen_vbuild are unavailable, unclean or shutting >> down 
(xen-11: 1, -100) >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: >> Could not allocate a node for xen_vbuild >> Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource >> xen_vbuild cannot run anywhere >> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop >> resource xen_vbuild (xen-11) >> Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print: >> xen_vbuild (ocf::heartbeat:Xen): Started xen-11 >> Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node: >> Assigning xen-11 to xen_vbuild >> Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave >> resource xen_vbuild (Started xen-11) >> Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor >> Nov 17 13:09:15 xen-11
Re: [Pacemaker] crm resource restart fails to restart the service
On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic wrote:
> >> RA doesn't support restart action? Most likely you get
> >> OCF_ERR_UNIMPLEMENTED in the log
> >
> > It's actually a resource stop followed by start. It says so in
> > the help too. Perhaps the start precludes the stop action. The
> > logs should give a hint. We need a sleep in between.
>
> In this case this command is not working at all, because I tried in
> the past for many resources and it never worked, so I just assumed it
> has to be implemented by RA.

Funny, it worked here for me every time for apache, Dummy,
Delay, stonith resources. With both pacemaker 1.0 and 1.1.

> To test it right now I issued a command
> # crm resource restart xen_vbuild

Can you try to insert a sleep and see if that helps? It's in
/usr/lib64/python2.6/site-packages/crm/ui.py:

802     def restart(self,cmd,rsc):
803         "usage: restart "
804         if not is_name_sane(rsc):
805             return False
806         if not self.stop("stop",rsc):
807             return False
808         time.sleep(1)
809         return self.start("start",rsc)

Thanks,

Dejan

> where xen_vbuild is a Xen VM and no results whatsoever.
> > Here is output of the log > > Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + __crm_diff_marker__="added:top" > > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > +name="target-role" value="Stopped" /> > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > - > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > - > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > -id="xen_vbuild-meta_attributes-target-role" /> > Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: > xen_vbuild: Overwriting calculated next role Unknown with requested > next role Stopped > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > +id="xen_vbuild-meta_attributes-target-role" /> > Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: > xen_vbuild: Overwriting calculated next role Unknown with requested > next role Stopped > Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print: > xen_vbuild(ocf::heartbeat:Xen): Started xen-11 > Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: > All nodes for resource xen_vbuild are unavailable, unclean or shutting > down (xen-11: 1, -100) > Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: > Could not allocate a node for xen_vbuild > Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource > xen_vbuild cannot run anywhere 
> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop > resource xen_vbuild (xen-11) > Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print: > xen_vbuild(ocf::heartbeat:Xen): Started xen-11 > Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node: > Assigning xen-11 to xen_vbuild > Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave > resource xen_vbuild (Started xen-11) > Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:09:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:10:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:11:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:12:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:13:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:14:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:15:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:16:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:17:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:17:47 xen-11 pengine: [22958]: notice: native_print: > xen_vbuild(ocf::heartbeat:Xen): Started xen-11 > Nov 17 13:17:47 xen-11 pengine: [22958]: debug: native_assign_node: > Assigning xen-11 to xen_vbuild > Nov 17 13:17:47 xen-11 pengine: [22958]: notice: LogActions: Leave > resource xen_vbuild (Started xen-11) > Nov 17 13:18:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:19:19 xen-11 lrmd: [4295]: debug: rsc:xen_vb
Re: [Pacemaker] crm resource restart fails to restart the service
Hi, On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote: > On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic > wrote: > > >> RA doesn't support restart action? Most luckily you get > >> OCF_ERR_UNIMPLEMENTED in the log > > > > It's actually a resource stop followed by start. It says so in > > the help too. Perhaps the start precludes the stop action. The > > logs should give a hint. We need a sleep in between. > > > > In this case this command is not working at all, because I tried in > the past for many resources and it never worked, so I just assumed it > has to be implemented by RA. > > To test it right now I issued a command > # crm resource restart xen_vbuild > > where xen_vbuild is a Xen VM and no results whatsoever. > > Here is output of the log > > Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + __crm_diff_marker__="added:top" > > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > +name="target-role" value="Stopped" /> > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > - > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > - > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > -id="xen_vbuild-meta_attributes-target-role" /> > Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: > xen_vbuild: Overwriting calculated next role Unknown with requested > next role Stopped > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + > Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: > + > Nov 17 13:07:44 xen-11 cib: [4294]: 
info: log_data_element: cib:diff: > +id="xen_vbuild-meta_attributes-target-role" /> > Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: > xen_vbuild: Overwriting calculated next role Unknown with requested > next role Stopped > Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print: > xen_vbuild(ocf::heartbeat:Xen): Started xen-11 > Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: > All nodes for resource xen_vbuild are unavailable, unclean or shutting > down (xen-11: 1, -100) > Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: > Could not allocate a node for xen_vbuild > Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource > xen_vbuild cannot run anywhere > Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop > resource xen_vbuild (xen-11) > Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print: > xen_vbuild(ocf::heartbeat:Xen): Started xen-11 > Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node: > Assigning xen-11 to xen_vbuild > Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave > resource xen_vbuild (Started xen-11) > Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:09:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:10:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:11:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:12:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:13:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:14:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:15:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:16:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:17:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:17:47 xen-11 pengine: [22958]: notice: native_print: > 
xen_vbuild(ocf::heartbeat:Xen): Started xen-11 > Nov 17 13:17:47 xen-11 pengine: [22958]: debug: native_assign_node: > Assigning xen-11 to xen_vbuild > Nov 17 13:17:47 xen-11 pengine: [22958]: notice: LogActions: Leave > resource xen_vbuild (Started xen-11) > Nov 17 13:18:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:19:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > Nov 17 13:20:20 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor > > but VM never stopped: > > > [r...@xen-11 ~]# xm list|grep vbuild > vbuild 3 511 2 -b352.4 > > > still ID 3 as it was before I'll take a look. Thanks, Dejan > Vadym > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://w
Re: [Pacemaker] crm resource restart fails to restart the service
On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic wrote: >> RA doesn't support restart action? Most luckily you get >> OCF_ERR_UNIMPLEMENTED in the log > > It's actually a resource stop followed by start. It says so in > the help too. Perhaps the start precludes the stop action. The > logs should give a hint. We need a sleep in between. > In this case this command is not working at all, because I tried in the past for many resources and it never worked, so I just assumed it has to be implemented by RA. To test it right now I issued a command # crm resource restart xen_vbuild where xen_vbuild is a Xen VM and no results whatsoever. Here is output of the log Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: - Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: - Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: - Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: xen_vbuild: Overwriting calculated next role Unknown with requested next role Stopped Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff: + Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state: xen_vbuild: Overwriting calculated next role Unknown with requested next role Stopped Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print: xen_vbuild 
(ocf::heartbeat:Xen): Started xen-11 Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: All nodes for resource xen_vbuild are unavailable, unclean or shutting down (xen-11: 1, -100) Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node: Could not allocate a node for xen_vbuild Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource xen_vbuild cannot run anywhere Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop resource xen_vbuild (xen-11) Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print: xen_vbuild (ocf::heartbeat:Xen): Started xen-11 Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node: Assigning xen-11 to xen_vbuild Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave resource xen_vbuild (Started xen-11) Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:09:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:10:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:11:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:12:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:13:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:14:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:15:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:16:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:17:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:17:47 xen-11 pengine: [22958]: notice: native_print: xen_vbuild (ocf::heartbeat:Xen): Started xen-11 Nov 17 13:17:47 xen-11 pengine: [22958]: debug: native_assign_node: Assigning xen-11 to xen_vbuild Nov 17 13:17:47 xen-11 pengine: [22958]: notice: LogActions: Leave resource xen_vbuild (Started xen-11) Nov 17 13:18:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 17 13:19:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor Nov 
17 13:20:20 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor

but VM never stopped:

[r...@xen-11 ~]# xm list|grep vbuild
vbuild 3 511 2 -b352.4

still ID 3 as it was before

Vadym
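Since the logs are where the hint is, a small filter can make the policy engine's decisions easier to spot. The following is only a sketch: planned_actions and LOG_ACTION are names made up for this example, and the pattern assumes the syslog layout shown in the excerpt above. The sample lines are copied from that log.

```python
import re

# Matches pengine "LogActions" lines in the syslog layout quoted above.
LOG_ACTION = re.compile(
    r"^(?P<time>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+pengine:.*?LogActions:\s+"
    r"(?P<action>\w+)\s+resource\s+(?P<rsc>[\w.-]+)"
)


def planned_actions(lines, rsc):
    """Return (timestamp, action) pairs the policy engine logged for rsc."""
    found = []
    for line in lines:
        m = LOG_ACTION.search(line)
        if m and m.group("rsc") == rsc:
            found.append((m.group("time"), m.group("action")))
    return found


sample = [
    "Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop resource xen_vbuild (xen-11)",
    "Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave resource xen_vbuild (Started xen-11)",
    "Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor",
]
for stamp, action in planned_actions(sample, "xen_vbuild"):
    print(stamp, action)
```

For the excerpt above this lists a Stop at 13:07:45 followed one second later by a Leave (Started), which would be consistent with the stop request being superseded before it was carried out.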
Re: [Pacemaker] crm resource restart fails to restart the service
Hi,

On Wed, Nov 17, 2010 at 07:35:46AM -0500, Vadym Chepkov wrote:
>
> On Nov 17, 2010, at 7:26 AM, Dan Frincu wrote:
>
> > Hi,
> >
> > r...@cluster1:/# pgrep mysql
> > 961
> > 1127
> > r...@cluster1:/# crm resource restart mysqld
> > r...@cluster1:/# pgrep -fl mysql
> > 961
> > 1127
> >
> > The restart command doesn't actually restart the process, I have tried this
> > with another custom built OCF compliant RA and have the same issue.
> >
> > # rpm -qa '(pacemaker|corosync|resource-agents)'
> > pacemaker-1.0.9.1-1.el5
> > resource-agents-1.0.3-2.el5
> > corosync-1.2.7-1.1.el5
> >
> > # crm configure show mysqld
> > primitive mysqld ocf:heartbeat:mysql \
> > params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf"
> > enable_creation="0" datadir="/mysql/database" user="root"
> > test_user="monitor" test_passwd="monitor" test_table="cluster.monitor" \
> > op monitor interval="10s" timeout="5s" \
> > op start interval="0s" \
> > op stop interval="0s" \
> > meta target-role="Started"
> >
> > Ideas?
> >
>
> RA doesn't support restart action? Most likely you get OCF_ERR_UNIMPLEMENTED
> in the log

It's actually a resource stop followed by start. It says so in
the help too. Perhaps the start precludes the stop action. The
logs should give a hint. We need a sleep in between.

Thanks,

Dejan

> Vadym
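Since the restart is a stop followed by a start, the start can arrive before the stop has taken effect. A fixed sleep narrows that window; another option is to poll until the resource is really down before starting it. The following is a minimal sketch of that idea, not the crm shell's actual code: `stop`, `start` and `is_running` are hypothetical callables supplied by the caller, and the timeout and poll values are arbitrary.

```python
import time


def restart_with_wait(stop, start, is_running, timeout=60.0, poll=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Stop a resource, wait until it is really down, then start it.

    stop, start and is_running are caller-supplied callables (hypothetical
    hooks, not crm shell functions). Returns True only if the start was
    issued and reported success.
    """
    if not stop():
        return False
    deadline = clock() + timeout
    # Poll instead of sleeping a fixed second: a slow stop (or slow
    # dependencies) simply takes more iterations.
    while is_running():
        if clock() >= deadline:
            return False  # never stopped, so do not issue the start
        sleep(poll)
    return bool(start())
```

How `is_running` decides is left open here; it could, for instance, wrap whatever status query is available on the cluster. The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.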
Re: [Pacemaker] crm resource restart fails to restart the service
On 17 November 2010 13:35, Vadym Chepkov wrote:
>
> On Nov 17, 2010, at 7:26 AM, Dan Frincu wrote:
>
>> Hi,
>>
>> r...@cluster1:/# pgrep mysql
>> 961
>> 1127
>> r...@cluster1:/# crm resource restart mysqld
>> r...@cluster1:/# pgrep -fl mysql
>> 961
>> 1127
>>
>> The restart command doesn't actually restart the process, I have tried this
>> with another custom built OCF compliant RA and have the same issue.
>>
>> # rpm -qa '(pacemaker|corosync|resource-agents)'
>> pacemaker-1.0.9.1-1.el5
>> resource-agents-1.0.3-2.el5
>> corosync-1.2.7-1.1.el5
>>
>> # crm configure show mysqld
>> primitive mysqld ocf:heartbeat:mysql \
>> params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf"
>> enable_creation="0" datadir="/mysql/database" user="root"
>> test_user="monitor" test_passwd="monitor" test_table="cluster.monitor" \
>> op monitor interval="10s" timeout="5s" \
>> op start interval="0s" \
>> op stop interval="0s" \
>> meta target-role="Started"
>>
>> Ideas?
>
>
> RA doesn't support restart action? Most likely you get OCF_ERR_UNIMPLEMENTED
> in the log
>
> Vadym
>

That is correct:

[r...@node-01 heartbeat]# pwd
/usr/lib/ocf/resource.d/heartbeat
[r...@node-01 heartbeat]# grep usage mysql
# An example usage in /etc/ha.d/haresources:
# See usage() function below for more details...
usage() {
usage: $0 (start|stop|validate-all|meta-data|monitor)
usage|help) usage
*) usage

but in my case the RA supports restart.
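The grep above is a quick way to see which actions an agent's usage text advertises. The same check can be sketched in Python, assuming the usage line has the `usage: $0 (a|b|c)` shape shown in this message. `advertised_actions` is an illustrative helper, and an agent may handle actions that its usage text does not list.

```python
import re

# Assumption: the usage line looks like the one grep printed above,
# "usage: $0 (start|stop|...)", with the actions separated by "|".
USAGE = re.compile(r"usage:\s*\$0\s*[({]([^)}]*)[)}]")


def advertised_actions(script_text):
    """Return the actions named in the agent's usage line, or [] if none found."""
    match = USAGE.search(script_text)
    if not match:
        return []
    return [a.strip() for a in match.group(1).split("|") if a.strip()]


if __name__ == "__main__":
    line = "usage: $0 (start|stop|validate-all|meta-data|monitor)"
    print("restart" in advertised_actions(line))
```

As another reply in this thread points out, `crm resource restart` is a stop followed by a start, so it does not depend on the agent implementing a restart action of its own.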
Re: [Pacemaker] crm resource restart fails to restart the service
On Nov 17, 2010, at 7:26 AM, Dan Frincu wrote:

> Hi,
>
> r...@cluster1:/# pgrep mysql
> 961
> 1127
> r...@cluster1:/# crm resource restart mysqld
> r...@cluster1:/# pgrep -fl mysql
> 961
> 1127
>
> The restart command doesn't actually restart the process, I have tried this
> with another custom built OCF compliant RA and have the same issue.
>
> # rpm -qa '(pacemaker|corosync|resource-agents)'
> pacemaker-1.0.9.1-1.el5
> resource-agents-1.0.3-2.el5
> corosync-1.2.7-1.1.el5
>
> # crm configure show mysqld
> primitive mysqld ocf:heartbeat:mysql \
> params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf"
> enable_creation="0" datadir="/mysql/database" user="root" test_user="monitor"
> test_passwd="monitor" test_table="cluster.monitor" \
> op monitor interval="10s" timeout="5s" \
> op start interval="0s" \
> op stop interval="0s" \
> meta target-role="Started"
>
> Ideas?

RA doesn't support restart action? Most likely you get OCF_ERR_UNIMPLEMENTED in the log

Vadym
Re: [Pacemaker] crm resource restart fails to restart the service
On 17 November 2010 13:26, Dan Frincu wrote:
> Hi,
>
> r...@cluster1:/# pgrep mysql
> 961
> 1127
> r...@cluster1:/# crm resource restart mysqld
> r...@cluster1:/# pgrep -fl mysql
> 961
> 1127
>
> The restart command doesn't actually restart the process, I have tried this
> with another custom built OCF compliant RA and have the same issue.
>
> # rpm -qa '(pacemaker|corosync|resource-agents)'
> pacemaker-1.0.9.1-1.el5
> resource-agents-1.0.3-2.el5
> corosync-1.2.7-1.1.el5
>
> # crm configure show mysqld
> primitive mysqld ocf:heartbeat:mysql \
> params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf"
> enable_creation="0" datadir="/mysql/database" user="root"
> test_user="monitor" test_passwd="monitor" test_table="cluster.monitor" \
> op monitor interval="10s" timeout="5s" \
> op start interval="0s" \
> op stop interval="0s" \
> meta target-role="Started"
>
> Ideas?
>
> Regards,
> Dan
>
> --
> Dan FRINCU
> Systems Engineer
> CCNA, RHCE
> Streamwide Romania
>

I have experienced the same issue and created a bug report:
http://developerbugs.linux-foundation.org/show_bug.cgi?id=2516

In my case I have a group [1], and if I do crm resource restart pbx_01,
the last resource (mailAlert-01) of the group is restarted.

Cheers,
Pavlos

[1] group pbx_service_01 ip_01 fs_01 pbx_01 sshd_01 mailAlert-01