Re: [Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

Doug Knight Fri, 23 Mar 2007 06:14:17 -0800

Hi Alan,
I've started testing my OCF script, and I'm seeing something unusual
during initial startup. I've placed a crm_master call in my
stateful_start function, after the function has determined that it is
running on what should be the master, and postgresql has successfully
started:


crm_master -v 100

When this command gets executed, it starts using nearly 100% CPU, memory
usage continuously increases up to about 68%, then it dies (killed via
timeout?), followed by a second attempt to go master (with the same
characteristics, after the function timeout is exceeded), then a demote is
sent (again, after timeout) and it switches to try to become the slave
(crm_master -v 10 is what I use, though I'm not sure this is correct
usage to say "I want to change to a slave"). Eventually, I wind up with
the resource in failed mode.
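
To make the flow concrete, here is a minimal sketch of the start logic
described above. It is not the actual script: start_postgresql and
pg_is_primary_candidate are hypothetical placeholder names, and the scores
100 and 10 are simply the values mentioned in this message. The only
crm_master usage assumed is the "-v <value>" form shown above.

```shell
#!/bin/sh
# Sketch only. start_postgresql and pg_is_primary_candidate are hypothetical
# helpers that the real script would have to provide.

# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Preference values taken from the message above (assumed, not verified).
MASTER_SCORE=100
SLAVE_SCORE=10

stateful_start() {
    # Bring the database up first; report a generic error if that fails.
    start_postgresql || return $OCF_ERR_GENERIC

    # Advertise how much this node wants to be master: a high value on
    # the node that should be promoted, a low value elsewhere.
    if pg_is_primary_candidate; then
        crm_master -v "$MASTER_SCORE"
    else
        crm_master -v "$SLAVE_SCORE"
    fi

    return $OCF_SUCCESS
}
```

The crm_master call sits on a straight-line path with no loop around it,
which is the arrangement the questions below refer to.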

First question, any idea why the straight line running of a crm_master
-v 100 (not within any loops in my script) would spin up to 100%?

Second question, is using the crm_master -v with different values the
way to say on which node I prefer the master to run (higher number =
preferred node)?

Doug

On Thu, 2007-03-22 at 07:46 -0400, Doug Knight wrote:
> Thank you Alan, that explanation really helped. Would it be useful for
> me to post my OCF script once it's done and tested?
> 
> Doug
> 
> On Wed, 2007-03-21 at 19:13 -0600, Alan Robertson wrote: 
> > Doug Knight wrote:
> > > Hi Andrew,
> > > I had just started reviewing both of these scripts, and reviewed the
> > > Multistate and clone resource pages on the web site. It looks like
> > > multistate is how I need to handle it, but a couple of questions first.
> > > 
> > > 1. I noticed that the write-up says the resource must come up on each of
> > > the servers in "shadow" mode first, then one gets promoted. Does this
> > > imply a "start" on both servers, and the OCF start function determining
> > > which server is active vs shadow (I'm picturing a check in the OCF
> > > script to determine postgresql standby mode = shadow/crm_master value
> > > low, and postgresql active mode = active/crm_master value high), then a
> > > promote to the active server?
> > > 
> > > 2. I noticed that the drbd OCF script contains a "notify" function,
> > > where the Stateful OCF script does not. The notify function looks to be
> > > where the important actions are taken (calling drbd_start_phase_2,
> > > pre/post, etc). Is the notify function necessary, or is it sufficient in
> > > my case to handle it through the start|stop|promote|demote functions?
> > > 
> > > Thanks for your help,
> > > Doug
> > 
> > Andrew's out for a while.
> > 
> > The start function starts you up in slave/secondary mode.  All resources
> > initially start up in "slave" mode.
> > 
> > A set of servers is chosen to run the resources on (it might be one,
> > two, the whole set, etc. depending on clone_max and clone_node_max and
> > the usual constraints).
> > 
> > They are started on the selected nodes using "start".
> > 
> > During the start operation, you are given the chance to declare yourself
> > ready to become master or not by using the crm_master command line tool.
> > 
> > I believe that your resource can run that command any time it likes -
> > for example at a monitor operation...  But it is mandatory that it
> > run the command when it first starts up.
> > 
> > After this, heartbeat will try and promote as many of these resources as
> > is consistent with its configured properties, and the crm_master
> > commands that were run.
> > 
> > The notify command tells you when your peers come and go.  Do you need
> > to take actions if you know this?
> > 
> > If so, then you need to implement the notify actions...
> > 
> > 
> > 
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
