Hi Alan,

I've started testing my OCF script, and I'm seeing something unusual during initial startup. I've placed a crm_master call in my stateful_start function, after the function has determined that it is running on what should be the master and postgresql has successfully started:

    crm_master -v 100

When this command is executed, it starts using nearly 100% CPU and its memory usage climbs continuously to about 68%, then it dies (killed via timeout?). This is followed by a second attempt to go master (with the same characteristics, after the function timeout is exceeded). Then a demote is sent (again, after the timeout) and the resource switches to trying to become the slave. I use crm_master -v 10 for that, though I'm not sure this is the correct way to say "I want to change to a slave". Eventually, I wind up with the resource in failed mode.

First question: any idea why a straight-line call to crm_master -v 100 (not within any loop in my script) would spin up to 100% CPU?

Second question: is using crm_master -v with different values the way to say on which node I prefer the master to run (higher number = preferred node)?

Doug

On Thu, 2007-03-22 at 07:46 -0400, Doug Knight wrote:
> Thank you Alan, that explanation really helped. Would it be useful for
> me to post my OCF script once it's done and tested?
>
> Doug
>
> On Wed, 2007-03-21 at 19:13 -0600, Alan Robertson wrote:
> > Doug Knight wrote:
> > > Hi Andrew,
> > > I had just started reviewing both of these scripts, and reviewed the
> > > Multistate and clone resource pages on the web site. It looks like
> > > multistate is how I need to handle it, but a couple of questions first.
> > >
> > > 1. I noticed that the write-up says the resource must come up on each of
> > > the servers in "shadow" mode first, then one gets promoted. Does this
> > > imply a "start" on both servers, and the OCF start function determining
> > > which server is active vs shadow (I'm picturing a check in the OCF
> > > script to determine postgresql standby mode = shadow/crm_master value
> > > low, and postgresql active mode = active/crm_master value high), then a
> > > promote to the active server?
> > >
> > > 2. I noticed that the drbd OCF script contains a "notify" function,
> > > where the Stateful OCF script does not. The notify function looks to be
> > > where the important actions are taken (calling drbd_start_phase_2,
> > > pre/post, etc). Is the notify function necessary, or is it sufficient in
> > > my case to handle it through the start|stop|promote|demote functions?
> > >
> > > Thanks for your help,
> > > Doug
> >
> > Andrew's out for a while.
> >
> > The start function starts you up in slave/secondary mode. All resources
> > initially start up in "slave" mode.
> >
> > A set of servers is chosen to run the resources on (it might be one,
> > two, the whole set, etc. depending on clone_max and clone_node_max and
> > the usual constraints).
> >
> > They are started on the selected nodes using "start".
> >
> > During the start operation, you are given the chance to declare yourself
> > ready to become master or not by using the crm_master command line tool.
> >
> > I believe that your resource can run that command any time it likes -
> > for example at a monitor operation... But it is mandatory that it
> > runs it when it first starts up.
> >
> > After this, heartbeat will try to promote as many of these resources as
> > is consistent with its configured properties and the crm_master
> > commands that were run.
> >
> > The notify command tells you when your peers come and go. Do you need
> > to take actions if you know this?
> >
> > If so, then you need to implement the notify actions...
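
For illustration, the preference-setting step discussed in this thread could be sketched as a small shell helper. This is only a sketch under assumptions, not the script being tested: the function names master_score_for and set_master_preference and the CRM_MASTER override variable are hypothetical. Only the crm_master -v invocation and the values 100 and 10 come from the messages in this thread.

```shell
#!/bin/sh
# Hypothetical helper for an OCF-style resource agent.
# CRM_MASTER can be overridden with a stub command to dry-run without a cluster.
: "${CRM_MASTER:=crm_master}"

# Map a role name to the preference value used in this thread
# (100 for the intended master, 10 for the slave).
master_score_for() {
    case "$1" in
        master) echo 100 ;;
        slave)  echo 10 ;;
        *)      return 1 ;;   # unknown role: report failure, print nothing
    esac
}

# Advertise this node's preference by running "$CRM_MASTER -v <score>".
set_master_preference() {
    score=$(master_score_for "$1") || return 1
    "$CRM_MASTER" -v "$score"
}
```

With this in place, a start function would call set_master_preference master (or slave) once it has worked out which role the local postgresql instance is in. Pointing CRM_MASTER at a stub makes it possible to check what would be run without touching the cluster.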
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
