> > # Check for mandatory files presence and consistency
> > vmware_validate() {
> > if [ -z "`pidof vmware-hostd`" ]; then
> > - ocf_log warn "vmware-hostd is not running"
> > - return $OCF_ERR_GENERIC
> > + ocf_log err "vmware-hostd is not running: aborting"
> > + exit $OCF_ERR_GENERIC
> > fi
> Changing warn to err and return to exit is obviously fine, but you may
> actually want to make this $OCF_ERR_INSTALLED.
OK, changed that.
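For reference, a rough sketch of the check with that change applied. The value 5 for OCF_ERR_INSTALLED is the usual OCF one, and ocf_log is only stubbed here so the snippet runs outside a cluster (a real RA gets both from ocf-shellfuncs):

```shell
#!/bin/sh
# Stand-ins so the sketch runs standalone; a real RA sources
# ocf-shellfuncs instead of defining these by hand.
OCF_ERR_INSTALLED=5
ocf_log() { echo "$1: $2" >&2; }

# Check for mandatory files presence and consistency
vmware_validate() {
    # Without vmware-hostd nothing else can work on this node,
    # so abort the whole agent instead of returning to the caller.
    if [ -z "$(pidof vmware-hostd 2>/dev/null)" ]; then
        ocf_log err "vmware-hostd is not running: aborting"
        exit $OCF_ERR_INSTALLED
    fi
}
```

Since the function calls exit, anything placed after a failed vmware_validate call never runs.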
> If you are running into VMware "randomly unregistering" VMs, can you be
> confident that it'll work on the second try? Wouldn't it be smarter to
> just keep trying until you are timed out?
Yeah, I'll use a "while" loop there as well.
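Something along these lines, as a minimal sketch only. register_vm is a hypothetical placeholder for the real registration command (here it fails twice and then succeeds, to mimic the random unregistering), and RETRY_INTERVAL is an assumed setting, not an RA parameter:

```shell
#!/bin/sh
# Stand-ins so the sketch runs standalone; a real RA sources
# ocf-shellfuncs instead of defining these by hand.
OCF_SUCCESS=0
ocf_log() { echo "$1: $2" >&2; }

# Hypothetical placeholder for the actual registration command.
# It fails on the first two calls and succeeds on the third.
attempts=0
register_vm() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]
}

# Seconds to wait between attempts (assumed value).
RETRY_INTERVAL=${RETRY_INTERVAL:-1}

# Keep retrying until registration succeeds. There is deliberately no
# retry cap: the cluster manager aborts the operation once the timeout
# configured for the action expires.
vmware_register_retry() {
    while ! register_vm; do
        ocf_log warn "VM registration failed, retrying"
        sleep "$RETRY_INTERVAL"
    done
    return $OCF_SUCCESS
}
```

The loop relies entirely on the operation timeout to bound it, so the start timeout has to be long enough to cover a few failed attempts.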
> Sorry, we can't do that. The resource agent has already been released
> under the name "vmware", so people are probably using this in
> production. Renaming the RA now would break their setups. Of course this
> can be worked around by adding a symlink, but I really don't see the
> point.
Not a problem, leave it as "vmware"
> > <actions>
> > -<action name="start" timeout="300" />
> > -<action name="stop" timeout="300" />
> > -<action name="monitor" timeout="30" interval="300" depth="0" />
> > +<action name="start" timeout="900" />
> > +<action name="stop" timeout="900" />
> > +<action name="monitor" timeout="30" interval="300" depth="0" start-delay="60" />
> > <action name="meta-data" timeout="5" />
> > </actions>
> > </resource-agent>
>
> Those timeouts look really excessive. Are you sure you want to declare
> _15 minutes_ the recommended minimum?
I know it looks excessive, but if you, for example, put a node into
standby while it runs 6-7 virtual machines with many gigs of virtual RAM,
trust me: they'll take forever to suspend and resume.
With virtual machines that have 5-10 GB of RAM I noticed that 5 minutes is
often not enough; maybe 10 minutes would be a decent compromise.
> And, please kill start-delay. People can configure that if they really
> want to, but the RA really shouldn't advertise this
Removed
I'm working on the changes you suggested
P.S. We can't call "validate-all" before "monitor": that makes the
script fail with exit code 2 (OCF_ERR_ARGS) instead of 7 (OCF_NOT_RUNNING)
when it is called on a node without access to the filesystem containing
the VM.
I should have checked this more thoroughly :-/
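To illustrate the problem, here is a simplified sketch and not the actual RA code. vmware_validate, vmware_monitor and run_action are hypothetical stand-ins, and OCF_RESKEY_vmxpath is assumed to hold the path of the VM configuration file:

```shell
#!/bin/sh
# Standard OCF exit codes used below.
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_NOT_RUNNING=7

# Hypothetical validation: fails with OCF_ERR_ARGS when the VM
# configuration file cannot be read from this node.
vmware_validate() {
    [ -r "$OCF_RESKEY_vmxpath" ] || return $OCF_ERR_ARGS
    return $OCF_SUCCESS
}

# Hypothetical monitor: a node that cannot see the configuration file
# cannot be running the VM, so report "not running" rather than an error.
vmware_monitor() {
    [ -r "$OCF_RESKEY_vmxpath" ] || return $OCF_NOT_RUNNING
    # A real monitor would query the VM state here.
    return $OCF_SUCCESS
}

# Dispatch: monitor skips validation, every other action validates first.
run_action() {
    case "$1" in
        monitor)
            vmware_monitor
            ;;
        *)
            vmware_validate || return $?
            # start/stop handling would follow here
            ;;
    esac
}
```

On a node without the shared filesystem, "run_action monitor" returns 7 and the cluster treats the resource as cleanly stopped there, while any action that goes through validation returns 2.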
--
Cristian Mammoli
APRA SISTEMI srl
Via Brodolini,6 Jesi (AN)
tel dir. 0731 719822
Web www.apra.it
e-mail [email protected]
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/