> > # Check for mandatory files presence and consistency
 > > vmware_validate() {
 > > if [ -z "`pidof vmware-hostd`" ]; then
 > > - ocf_log warn "vmware-hostd is not running"
 > > - return $OCF_ERR_GENERIC
 > > + ocf_log err "vmware-hostd is not running: aborting"
 > > + exit $OCF_ERR_GENERIC
 > > fi

 > Changing warn to err and return to exit is obviously fine, but you may
 > actually want to make this $OCF_ERR_INSTALLED.

OK, changed that.
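
Roughly what the check looks like after that change, as a sketch only: the hostd_running helper and the ocf_log fallback are placeholders added here so the snippet runs on its own, and the numeric defaults are the standard OCF values that a real RA would get from ocf-shellfuncs.

```shell
# Standard OCF return codes; a real RA gets these by sourcing ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_INSTALLED:=5}"

# Fallback logger, only defined when ocf-shellfuncs has not been sourced.
command -v ocf_log >/dev/null 2>&1 || ocf_log() { echo "$1: $2" >&2; }

# Hypothetical helper (not in the RA): true when vmware-hostd has a PID.
hostd_running() {
    [ -n "$(pidof vmware-hostd 2>/dev/null)" ]
}

vmware_validate() {
    if ! hostd_running; then
        ocf_log err "vmware-hostd is not running: aborting"
        # exit, not return: nothing else in the agent can work without hostd
        exit "$OCF_ERR_INSTALLED"
    fi
    return "$OCF_SUCCESS"
}
```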

 > If you are running into VMware "randomly unregistering" VMs, can you be
 > confident that it'll work on the second try? Wouldn't it be smarter to
 > just keep trying until you are timed out?

Yeah, I'll use a "while" loop there as well.
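
A minimal sketch of the "keep trying until you are timed out" idea. The names vmware_register_once, vmware_register and RETRY_INTERVAL are made up for illustration; the actual registration command is left as a placeholder.

```shell
# Fallback logger, only defined when ocf-shellfuncs has not been sourced.
command -v ocf_log >/dev/null 2>&1 || ocf_log() { echo "$1: $2" >&2; }

# Hypothetical placeholder: one attempt to (re)register the VM.
# A real agent would run the VMware command here and return its status.
vmware_register_once() {
    return 1
}

# Retry until an attempt succeeds. There is deliberately no retry limit:
# the cluster's operation timeout ends the action if it never succeeds.
vmware_register() {
    while ! vmware_register_once; do
        ocf_log warn "VM registration failed, retrying"
        sleep "${RETRY_INTERVAL:-1}"
    done
    return 0
}
```

The point of leaving the limit out is that the timeout configured for the operation stays the only knob the administrator has to tune.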

 > Sorry, we can't do that. The resource agent has already been released
 > under the name "vmware", so people are probably using this in
 > production. Renaming the RA now would break their setups. Of course this
 > can be worked around by adding a symlink, but I really don't see the
 > point.

Not a problem, leave it as "vmware"


 > > <actions>
 > > -<action name="start" timeout="300" />
 > > -<action name="stop" timeout="300" />
 > > -<action name="monitor" timeout="30" interval="300" depth="0" />
 > > +<action name="start" timeout="900" />
 > > +<action name="stop" timeout="900" />
 > > +<action name="monitor" timeout="30" interval="300" depth="0" start-delay="60" />
 > > <action name="meta-data" timeout="5" />
 > > </actions>
 > > </resource-agent>
 >
 > Those timeouts look really excessive. Are you sure you want to declare
 > _15 minutes_ the recommended minimum?

I know it looks excessive, but if you put a node with, say, 6-7 virtual
machines and many gigabytes of virtual RAM into standby, trust me:
they'll take forever to suspend and resume.
With virtual machines of 5-10 GB of RAM I noticed that 5 minutes is
often not enough; maybe 10 minutes would be a decent compromise.

 > And, please kill start-delay. People can configure that if they really
 > want to, but the RA really shouldn't advertise this

Removed

I'm working on the changes you suggested

P.S. We can't call "validate-all" before "monitor": that causes the
script to fail with exit code "2" instead of "7" when called on a node
without access to the filesystem containing the VM...
I should have checked this more thoroughly :-/
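
To illustrate the difference, here is a rough sketch, not the actual RA code. Exit code 2 is OCF_ERR_ARGS and 7 is OCF_NOT_RUNNING in the standard OCF codes; the function names and the vm_is_running placeholder are invented for the example.

```shell
# Standard OCF return codes; a real RA gets these by sourcing ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_ARGS:=2}"
: "${OCF_ERR_UNIMPLEMENTED:=3}"
: "${OCF_NOT_RUNNING:=7}"

# Hypothetical placeholder for the real "is this VM running?" query.
vm_is_running() {
    return 1
}

# Validation treats an unreadable config file as a bad argument.
vmware_validate_config() {
    [ -r "$1" ] || return "$OCF_ERR_ARGS"
    return "$OCF_SUCCESS"
}

# Monitor treats an unreadable config file as "not running on this node",
# which is the answer the cluster needs from a node without the filesystem.
vmware_monitor() {
    [ -r "$1" ] || return "$OCF_NOT_RUNNING"
    vm_is_running "$1" || return "$OCF_NOT_RUNNING"
    return "$OCF_SUCCESS"
}

# Dispatch without running the validation ahead of monitor.
vmware_dispatch() {
    case "$1" in
        monitor)      vmware_monitor "$2" ;;
        validate-all) vmware_validate_config "$2" ;;
        *)            return "$OCF_ERR_UNIMPLEMENTED" ;;
    esac
}
```

Called with a path that does not exist on the node, "monitor" returns 7 while "validate-all" returns 2, so running the validation first would hide the "not running" answer.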

-- 
Cristian Mammoli
APRA SISTEMI srl
Via Brodolini,6 Jesi (AN)
tel dir. 0731 719822

Web   www.apra.it
e-mail  [email protected]
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
