On Fri, Mar 19, 2010 at 08:49:17AM +0100, Alain.Moulle wrote:
> Hi,
> no need to attach hakvm, as it does something very simple:
> test ping on vm
> exit 1 if ping fails
> else
> ssh vm "/usr/sbin/hastatus"
> if hastatus ok
> exit 0
> else
> exit 1
> (where hastatus gives a status on the service running inside the vm)
> and that's all!
>
> I'll try to get the log you're asking for, but I've worked around the
> problem by setting a file-flag "vm-running-on-this-node" as soon as
> a first ssh vm "hostname" is successful. If this file-flag does not
> exist yet on the node, my hakvm returns 0 even if ping fails and even if
> ssh vm fails. This way, it works as if the vm was "started" even though
> it is in fact "starting", and of course I manage the removal of the
> file-flag when necessary (stop, migrate, etc.)
> It seems to work fine.
> Perhaps all this is a sufficient answer to your question, Lars, and you
> don't need the log anymore?
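
For readers following along, the file-flag workaround described above could
look roughly like the sketch below. This is not the actual hakvm script: the
flag path, the VM variable, the ssh/ping options, and the function names
(vm_reachable, vm_healthy, monitor, clear_flag) are all assumptions made for
illustration.

```shell
#!/bin/sh
# Assumed flag location and VM host name; the original mail names neither.
FLAG=${FLAG:-/var/run/vm-running-on-this-node}
VM=${VM:-vm}

# Hypothetical helper: has the VM answered over ssh at least once?
vm_reachable() {
    ssh -o ConnectTimeout=5 -o BatchMode=yes "$VM" hostname >/dev/null 2>&1
}

# Hypothetical helper: the real check (ping, then hastatus inside the VM).
vm_healthy() {
    ping -c 1 -W 2 "$VM" >/dev/null 2>&1 &&
        ssh -o ConnectTimeout=5 -o BatchMode=yes "$VM" /usr/sbin/hastatus >/dev/null 2>&1
}

# Monitor: while the flag is absent the VM counts as "starting",
# so report success whether or not ping/ssh work yet.
monitor() {
    if [ ! -e "$FLAG" ]; then
        if vm_reachable; then
            touch "$FLAG"   # first successful ssh: the VM is now "started"
        fi
        return 0
    fi
    vm_healthy              # flag present: failures are reported for real
}

# To be called on stop, migrate, etc.
clear_flag() {
    rm -f "$FLAG"
}
```

Note that with this approach a VM that never comes up at all is reported as
healthy for as long as the flag is missing.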
It clears something up.

You say "after the first ssh vm hostname is successful".
So at first you are able to ssh into the vm,
and later you are (temporarily?) no longer able to do so?

Maybe the network inside the VM gets reconfigured?
Bridges, iptables, routing strangeness, ARP conflicts, IP conflicts?
Or load on the host or VM prevents "hastatus" from
returning in time?

Maybe you just need to add some timeout + retry around the
"ping vm" and/or "ssh vm" in your hakvm script,
or increase various timeouts?
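
A minimal sketch of what such a timeout + retry wrapper could look like.
The function names (retry, check_vm), the attempt counts, and all timeout
values are assumptions picked for illustration, not taken from your script;
it also assumes Linux iputils ping (-W is the reply timeout in seconds) and
the coreutils "timeout" command being available on the host.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD up to ATTEMPTS times, sleeping
# DELAY seconds between failed attempts. Returns 0 on the first success,
# 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# check_vm VM: exit status 0 if the VM answers ping and hastatus succeeds.
check_vm() {
    vm=$1
    # One echo request per attempt, wait at most 2 seconds for the reply.
    retry 3 2 ping -c 1 -W 2 "$vm" >/dev/null 2>&1 || return 1
    # ConnectTimeout bounds the connection setup, BatchMode avoids hanging
    # on a password prompt, and "timeout 15" bounds the whole call in case
    # hastatus itself is slow under load.
    retry 3 2 timeout 15 ssh -o ConnectTimeout=5 -o BatchMode=yes \
        "$vm" /usr/sbin/hastatus >/dev/null 2>&1 || return 1
    return 0
}
```

The monitor action would then be something like "check_vm vm || exit 1".
Keep the worst case (attempts x timeout, plus the sleeps) below the
monitor operation timeout configured in the cluster, otherwise the cluster
will time out the monitor before the script gives up.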
Lars
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.