Re: [Linux-HA] trouble with CRM/XEN

Greg Woods Wed, 07 Apr 2010 09:17:37 -0700

On Wed, 2010-04-07 at 15:39 +0200, Andrew Beekhof wrote:
> > I increased the timeout
> > even further (to 120s instead of the minimum recommended 60) and it
> > seems to be working. Curious though, because when it does work, the logs
> > show that the entire stop operation, including a live migration, takes
> > only about 7 seconds.
> 
> It depends on what else the machine is doing.
> Are there any other Xen instances that might be migrating too?


The test cluster currently has two Xen VMs. One is tied to a particular
DRBD volume, so it has colocation and order constraints: it must shut
down, wait for the DRBD/Filesystem/LVM stack to fail over, and then
restart. Still, even that doesn't take more than 60 seconds. The other
VM is stored on an NFS volume so that it can live migrate
(allow-migrate="true"). I have seen failures of the stop operation on
both of them prior to increasing the timeout.
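
A minimal sketch of the kind of setup described above, in crm shell
syntax. The resource names (vm-nfs, vm-drbd, g-storage), the xmfile
paths, and the group standing in for the DRBD/Filesystem/LVM stack are
hypothetical placeholders, not the actual configuration; only the 120s
stop timeout and allow-migrate="true" come from this thread.

```
# Hypothetical names and paths throughout; adjust to the real CIB.

# VM on NFS: may live migrate, stop timeout raised from 60s to 120s
primitive vm-nfs ocf:heartbeat:Xen \
    params xmfile="/etc/xen/vm-nfs" \
    meta allow-migrate="true" \
    op stop interval="0" timeout="120s"

# VM on DRBD: no live migration, same raised stop timeout
primitive vm-drbd ocf:heartbeat:Xen \
    params xmfile="/etc/xen/vm-drbd" \
    op stop interval="0" timeout="120s"

# g-storage is assumed to be a group holding the storage stack.
# The VM runs only where the storage is, and starts only after it.
colocation vm-drbd-with-storage inf: vm-drbd g-storage
order vm-drbd-after-storage inf: g-storage vm-drbd
```

With constraints like these, a failover of the storage group forces
vm-drbd to stop first, so its stop timeout has to cover a full guest
shutdown, while vm-nfs only needs enough time for the migration.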

Surely it's not handling the resources sequentially? That would be a
disaster for where I want to go, which may involve dozens or even
hundreds of VMs on a cluster. I realize I may have to raise the timeout
simply because a few dozen VMs shutting down in parallel will take
longer than one or two, since they share host OS resources, but
hopefully the timeout won't need to be a linear function of the number
of VMs.

--Greg


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
