On Wed, Apr 7, 2010 at 6:17 PM, Greg Woods <[email protected]> wrote:
> On Wed, 2010-04-07 at 15:39 +0200, Andrew Beekhof wrote:
>> > I increased the timeout
>> > even further (to 120s instead of the minimum recommended 60) and it
>> > seems to be working. Curious though, because when it does work, the logs
>> > show that the entire stop operation, including a live migration, takes
>> > only about 7 seconds.
>>
>> It depends on what else the machine is doing.
>> Are there any other Xen instances that might be migrating too?
>
> The test cluster currently has two Xen VMs. One is tied to a particular
> DRBD volume, so it has colocation and order constraints so that it must
> shut down, wait for the DRBD/Filesystem/LVM stack to fail over, and
> restart. Still, even that doesn't take more than 60 seconds. The other
> VM is stored on an NFS volume so that it can live migrate
> (allow-migrate="true"). I have seen failures of the stop operation on
> both of them prior to increasing the timeout.
>
> Surely it's not handling the resources sequentially?

Not unless you told it to with an ordering constraint.
But if the cluster is doing VM ops in parallel, that could well be the
reason you need to increase the timeouts to make it work reliably.

We've seen this elsewhere when trying to migrate several VMs to the
same machine.
The target just can't keep up.

We've even added a serialization construct to tell Pacemaker to
stagger the operations.
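
As a rough illustration of why parallel operations push the timeouts up, here is a toy model in Python. It is only a sketch under assumptions that are mine, not a description of how Pacemaker schedules anything: the host has fixed capacity, so n operations running at once each take n times as long, and an operation's timeout clock starts only when that operation actually starts. The function names are made up for the example, and the 7 second figure is the single-VM stop time you reported.

```python
# Toy model, NOT Pacemaker's scheduler. Assumptions (hypothetical):
#  - n operations running at once each take n times the single-op time
#  - an operation's timeout clock starts when that operation starts

def required_timeout_parallel(n_vms, base_seconds):
    """Smallest per-op timeout if all n_vms operations run at the same time."""
    return n_vms * base_seconds

def required_timeout_serialized(n_vms, base_seconds):
    """Smallest per-op timeout if operations run one after another.

    n_vms is accepted only to show that the result does not depend on it.
    """
    return base_seconds

def total_time_serialized(n_vms, base_seconds):
    """Wall-clock time for the whole staggered batch to finish."""
    return n_vms * base_seconds

def max_parallel_within(timeout_seconds, base_seconds):
    """How many simultaneous operations still fit inside one timeout."""
    return int(timeout_seconds // base_seconds)

if __name__ == "__main__":
    base = 7.0  # seconds for one stop with live migration, from this thread
    for n in (1, 2, 10, 100):
        print(n,
              required_timeout_parallel(n, base),
              required_timeout_serialized(n, base),
              total_time_serialized(n, base))
```

In this model the per-operation timeout grows linearly with the number of VMs when everything runs at once, while staggering keeps it flat at the cost of the batch as a whole taking longer. Real hosts will not degrade this cleanly, so treat the numbers as a way to reason about the trend rather than as sizing guidance.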

> That will be a
> disaster if we get to where I want to be going, which may involve dozens
> or even hundreds of VMs on a cluster. I realize I may have to adjust the
> timeout up higher for the simple reason that a few dozen VMs shutting
> down in parallel is going to take longer than one or two in parallel due
> to sharing of host OS resources, but hopefully the timeout won't be a
> linear function of the number of VMs.
>
> --Greg
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
