On 10/15/2012 03:56 PM, Sven Knohsalla wrote:
Hi,

Sometimes one hypervisor's status turns to “Non-operational” with error
“STORAGE_DOMAIN_UNREACHABLE” and live migration (activated for all
VMs) starts.

I don’t currently know why the ovirt-node turns to this status, because
the connected iSCSI SAN is available the whole time (checked via iSCSI
session and lsblk). I’m also able to read and write on the SAN during
that time.

We can simply activate this ovirt-node and it comes up again. The
migration process then starts from scratch and hits the same error
-> reboot of the ovirt-node necessary!

When a hypervisor turns to “non-operational” status, live migration
starts and tries to migrate ~25 VMs (~100 GB RAM to migrate).

During that process the network load goes to 100%, some VMs are
migrated, then the destination host also turns to “non-operational”
status with error “STORAGE_DOMAIN_UNREACHABLE”.

Many VMs are still running on their origin host, some are paused, some
are showing “migration from” status.

After a reboot of the origin host, the VMs of course turn to unknown
state.

So the whole cluster is down :/

For this problem I have some questions:

-Does ovirt engine just use the ovirt-mgmt network for migration/HA?

yes.


-If so, is there any possibility to *add*/switch a network for migration/HA?

you can bond, not yet add another one.


-Is the way we are using live migration not recommended?

-Which engine module checks the availability of the storage domain for
the ovirt-nodes?

the engine.


-Is there any timeout/cache option we can set/increase to avoid this
problem?

well, not clear what the problem is.
also, vdsm is supposed to throttle live migration to 3 VMs in parallel iirc. also, at cluster level you can configure not to live migrate VMs when a host goes non-operational.


-Is there any known problem with the versions we are using? (Migration
to ovirt-engine 3.1 is not possible atm)

oh, the cluster level migration policy on non operational may be a 3.1 feature, not sure.


-Is it possible to modify the migration queue to just migrate a max. of
4 VMs at the same time for example?

yes, there is a vdsm config for that. i am pretty sure 3 is the default though?
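
a minimal sketch of how one could check that setting on a node. it assumes the option is called max_outgoing_migrations and lives in the [vars] section of /etc/vdsm/vdsm.conf, and that the default is 3; both the option name and the default are assumptions here and should be verified against the vdsm version actually installed. the helper name read_migration_limit is made up for illustration.

```python
import configparser

# Assumption: vdsm falls back to 3 parallel outgoing migrations when the
# option is not set. Verify this against your installed vdsm version.
ASSUMED_DEFAULT_LIMIT = 3


def read_migration_limit(conf_text: str) -> int:
    """Return the parallel outgoing migration limit from vdsm.conf-style text.

    The section name "vars" and the option name "max_outgoing_migrations"
    are assumptions, not confirmed by this thread.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # With a fallback given, a missing section or option yields the default.
    return parser.getint(
        "vars", "max_outgoing_migrations", fallback=ASSUMED_DEFAULT_LIMIT
    )


if __name__ == "__main__":
    # Sample text standing in for the contents of /etc/vdsm/vdsm.conf.
    sample = "[vars]\nmax_outgoing_migrations = 4\n"
    print(read_migration_limit(sample))
```

on a real node you would pass in the contents of /etc/vdsm/vdsm.conf instead of the sample string, and vdsm would most likely need a restart after the value is changed.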


_ovirt-engine:_

FC 16:  3.3.6-3.fc16.x86_64

Engine: 3.0.0_0001-1.6.fc16

KVM based VM: 2 vCPU, 4 GB RAM

1 NIC for ssh/https access
1 NIC for ovirtmgmt network access
engine source: dreyou repo

_ovirt-node:_
Node: 2.3.0
2 bonded NICs -> Frontend Network
4 Multipath NICs -> SAN connection

Attached some relevant logfiles.

Thanks in advance, I really appreciate your help!

Best,

Sven Knohsalla |System Administration

Office +49 631 68036 433 | Fax +49 631 68036 111
|[email protected] <mailto:[email protected]>|
Skype: Netbiscuits.admin

Netbiscuits GmbH | Europaallee 10 | 67657 | GERMANY



_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users


