On 18/09/15 07:30, Luca Bertoncello wrote:
Hi all,
thank you very much for your answers.
So:
1) Of course we have UPSs, more than one, in our server room, and of
course they notify the hosts when they are running on battery.
Good.
2) My question was: “What can I do so that, in case of a kernel panic or
similar, the VMs are migrated (live or not) to another host?”
You would make the VMs HA and acquire a fencing solution.
3) I’d like to have a shutdown script on the host that puts the host into
Maintenance and waits until that is done, so that I can just shut down or
reboot it without any other action. Is that possible? It would help to
manage a power failure too, assuming that the other hosts have better
UPSs (which is possible…)
You could probably use the REST API on the oVirt Engine for that. But it
might be better to have a highly available machine (VM or not) running
something like Nagios or Icinga, which would monitor your hosts and
connect to the REST API to put them into maintenance and shut them
down. You might also consider a UPS service like NUT (unless you're
already using one).
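
As a rough sketch of the REST approach (untested against a live engine): the engine exposes a "deactivate" action on a host, which moves it to maintenance, and the host's status can then be polled until it reads "maintenance". The engine URL, host ID and credentials below are placeholders, and the API base path depends on the version (older engines used /api, newer ones /ovirt-engine/api), so treat all of them as assumptions.

```python
import base64
import time
import urllib.request
import xml.etree.ElementTree as ET


def build_deactivate_request(api_base, host_id, user, password):
    """Build (but do not send) the POST that asks the engine to put a host
    into maintenance. api_base, host_id and the credentials are placeholders."""
    url = f"{api_base.rstrip('/')}/hosts/{host_id}/deactivate"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"<action/>",  # an empty action body
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Accept": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )


def host_state(host_xml):
    """Extract the host status from a GET .../hosts/<id> reply.
    Handles both <status>maintenance</status> and the older
    <status><state>maintenance</state></status> layout."""
    status = ET.fromstring(host_xml).find("status")
    if status is None:
        return None
    state = status.find("state")
    text = state.text if state is not None else status.text
    return text.strip() if text else None


def wait_for_maintenance(fetch_host_xml, timeout=600, interval=5, sleep=time.sleep):
    """Poll until the host reports 'maintenance' or the timeout expires.
    fetch_host_xml is any callable returning the host's XML as a string."""
    waited = 0
    while waited <= timeout:
        if host_state(fetch_host_xml()) == "maintenance":
            return True
        sleep(interval)
        waited += interval
    return False
```

The request would be sent with urllib.request.urlopen(req, context=...), using an SSL context that trusts the engine's CA. A shutdown script would only go on to power off once wait_for_maintenance() returns True.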
Cheers
Alex
Thanks a lot
Kind regards
Luca Bertoncello
*From:*users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] *On
Behalf Of *matthew lagoe
*Sent:* Thursday, September 17, 2015 9:56 PM
*To:* 'Alex Crow'; 'Yaniv Kaul'
*Cc:* users@ovirt.org
*Subject:* Re: [ovirt-users] Automatically migrate VM between hosts in
the same cluster
There are PDUs that let you monitor power draw per port, and that
would more or less tell you if a PSU failed, as its load would drop to 0.
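
A minimal sketch of that check, assuming you already have per-outlet readings in watts (how you get them from the PDU, e.g. over SNMP, is vendor-specific and not shown). The host-to-outlet mapping, the outlet names and the 1 W threshold are all made up for illustration.

```python
def suspect_psu_failures(host_outlets, watts, idle_threshold=1.0):
    """Return {host: [outlets drawing no power]} for hosts where at least one
    outlet is at (or near) zero while another outlet of the same host still
    draws load. Hosts with every outlet at zero are treated as powered off,
    not as a PSU failure."""
    suspects = {}
    for host, outlets in host_outlets.items():
        dead = [o for o in outlets if watts.get(o, 0.0) <= idle_threshold]
        if dead and len(dead) < len(outlets):
            suspects[host] = dead
    return suspects


# Hypothetical example: each host has one PSU on feed A and one on feed B.
host_outlets = {"host1": ["A1", "B1"], "host2": ["A2", "B2"]}
watts = {"A1": 180.0, "B1": 0.0, "A2": 150.0, "B2": 160.0}
print(suspect_psu_failures(host_outlets, watts))
```

This only works for hosts with redundant PSUs on separately metered outlets. With a single PSU, a zero reading means the host is already down.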
*From:*users-boun...@ovirt.org <mailto:users-boun...@ovirt.org>
[mailto:users-boun...@ovirt.org] *On Behalf Of *Alex Crow
*Sent:* Thursday, September 17, 2015 12:31 PM
*To:* Yaniv Kaul <yk...@redhat.com <mailto:yk...@redhat.com>>
*Cc:* users@ovirt.org <mailto:users@ovirt.org>
*Subject:* Re: [ovirt-users] Automatically migrate VM between hosts in
the same cluster
I don't really think this is practical:
- If the PSU failed, your UPS could alert you. If you have one...
If you have only one PSU in a host, a UPS is not going to stop you
losing all the VMs on that host. OK, if you had N+1 PSUs, you may be
able to monitor for this (IPMI/LOM/DRAC etc.) and use the API to put a
host into maintenance. Also a lot of people rely on low-cost white-box
servers and decide that it's OK if a single PSU in a host dies, as,
well, we have HA to start on other hosts. If they have N+1 PSUs in the
hosts do they really have to migrate everything off? Swings and
roundabouts really.
I'm also not sure I've seen any practical DC setups where a UPS can
monitor the load for every single attached physical machine and figure
out that one of the redundant PSUs in it has failed - I'd love to know
if there are as that would be really cool.
- If the machine is going down in an ordinary flow, surely it can
be done.
Isn't that what "Maintenance mode" is for?
Even if it was a network failure and the host was still up,
how would you live migrate a VM from a host you can't even
talk to?
It could be suspended to (local) disk, if the disk is available.
The decision whether to resume it from local disk or not (as it
might be HA'ed and already running elsewhere) then needs to be
taken later, of course.
Yes, but that's not even remotely possible with Ovirt right now. I was
trying to be practical as the OP has only just started using Ovirt and
I think it might be a bit much to ask him to start coding up what he'd
like.
The only way you could do it was if you somehow magically knew
far enough in advance that the host was about to fail (!) and
that gave enough time to migrate the machines off. But how
would you ever know that "machine quux.bar.net
<http://quux.bar.net> is going to fail in 7 minutes"?
I completely agree there are situations in which you can't foresee
the failure.
But in many, you can. In those cases, it makes sense for the host
to self-initiate 'move to maintenance' mode. The policy of what to
do when 'self-moving-to-maintenance-mode' could be pre-fetched
from the engine.
Y.
Hmm, I would love that to be true. But I've seen so many so called
"corner-cases" that I now think the failure area in a datacenter is a
fractal with infinite corners. Yes, you could monitor SMART on local
drives, pick up uncorrected ECC errors, use "sensors" to check for
sagging voltages or high temps, but I don't think you can ever hope to
catch everything, and you could end up doing a migration "storm" for .
I've had more than enough of "Enterprise Spec" switches suddenly going
nuts and spamming corrupt MACs all over the LAN to know you can't ever
account for everything.
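
If you do wire health checks to maintenance mode, one way to damp a migration storm is to act only after several consecutive failed checks per host. This is just a sketch of that idea; the class name and the threshold of 3 are invented for illustration, and it does nothing about the failures you cannot detect at all.

```python
from collections import defaultdict


class FailureDebouncer:
    """Signal a host only once it has failed N health checks in a row."""

    def __init__(self, required_consecutive=3):
        self.required = required_consecutive
        self.streaks = defaultdict(int)  # host -> current run of failed checks

    def observe(self, host, healthy):
        """Record one check result. Returns True exactly once, at the moment
        the failure streak reaches the threshold; a healthy result resets it."""
        if healthy:
            self.streaks[host] = 0
            return False
        self.streaks[host] += 1
        return self.streaks[host] == self.required
```

A monitoring loop would call observe() after each SMART/ECC/sensors check and only move the host to maintenance when it returns True, so a single flapping reading does not empty a host.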
I think it's better to adopt the model of redundancy in software and
services, so no-one even notices if a VM host goes away, there's
always something else to take up the slack. Just like the origins of
the Internet - the network should be dumb and the applications should
cope with it! Any infrastructure that can't cope with the loss of a
few VMs for a few minutes probably needs a refresh.
Cheers
Alex
.
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users