Andrew,
Once this discussion is finished, if what you would like done is not in the current implementation, can you please open a bug/feature request for it?

Thanks,

Dafna

On 01/27/2014 12:59 PM, Tareq Alayan wrote:
Adding Eli.


On 01/27/2014 02:50 PM, Andrew Lau wrote:
Hi,

I think he was asking what happens if the power management device reports that the host is powered off. Shouldn't the VMs then be brought back up, since being off is essentially the same as having run a power cycle/reboot?

Another case I'm seeing: if the whole host loses power and its power management device becomes unavailable as well (i.e. not reachable), you're stuck in a state that requires manual intervention.

I would be interested to see something like a timeout for those problematic VMs (e.g. if nothing was read or written after x amount of time), after which you could consider the host offline. I guess that adds a lot of risk, though.


On Mon, Jan 27, 2014 at 11:43 PM, Tareq Alayan <tala...@redhat.com> wrote:

    Hi,

    Power management makes use of special *dedicated* hardware in
    order to restart hosts independently of the host OS. The engine
    connects to a power management device using a *dedicated*
    network IP address.
    The engine is capable of rebooting hosts that have entered a
    non-operational or non-responsive state.
    The abilities provided by all power management devices are: check
    status, start, stop and recycle (restart)...

    In the case of a non-responsive host: all of the VMs that are
    currently running on that host can also become non-responsive.
    However, the non-responsive host keeps holding the lock on the
    hard disk image of every VM it is running. Attempting to start a
    VM on a different host and assigning the second host write
    privileges for the virtual machine hard disk image can cause data
    corruption.
    Rebooting allows the engine to assume that the lock on a VM hard
    disk image has been released.
    Once the engine knows for sure, via the power management device,
    that the problematic host has been rebooted, it can start a VM
    from the problematic host on another host without risking data
    corruption.
    Important note: A virtual machine that has been marked
    highly-available cannot be safely started on a different host
    without the certainty that doing so will not cause data corruption.
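
The rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not oVirt engine code: `FenceResult`, `Host` and `can_restart_vms_elsewhere` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FenceResult(Enum):
    """Outcome of asking the power management device to restart a host (hypothetical)."""
    CONFIRMED = "confirmed"      # device reports that the restart succeeded
    FAILED = "failed"            # device was reachable but the restart failed
    UNREACHABLE = "unreachable"  # device did not answer at all


@dataclass
class Host:
    name: str
    responsive: bool
    fence_result: Optional[FenceResult] = None  # None: no fencing attempted yet


def can_restart_vms_elsewhere(host: Host) -> bool:
    """Return True only when the VM disk locks can be assumed released."""
    if host.responsive:
        # The host still answers, so its VMs remain under its control.
        return False
    # Only a confirmed restart proves the old VM processes are gone.
    return host.fence_result is FenceResult.CONFIRMED
```

For example, `can_restart_vms_elsewhere(Host("node1", responsive=False))` stays `False` until a fencing attempt has been confirmed, which matches the "unknown" VM state discussed in this thread.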

    N-joy,

    --Tareq




    On 01/27/2014 02:05 PM, Dafna Ron wrote:

        I am adding Tareq for the Power Management implementation.

        Dafna


        On 01/27/2014 11:48 AM, Karli Sjöberg wrote:

            On Mon, 2014-01-27 at 11:11 +0000, Dafna Ron wrote:

                Powering off the host will never trigger vm migration.
                As far as engine is concerned it just lost connection
                to the host, but
                has no way of telling if the host is down or if a
                router is down.

            Can't it at least check with power management if the Host
            status is down
            first?

            I mean, if the network is down there will be no response
            from either PM
            or Host. But if PM is up and can tell you that the Host
            is down, sounds
            rather clear cut to me...

            Seems to me the VM's would be restarted sooner if the
            flow was altered
            to first check with PM if it's a network or Host issue,
            and if Host
            issue, immediately restart VM's on another Host, instead
            of waiting for
            a potentially problematic Host to boot up eventually.
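
The flow proposed in this message can be written down as a small decision function. It is a sketch of the suggestion only, not of how the engine behaves; `PmStatus`, `Action` and `proposed_action` are hypothetical names. The branch for a host that is powered on but silent is an assumption borrowed from the quoted description of power management rebooting a host whose health checks go unanswered.

```python
from enum import Enum
from typing import Optional


class PmStatus(Enum):
    """Power state reported by the power management (PM) device."""
    ON = "on"
    OFF = "off"


class Action(Enum):
    NOTHING = "nothing"                    # host is healthy
    WAIT = "wait"                          # cannot tell a host failure from a network failure
    FENCE_HOST = "fence_host"              # host has power but does not answer (assumption)
    RESTART_VMS_ELSEWHERE = "restart_vms"  # PM confirms that the host is off


def proposed_action(host_responds: bool, pm_status: Optional[PmStatus]) -> Action:
    """Pick an action; pm_status is None when the PM device is unreachable."""
    if host_responds:
        return Action.NOTHING
    if pm_status is None:
        # Neither host nor PM answers: possibly a network problem.
        return Action.WAIT
    if pm_status is PmStatus.OFF:
        # PM is up and says the host is down: the clear-cut case.
        return Action.RESTART_VMS_ELSEWHERE
    return Action.FENCE_HOST
```

The `WAIT` branch is also where the total-power-loss case ends up: if the PM device loses power together with the host, this flow cannot distinguish it from a network outage.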

            /K

                since vm's can continue running on the host even if
                engine has no access
                to it, starting the vm's on the second host can cause
                split brain and
                data corruption.

                The way that the engine knows what's going on is by
                sending health check queries to vdsm.
                Power management will try to reboot a host when the
                health checks to vdsm are not answered.
                So... if engine gets no reply and has no way of
                rebooting the host, the
                host status will be changed to Non-Responsive and the
                vm's will be
                unknown because engine has no way of knowing what's
                happening with the
                vm's.
                Since reboot of the host will kill the vm's running
                on it - this will
                never cause any vm migration but... along with the
                High-Availability vm
                feature, you will be able to have some of the vm's
                re-started on the
                second host after the host reboot (and that is only
                if Power Management
                was confirmed as successful).

                VM migration is only triggered when:
                1. Cluster configuration states that the vm should be
                migrated in case
                of failure
                2. Engine has access to the host - so the failure is
                on the storage side
                and not the host side.
                3. the vms are not actively writing (although there
                might be a new RFE
                for it).
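
The three conditions listed above combine into a single predicate. The following Python sketch is illustrative only; `MigrationContext` and its field names are hypothetical and do not correspond to engine settings.

```python
from dataclasses import dataclass


@dataclass
class MigrationContext:
    cluster_migrates_on_failure: bool  # condition 1: cluster configuration
    engine_can_reach_host: bool        # condition 2: failure is on the storage side
    vm_actively_writing: bool          # condition 3: VM must not be writing


def migration_triggered(ctx: MigrationContext) -> bool:
    """All three conditions from the list must hold for migration to start."""
    return (
        ctx.cluster_migrates_on_failure
        and ctx.engine_can_reach_host
        and not ctx.vm_actively_writing
    )
```

A powered-off host fails the second condition, so `migration_triggered` returns `False` regardless of the cluster configuration.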

                hope this clears things up

                Dafna



                On 01/27/2014 10:11 AM, Andrew Lau wrote:

                    Hi,

                    Have you got power management enabled?

                    That's the fencing feature the engine requires to
                    ensure that the host is actually offline. Without
                    that, it won't restart the VMs elsewhere, to prevent
                    potential VM corruption (e.g. a VM running on
                    multiple hosts).

                    Andrew.

                    On Jan 27, 2014 5:12 PM, "Jaison peter"
                    <urotr...@gmail.com> wrote:

                         Hi all ,

                         I was setting up a two-node oVirt cluster with
                         the oVirt engine on a separate node. I completed
                         the configuration and tested VM live migrations
                         without any issues. Then, to check cluster HA, I
                         powered down one host and expected the VMs
                         running on that host to be migrated to the other
                         one. But nothing happened: the engine detected
                         the host as unreachable and marked it as
                         non-operational, and the VMs that ran on that
                         host went to 'unknown state'. Is it not possible
                         to set up a fully HA oVirt cluster with two
                         nodes? Or is this a problem with my
                         configuration? Please advise.

                         Thanks & Regards

                         Alex

                     _______________________________________________
                         Users mailing list
                    Users@ovirt.org
                    http://lists.ovirt.org/mailman/listinfo/users





-- Dafna Ron










--
Dafna Ron
