Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Michal Skrivanek

> On 16 Sep 2016, at 16:34, aleksey.maksi...@it-kb.ru wrote:
> 
> Tested.
> 
> If I run 'shutdown -h now' on host with running HA VM (not HostedEngine VM)...
> 
> in oVirt web-console appears event:
> 
> Sep 16, 2016 5:13:18 PM VM KOM-AD01-PBX02 is down. Exit message: User shut 
> down from within the guest

that would be another bug. It should be recognized properly as a “kill”. Can 
you please share host logs from this attempt as well?
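
For reference, the host-side pieces that usually tell the story here are vdsm's log and libvirt's per-VM qemu log; a minimal way to pull the relevant bits, assuming the default EL7 log locations and the VM name used in this thread:

  # on the host that was running the VM (default log paths)
  grep 'KOM-AD01-PBX02' /var/log/vdsm/vdsm.log | tail -n 100
  # libvirt's per-VM qemu log, if present
  tail -n 50 /var/log/libvirt/qemu/KOM-AD01-PBX02.log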

> 
> HA VM is turned off and will not start on another host.
> 
> This journald log from HA VM guest OS:
> 
> ...
> Sep 16 17:06:48 KOM-AD01-PBX02 python[2637]: [100B blob data]
> Sep 16 17:06:53 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.91.157:123 (ntp.ubuntu.com).
> Sep 16 17:07:03 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.89.199:123 (ntp.ubuntu.com).
> Sep 16 17:07:13 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.89.198:123 (ntp.ubuntu.com).
> Sep 16 17:07:23 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.94.4:123 (ntp.ubuntu.com).
> Sep 16 17:08:48 KOM-AD01-PBX02 python[2637]: [90B blob data]
> Sep 16 17:08:49 KOM-AD01-PBX02 python[2637]: [155B blob data]
> Sep 16 17:08:49 KOM-AD01-PBX02 python[2637]: [100B blob data]
> Sep 16 17:10:49 KOM-AD01-PBX02 python[2637]: [90B blob data]
> Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [155B blob data]
> Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [100B blob data]
> -- Reboot --
> ...
> 
> Before the shutdown there are no termination procedures in the log.
> It looks like a rough power-off of the VM.

yep, that is expected. But it should be properly detected as such and the HA VM 
should restart. Somehow vdsm misidentifies the reason for the shutdown.
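
The classification itself also lands in engine.log (the same "Exit message" shown in the web-console events quoted in this thread); a quick way to review how recent stops were classified, assuming the default engine log path:

  # on the engine VM
  grep 'Exit message' /var/log/ovirt-engine/engine.log | tail -n 20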

> 
> 16.09.2016, 17:08, "Simone Tiraboschi" :
>> On Fri, Sep 16, 2016 at 4:02 PM,  wrote:
>>> So, colleagues.
>>> I again tested the Fencing and now I think that my host-server power-button 
>>> (physically or through ILO) sends a KILL-command to the host OS (and as a 
>>> result to VM)
>>> This journald log in my guest OS when I press the power-button on the host:
>>> 
>>> ...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 
>>> 1000...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades 
>>> Shutdown...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 
>>> main.go:67: Exiting on terminated signal.
>>> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session 
>>> closed for user user
>>> Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session 
>>> closed for user root
>>> Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, 
>>> returning status 0
>>> Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session 
>>> closed for user root
>>> Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: 
>>> Exiting on terminated signal.
>>> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
>>> ...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All 
>>> Filesystems.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File 
>>> Systems (Pre).
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 
>>> mirrors, snapshots etc. using dmeventd or progress polling...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel 
>>> File Systems.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device 
>>> Nodes in /dev.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 
>>> mirrors, snapshots etc. using dmeventd or progress polling.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* 
>>> head number too large or missing monitors config: c984a000, 
>>> 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
>>> -- Reboot --
>>> 
>>> Perhaps this feature of HP ProLiant DL 360 G5. I dont know.
>>> 
>>> If I test the unavailability of a host other ways that everything is going 
>>> well.
>>> 
>>> I described my experience testing Fencing on practical examples on my blog 
>>> for everyone in Russian.
>>> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/
>>> 
>>> Thank you all very much for your participation and support.
>>> 
>>> 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread aleksey . maksimov
"your VM would be killed uncleanly."

This is not a good idea, I think.


16.09.2016, 17:14, "Michal Skrivanek" :
>>  On 16 Sep 2016, at 16:02, aleksey.maksi...@it-kb.ru wrote:
>>
>>  So, colleagues.
>>  I again tested the Fencing and now I think that my host-server power-button 
>> (physically or through ILO) sends a KILL-command to the host OS (and as a 
>> result to VM)
>
> thanks for confirmation, then it is indeed 
> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>
> I’m not sure if there is any good workaround. You can always 
> reconfigure(disable) ACPI in the guest, then HA logic would work ok but it 
> also means there is no graceful shutdown and your VM would be killed 
> uncleanly.
>
>>  This journald log in my guest OS when I press the power-button on the host:
>>
>>  ..
>>  Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
>>  Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 
>> 1000...
>>  Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades 
>> Shutdown...
>>  Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 
>> main.go:67: Exiting on terminated signal.
>>  Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session 
>> closed for user user
>>  Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session 
>> closed for user root
>>  Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, 
>> returning status 0
>>  Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session 
>> closed for user root
>>  Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: 
>> Exiting on terminated signal.
>>  Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
>>  ..
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All 
>> Filesystems.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File 
>> Systems (Pre).
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 
>> mirrors, snapshots etc. using dmeventd or progress polling...
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel 
>> File Systems.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device 
>> Nodes in /dev.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 
>> mirrors, snapshots etc. using dmeventd or progress polling.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* 
>> head number too large or missing monitors config: c984a000, 
>> 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
>>  -- Reboot --
>>
>>  Perhaps this feature of HP ProLiant DL 360 G5. I dont know.
>>
>>  If I test the unavailability of a host other ways that everything is going 
>> well.
>>
>>  I described my experience testing Fencing on practical examples on my blog 
>> for everyone in Russian.
>>  
>> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/
>>
>>  Thank you all very much for your participation and support.
>>
>>  Michal, what kind of scenario are you talking about?
>>
>>  PS: Excuse me for my bad English :)
>>
>>  16.09.2016, 16:37, "Simone Tiraboschi" :
>>>  On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek 
>>>  wrote:
>  On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
>
>  Hi Simone.
>  Exactly.
>  Now I'll put the journald on the guest and try to understand how the 
> guest off.

  great. thanks

>  16.09.2016, 16:25, "Simone Tiraboschi" :
>>  On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek 
>>  wrote:
  On 16 Sep 2016, at 15:05, Gianluca Cecchi  
 wrote:

  On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek 
  wrote:
>  no, that’s not how HA works today. When you log into a guest and 
> issue “shutdown” we do not restart the VM under your hands. We can 
> argue how it should or may work, but this is the defined behavior 
> since the dawn of oVirt.
>
>>  AFAIK that's correct, we need to be able to shut down an HA VM without it 
>> being immediately restarted on a different host. We want to restart an HA VM 
>> only if the host where the HA VM is running is non-responsive.

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread aleksey . maksimov
Tested.

If I run 'shutdown -h now' on host with running HA VM (not HostedEngine VM)...

in oVirt web-console appears event:

Sep 16, 2016 5:13:18 PM VM KOM-AD01-PBX02 is down. Exit message: User shut down 
from within the guest

HA VM is turned off and will not start on another host.

This journald log from HA VM guest OS:

...
Sep 16 17:06:48 KOM-AD01-PBX02 python[2637]: [100B blob data]
Sep 16 17:06:53 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
reply from 91.189.91.157:123 (ntp.ubuntu.com).
Sep 16 17:07:03 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
reply from 91.189.89.199:123 (ntp.ubuntu.com).
Sep 16 17:07:13 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
reply from 91.189.89.198:123 (ntp.ubuntu.com).
Sep 16 17:07:23 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
reply from 91.189.94.4:123 (ntp.ubuntu.com).
Sep 16 17:08:48 KOM-AD01-PBX02 python[2637]: [90B blob data]
Sep 16 17:08:49 KOM-AD01-PBX02 python[2637]: [155B blob data]
Sep 16 17:08:49 KOM-AD01-PBX02 python[2637]: [100B blob data]
Sep 16 17:10:49 KOM-AD01-PBX02 python[2637]: [90B blob data]
Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [155B blob data]
Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [100B blob data]
-- Reboot --
...

Before the shutdown there are no termination procedures in the log.
It looks like a rough power-off of the VM.

16.09.2016, 17:08, "Simone Tiraboschi" :
> On Fri, Sep 16, 2016 at 4:02 PM,  wrote:
>> So, colleagues.
>> I again tested the Fencing and now I think that my host-server power-button 
>> (physically or through ILO) sends a KILL-command to the host OS (and as a 
>> result to VM)
>> This journald log in my guest OS when I press the power-button on the host:
>>
>> ...
>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 
>> 1000...
>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades 
>> Shutdown...
>> Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 
>> main.go:67: Exiting on terminated signal.
>> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session 
>> closed for user user
>> Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session 
>> closed for user root
>> Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, 
>> returning status 0
>> Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session 
>> closed for user root
>> Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: 
>> Exiting on terminated signal.
>> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
>> ...
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All 
>> Filesystems.
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File Systems 
>> (Pre).
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 
>> mirrors, snapshots etc. using dmeventd or progress polling...
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel 
>> File Systems.
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device 
>> Nodes in /dev.
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 
>> mirrors, snapshots etc. using dmeventd or progress polling.
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
>> Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* 
>> head number too large or missing monitors config: c984a000, 
>> 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
>> -- Reboot --
>>
>> Perhaps this feature of HP ProLiant DL 360 G5. I dont know.
>>
>> If I test the unavailability of a host other ways that everything is going 
>> well.
>>
>> I described my experience testing Fencing on practical examples on my blog 
>> for everyone in Russian.
>> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/
>>
>> Thank you all very much for your participation and support.
>>
>> Michal, what kind of scenario are you talking about?
>
> Basically what you just did,
> the question is what happens when you run 'shutdown -h now' (or press the 
> physical button if configured to trigger a soft shutdown); is it going to 
> propagate somehow the shutdown action to the VMs or to brutally kill them?
>
> In the first case the VMs will not restart regardless of their HA flags.
>
>> PS: Excuse me for my bad English :)
>>
>> 16.09.2016, 16:37, "Simone Tiraboschi" 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Simone Tiraboschi
On Fri, Sep 16, 2016 at 4:02 PM,  wrote:

> So, colleagues.
> I again tested the Fencing and now I think that my host-server
> power-button (physically or through ILO) sends a KILL-command to the host
> OS (and as a result to VM)
> This journald log in my guest OS when I press the power-button on the host:
>
> ...
> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID
> 1000...
> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades
> Shutdown...
> Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063
> main.go:67: Exiting on terminated signal.
> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session
> closed for user user
> Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session
> closed for user root
> Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting,
> returning status 0
> Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session
> closed for user root
> Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67:
> Exiting on terminated signal.
> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
> ...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All
> Filesystems.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File
> Systems (Pre).
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2
> mirrors, snapshots etc. using dmeventd or progress polling...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel
> File Systems.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device
> Nodes in /dev.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2
> mirrors, snapshots etc. using dmeventd or progress polling.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
> Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR*
> head number too large or missing monitors config: c984a000,
> 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
> -- Reboot --
>
> Perhaps this feature of HP ProLiant DL 360 G5. I dont know.
>
> If I test the unavailability of a host other ways that everything is going
> well.
>
> I described my experience testing Fencing on practical examples on my blog
> for everyone in Russian.
> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-
> about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-
> ilo2-power-managment-agent-and-test-of-high-availability/
>
>
> Thank you all very much for your participation and support.
>
> Michal, what kind of scenario are you talking about?
>

Basically what you just did;
the question is what happens when you run 'shutdown -h now' (or press the
physical button, if it is configured to trigger a soft shutdown): is it going to
propagate the shutdown action to the VMs somehow, or to brutally kill them?

In the first case the VMs will not restart regardless of their HA flags.
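
As an aside, on a plain libvirt host this "clean shutdown vs. kill" choice is what the libvirt-guests service controls; on an oVirt host vdsm owns the VM lifecycle, so treat the following only as an illustration of the two paths being discussed, not as the oVirt mechanism:

  # /etc/sysconfig/libvirt-guests (Debian/Ubuntu: /etc/default/libvirt-guests)
  ON_SHUTDOWN=shutdown    # ask each guest for a clean shutdown on host shutdown
  SHUTDOWN_TIMEOUT=120    # seconds to wait before giving up
  # the alternative is leaving guests to be terminated with the rest of the
  # host's processes, i.e. SIGTERM to the qemu processes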


>
>
> PS: Excuse me for my bad English :)
>
>
> 16.09.2016, 16:37, "Simone Tiraboschi" :
> > On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
> >>> On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
> >>>
> >>> Hi Simone.
> >>> Exactly.
> >>> Now I'll put the journald on the guest and try to understand how the
> guest off.
> >>
> >> great. thanks
> >>
> >>> 16.09.2016, 16:25, "Simone Tiraboschi" :
>  On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
> >> On 16 Sep 2016, at 15:05, Gianluca Cecchi <
> gianluca.cec...@gmail.com> wrote:
> >>
> >> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
> >>> no, that’s not how HA works today. When you log into a guest and
> issue “shutdown” we do not restart the VM under your hands. We can argue
> how it should or may work, but this is the defined behavior since the dawn
> of oVirt.
> >>>
>  AFAIK that's correct, we need to be able to shut down an HA VM without it
> being immediately restarted on a different host. We want to restart an HA VM
> only if the host where the HA VM is running is non-responsive.
> >>>
> >>> we try to restart it in all other cases other than user initiated
> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
> >> Hi, just another question in case HA is not configured at all.
> >
> > by “HA configured” I expect you’re referring to the “Highly
> Available” checkbox in Edit VM 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread aleksey . maksimov
So, colleagues. 
I tested the fencing again, and now I think that my host-server power button 
(pressed physically or through iLO) sends a KILL command to the host OS (and, as a 
result, to the VMs).
This is the journald log from my guest OS when I press the power button on the host:

...
Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 1000...
Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades 
Shutdown...
Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 
main.go:67: Exiting on terminated signal.
Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session 
closed for user user
Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session closed 
for user root
Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, 
returning status 0
Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session 
closed for user root
Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: Exiting 
on terminated signal.
Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
...
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All 
Filesystems.
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File Systems 
(Pre).
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 mirrors, 
snapshots etc. using dmeventd or progress polling...
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel File 
Systems.
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device Nodes 
in /dev.
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 mirrors, 
snapshots etc. using dmeventd or progress polling.
Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* head 
number too large or missing monitors config: c984a000, 
0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
-- Reboot --

Perhaps this is a feature of the HP ProLiant DL 360 G5. I don't know.

If I test host unavailability in other ways, everything goes well.

I described my experience of testing fencing, with practical examples, on my blog 
(in Russian).
https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/


Thank you all very much for your participation and support.

Michal, what kind of scenario are you talking about?


PS: Excuse me for my bad English :)


16.09.2016, 16:37, "Simone Tiraboschi" :
> On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek 
>  wrote:
>>> On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
>>>
>>> Hi Simone.
>>> Exactly.
>>> Now I'll put the journald on the guest and try to understand how the guest 
>>> off.
>>
>> great. thanks
>>
>>> 16.09.2016, 16:25, "Simone Tiraboschi" :
 On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek 
  wrote:
>> On 16 Sep 2016, at 15:05, Gianluca Cecchi  
>> wrote:
>>
>> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek 
>>  wrote:
>>> no, that’s not how HA works today. When you log into a guest and issue 
>>> “shutdown” we do not restart the VM under your hands. We can argue how 
>>> it should or may work, but this is the defined behavior since the dawn 
>>> of oVirt.
>>>
 ​AFAIK that's correct, we need to be able ​
 ​shutdown HA VM​
 ​
 ​ without being it immediately restarted on different host. We want to 
 restart HA VM only if host, where HA VM is running, is non-responsive.
>>>
>>> we try to restart it in all other cases other than user initiated 
>>> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>> Hi, just another question in case HA is not configured at all.
>
> by “HA configured” I expect you’re referring to the “Highly Available” 
> checkbox in Edit VM dialog.
>
>> If I run the "shutdown -h now" command on an host where some VMs are 
>> running, what is the expected behavior?
>> Clean VM shutdown (with or without timeout in case it doesn't complete?) 
>> or crash of their related QEMU processes?
>
> expectation is that you won’t do that. That’s why there is the 
> Maintenance host state.
> But if you do that regardless, with VMs running, all the processes will 
> be terminated in a regular system way, i.e. all 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Michal Skrivanek

> On 16 Sep 2016, at 16:02, aleksey.maksi...@it-kb.ru wrote:
> 
> So, colleagues. 
> I again tested the Fencing and now I think that my host-server power-button 
> (physically or through ILO) sends a KILL-command to the host OS (and as a 
> result to VM)

thanks for confirmation, then it is indeed 
https://bugzilla.redhat.com/show_bug.cgi?id=1341106

I’m not sure if there is any good workaround. You can always 
reconfigure (disable) ACPI in the guest; then the HA logic would work OK, but it also 
means there is no graceful shutdown and your VM would be killed uncleanly.
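
On an Ubuntu 16.04 guest like the one in this thread, one way to get that effect is to make systemd-logind ignore the virtual power button; a sketch only, and it trades away graceful shutdown exactly as described above:

  # inside the guest: ignore the ACPI power-button event delivered by QEMU
  sed -i 's/^#\?HandlePowerKey=.*/HandlePowerKey=ignore/' /etc/systemd/logind.conf
  systemctl restart systemd-logind
  # if acpid is installed it may also react to the power button
  systemctl mask --now acpid 2>/dev/null || true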

> This journald log in my guest OS when I press the power-button on the host:
> 
> ..
> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 
> 1000...
> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades 
> Shutdown...
> Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 
> main.go:67: Exiting on terminated signal.
> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session 
> closed for user user
> Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session closed 
> for user root
> Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, 
> returning status 0
> Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session 
> closed for user root
> Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: 
> Exiting on terminated signal.
> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
> ..
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All 
> Filesystems.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File Systems 
> (Pre).
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 
> mirrors, snapshots etc. using dmeventd or progress polling...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel 
> File Systems.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device Nodes 
> in /dev.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 
> mirrors, snapshots etc. using dmeventd or progress polling.
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
> Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* 
> head number too large or missing monitors config: c984a000, 
> 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
> Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
> -- Reboot --
> 
> Perhaps this is a feature of the HP ProLiant DL 360 G5. I don't know.
> 
> If I test host unavailability in other ways, everything goes 
> well.
> 
> I described my experience testing Fencing on practical examples on my blog 
> for everyone in Russian.
> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/
> 
> 
> Thank you all very much for your participation and support.
> 
> Michal, what kind of scenario are you talking about?
> 
> 
> PS: Excuse me for my bad English :)
> 
> 
> 16.09.2016, 16:37, "Simone Tiraboschi" :
>> On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek 
>>  wrote:
 On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
 
 Hi Simone.
 Exactly.
 Now I'll put the journald on the guest and try to understand how the guest 
 off.
>>> 
>>> great. thanks
>>> 
 16.09.2016, 16:25, "Simone Tiraboschi" :
> On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek 
>  wrote:
>>> On 16 Sep 2016, at 15:05, Gianluca Cecchi  
>>> wrote:
>>> 
>>> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek 
>>>  wrote:
 no, that’s not how HA works today. When you log into a guest and issue 
 “shutdown” we do not restart the VM under your hands. We can argue how 
 it should or may work, but this is the defined behavior since the dawn 
 of oVirt.
 
> AFAIK that's correct, we need to be able to shut down an HA VM without it 
> being immediately restarted on a different host. We want to restart an HA VM 
> only if the host where the HA VM is running is non-responsive.
 
 we try to restart it in all other cases other than user initiated 
 shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>>> Hi, just another question in case HA is not configured at all.
>> 
>> by “HA configured” I expect you’re 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Michal Skrivanek

> On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
> 
> Hi Simone.
> Exactly.
> Now I'll put the journald on the guest and try to understand how the guest 
> off.

great. thanks

>  
> 16.09.2016, 16:25, "Simone Tiraboschi" :
>>  
>>  
>> On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek wrote:
>>  
>>> On 16 Sep 2016, at 15:05, Gianluca Cecchi wrote:
>>>  
>>> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek wrote:
>>>  
>>> no, that’s not how HA works today. When you log into a guest and issue 
>>> “shutdown” we do not restart the VM under your hands. We can argue how it 
>>> should or may work, but this is the defined behavior since the dawn of 
>>> oVirt.
>>>  
 
  ​AFAIK that's correct, we need to be able ​​shutdown HA VM​​​ without 
 being it immediately restarted on different host. We want to restart HA VM 
 only if host, where HA VM is running, is non-responsive.
>>>  
>>> we try to restart it in all other cases other than user initiated shutdown, 
>>> e.g. a QEMU process crash on an otherwise-healthy host
>>>  
>>> Hi, just another question in case HA is not configured at all.
>>  
>> by “HA configured” I expect you’re referring to the “Highly Available” 
>> checkbox in Edit VM dialog.
>>  
>>> 
>>> If I run the "shutdown -h now" command on an host where some VMs are 
>>> running, what is the expected behavior?
>>> Clean VM shutdown (with or without timeout in case it doesn't complete?) or 
>>> crash of their related QEMU processes?
>>  
>> expectation is that you won’t do that. That’s why there is the Maintenance 
>> host state.
>> But if you do that regardless, with VMs running, all the processes will be 
>> terminated in a regular system way, i.e. all QEMU processes get SIGTERM. 
>> From the perspective of each guest this is not a clean shutdown and it would 
>> just get killed 
>>  
>>  
>> Aleksey is reporting that he started a shutdown on his host via power 
>> management and the VM processes didn't get roughly killed but were smoothly shut 
>> down, and so they didn't restart regardless of their HA flag; hence this 
>> thread.

Gianluca talks about “shutdown -h now”, you talk about a power management action; 
those are two different things. The current idea is that systemd or some other 
component just propagates the action to the guest, and if that guest is 
configured to handle it as a shutdown it starts one itself as well, so it looks 
like a user-initiated one. Even though this mostly makes sense, it is not OK for 
the current HA logic.
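
To see which of the two a given host actually does with the physical/iLO power button, it can help to check how systemd-logind on the host is configured and what kind of shutdown was recorded; a diagnostic sketch, assuming a systemd-based host:

  # on the host: how logind reacts to the ACPI power button (default is poweroff)
  grep -i '^HandlePowerKey' /etc/systemd/logind.conf || echo "not set (default: poweroff)"
  # whether the last host shutdown was a clean one
  last -x shutdown reboot | head
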

>>  
>>  
>> Thanks,
>> michal
>>> 
>>>  
>>> Thanks,
>>> Gianluca
>>> ___
>>> Users mailing list
>>> Users@ovirt.org 
>>> http://lists.ovirt.org/mailman/listinfo/users 
>>> 
>> ___
>> Users mailing list
>> Users@ovirt.org 
>> http://lists.ovirt.org/mailman/listinfo/users 
>> 
>>  
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Simone Tiraboschi
On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
>
> Hi Simone.
> Exactly.
> Now I'll put the journald on the guest and try to understand how the guest
> off.
>
>
> great. thanks
>
>
> 16.09.2016, 16:25, "Simone Tiraboschi" :
>
>
>
> On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
>
>
>
> On 16 Sep 2016, at 15:05, Gianluca Cecchi 
> wrote:
>
> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
>
>
> no, that’s not how HA works today. When you log into a guest and issue
> “shutdown” we do not restart the VM under your hands. We can argue how it
> should or may work, but this is the defined behavior since the dawn of
> oVirt.
>
>
>
> AFAIK that's correct, we need to be able to shut down an HA VM without it
> being immediately restarted on a different host. We want to restart an HA VM
> only if the host where the HA VM is running is non-responsive.
>
>
> we try to restart it in all other cases other than user initiated
> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>
>
> Hi, just another question in case HA is not configured at all.
>
>
> by “HA configured” I expect you’re referring to the “Highly Available”
> checkbox in Edit VM dialog.
>
>
> If I run the "shutdown -h now" command on an host where some VMs are
> running, what is the expected behavior?
> Clean VM shutdown (with or without timeout in case it doesn't complete?)
> or crash of their related QEMU processes?
>
>
> expectation is that you won’t do that. That’s why there is the Maintenance
> host state.
> But if you do that regardless, with VMs running, all the processes will be
> terminated in a regular system way, i.e. all QEMU processes get SIGTERM.
> From the perspective of each guest this is not a clean shutdown and it
> would just get killed
>
>
>
> Aleksey is reporting that he started a shutdown on his host by power
> management and the VM processes didn't get roughly killed but smoothly shut
> down and so they didn't restarted regardless of their HA flag and so this
> thread.
>
>
> Gianluca talks about “shutdown -h now”, you talk about a power management
> action; those are two different things. The current idea is that systemd or
> some other component just propagates the action to the guest, and if that
> guest is configured to handle it as a shutdown it starts one itself as well,
> so it looks like a user-initiated one. Even though this mostly makes sense,
> it is not OK for the current HA logic.
>
>
Aleksey, can you please also test this scenario?
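
For anyone reproducing this, the two host-side cases being compared can be triggered roughly as follows; the second one hard-kills the host, so only try it on a test box, and sysrq must be enabled:

  # case 1: clean host shutdown - guests may shut down "from within" and HA will not restart them
  shutdown -h now
  # case 2: abrupt host death - should go through fencing and an HA restart
  echo 1 > /proc/sys/kernel/sysrq
  echo c > /proc/sysrq-trigger    # immediate kernel crash, no clean shutdown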

>
>
>
> Thanks,
> michal
>
>
> Thanks,
> Gianluca
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread aleksey . maksimov
Hi Simone.
Exactly.
Now I'll put the journald on the guest and try to understand how the guest went off.

16.09.2016, 16:25, "Simone Tiraboschi" :

 On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek  wrote:

 On 16 Sep 2016, at 15:05, Gianluca Cecchi  wrote:

 On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek  wrote:

 no, that’s not how HA works today. When you log into a guest and issue
 “shutdown” we do not restart the VM under your hands. We can argue how it
 should or may work, but this is the defined behavior since the dawn of oVirt.

 AFAIK that's correct, we need to be able to shut down an HA VM without it
 being immediately restarted on a different host. We want to restart an HA VM
 only if the host where the HA VM is running is non-responsive.

 we try to restart it in all other cases other than user initiated shutdown,
 e.g. a QEMU process crash on an otherwise-healthy host

 Hi, just another question in case HA is not configured at all.

 by “HA configured” I expect you’re referring to the “Highly Available”
 checkbox in Edit VM dialog.

 If I run the "shutdown -h now" command on an host where some VMs are
 running, what is the expected behavior?
 Clean VM shutdown (with or without timeout in case it doesn't complete?) or
 crash of their related QEMU processes?

 expectation is that you won’t do that. That’s why there is the Maintenance
 host state.
 But if you do that regardless, with VMs running, all the processes will be
 terminated in a regular system way, i.e. all QEMU processes get SIGTERM.
 From the perspective of each guest this is not a clean shutdown and it would
 just get killed

 Aleksey is reporting that he started a shutdown on his host via power
 management and the VM processes didn't get roughly killed but were smoothly
 shut down, and so they didn't restart regardless of their HA flag; hence
 this thread.

 Thanks,
 michal

 Thanks,
 Gianluca
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Simone Tiraboschi
On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> On 16 Sep 2016, at 15:05, Gianluca Cecchi 
> wrote:
>
> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
>
>>
>> no, that’s not how HA works today. When you log into a guest and issue
>> “shutdown” we do not restart the VM under your hands. We can argue how it
>> should or may work, but this is the defined behavior since the dawn of
>> oVirt.
>>
>>
>> AFAIK that's correct, we need to be able to shut down an HA VM without it
>> being immediately restarted on a different host. We want to restart an HA VM
>> only if the host where the HA VM is running is non-responsive.
>>
>>
>> we try to restart it in all other cases other than user initiated
>> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>>
>>
> Hi, just another question in case HA is not configured at all.
>
>
> by “HA configured” I expect you’re referring to the “Highly Available”
> checkbox in Edit VM dialog.
>
> If I run the "shutdown -h now" command on an host where some VMs are
> running, what is the expected behavior?
> Clean VM shutdown (with or without timeout in case it doesn't complete?)
> or crash of their related QEMU processes?
>
>
> expectation is that you won’t do that. That’s why there is the Maintenance
> host state.
> But if you do that regardless, with VMs running, all the processes will be
> terminated in a regular system way, i.e. all QEMU processes get SIGTERM.
> From the perspective of each guest this is not a clean shutdown and it
> would just get killed
>
>
Aleksey is reporting that he started a shutdown on his host via power
management and the VM processes didn't get roughly killed but were smoothly shut
down, and so they didn't restart regardless of their HA flag; hence this
thread.


> Thanks,
> michal
>
>
> Thanks,
> Gianluca
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Gianluca Cecchi
On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> On 16 Sep 2016, at 15:05, Gianluca Cecchi 
> wrote:
>
> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
>
>>
>> no, that’s not how HA works today. When you log into a guest and issue
>> “shutdown” we do not restart the VM under your hands. We can argue how it
>> should or may work, but this is the defined behavior since the dawn of
>> oVirt.
>>
>>
>> AFAIK that's correct, we need to be able to shut down an HA VM without it
>> being immediately restarted on a different host. We want to restart an HA VM
>> only if the host where the HA VM is running is non-responsive.
>>
>>
>> we try to restart it in all other cases other than user initiated
>> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>>
>>
> Hi, just another question in case HA is not configured at all.
>
>
> by “HA configured” I expect you’re referring to the “Highly Available”
> checkbox in Edit VM dialog.
>

Yes


>
> If I run the "shutdown -h now" command on an host where some VMs are
> running, what is the expected behavior?
> Clean VM shutdown (with or without timeout in case it doesn't complete?)
> or crash of their related QEMU processes?
>
>
> expectation is that you won’t do that. That’s why there is the Maintenance
> host state.
> But if you do that regardless, with VMs running, all the processes will be
> terminated in a regular system way, i.e. all QEMU processes get SIGTERM.
> From the perspective of each guest this is not a clean shutdown and it
> would just get killed
>
>
Yes, I was thinking about the scenario of one guy issuing the command (or
pressing the button) by mistake.
Thanks,
Gianluca
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Michal Skrivanek

> On 16 Sep 2016, at 15:05, Gianluca Cecchi  wrote:
> 
> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek 
> > wrote:
> 
> no, that’s not how HA works today. When you log into a guest and issue 
> “shutdown” we do not restart the VM under your hands. We can argue how it 
> should or may work, but this is the defined behavior since the dawn of oVirt.
> 
>> 
>> AFAIK that's correct, we need to be able to shut down an HA VM without it 
>> being immediately restarted on a different host. We want to restart an HA VM 
>> only if the host where the HA VM is running is non-responsive.
> 
> we try to restart it in all other cases other than user initiated shutdown, 
> e.g. a QEMU process crash on an otherwise-healthy host
> 
> 
> Hi, just another question in case HA is not configured at all.

by “HA configured” I expect you’re referring to the “Highly Available” checkbox 
in Edit VM dialog.

> If I run the "shutdown -h now" command on an host where some VMs are running, 
> what is the expected behavior?
> Clean VM shutdown (with or without timeout in case it doesn't complete?) or 
> crash of their related QEMU processes?

expectation is that you won’t do that. That’s why there is the Maintenance host 
state.
But if you do that regardless, with VMs running, all the processes will be 
terminated in a regular system way, i.e. all QEMU processes get SIGTERM. From 
the perspective of each guest this is not a clean shutdown and it would just 
get killed 
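
Putting the host into Maintenance first (from the UI, or for example via the REST API) migrates its running VMs away where possible before you touch the host. A rough sketch, assuming oVirt 4.0's API v4; the engine FQDN, password and host id are placeholders:

  # move the host to Maintenance before any manual shutdown
  curl -s -k -u 'admin@internal:PASSWORD' \
       -H 'Content-Type: application/xml' -d '<action/>' \
       https://engine.example.com/ovirt-engine/api/hosts/HOST_ID/deactivate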

Thanks,
michal
> 
> Thanks,
> Gianluca
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Gianluca Cecchi
On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> no, that’s not how HA works today. When you log into a guest and issue
> “shutdown” we do not restart the VM under your hands. We can argue how it
> should or may work, but this is the defined behavior since the dawn of
> oVirt.
>
>
> AFAIK that's correct, we need to be able to shut down an HA VM without it
> being immediately restarted on a different host. We want to restart an HA VM
> only if the host where the HA VM is running is non-responsive.
>
>
> we try to restart it in all other cases other than user initiated
> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>
>
Hi, just another question in case HA is not configured at all.
If I run the "shutdown -h now" command on an host where some VMs are
running, what is the expected behavior?
Clean VM shutdown (with or without timeout in case it doesn't complete?) or
crash of their related QEMU processes?

Thanks,
Gianluca
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Michal Skrivanek

> On 16 Sep 2016, at 14:23, Martin Perina  wrote:
> 
> 
> 
> On Fri, Sep 16, 2016 at 1:54 PM, Simone Tiraboschi  > wrote:
> 
> 
> On Fri, Sep 16, 2016 at 12:50 PM, Martin Perina  > wrote:
> 
> 
> On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek 
> > wrote:
> 
> > On 16 Sep 2016, at 08:29, aleksey.maksi...@it-kb.ru 
> >  wrote:
> >
> > There are more ideas?
> >
> > 15.09.2016, 14:40, "aleksey.maksi...@it-kb.ru 
> > "  > >:
> >> Martin, I physically turned off the server through the iLO2. See 
> >> screenshots.
> >> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.
> >> The virtual machine has been turned on at the time when the host shut down.
> >>
> >> 15.09.2016, 14:27, "Martin Perina"  >> >:
> >>>  Hi,
> >>>
> >>>  I found out this in the log:
> >>>
> >>>  2016-09-15 12:02:04,661 INFO  
> >>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] 
> >>> (ForkJoinPool-1-worker-6) [] VM 
> >>> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' 
> >>> --> 'Down'
> >>>  2016-09-15 12:02:04,788 INFO  
> >>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
> >>> (ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, 
> >>> Custom Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: 
> >>> User shut down from within the guest
> 
> since it shut down cleanly, can you please check the guest's logs to see what 
> triggered the shutdown? In such cases it is considered a user requested 
> shutdown and such VMs are not restarted automatically
> 
> That's exactly what I meant by my response. From the log it's obvious that the 
> VM was shut down properly, so the engine will not restart it on a different host. 
> Also, on most modern hosts, if you execute a power management off action, a 
> signal is sent to the OS to execute a regular shutdown, so VMs are also shut 
> down properly.
> 
> I understand the reason, but is it really what the user expects?
> 
> I mean, if I set HA mode on a VM I'd expect that the engine takes care to keep 
> it up or restart it if needed, regardless of the shutdown reason.

no, that’s not how HA works today. When you log into a guest and issue 
“shutdown” we do not restart the VM under your hands. We can argue how it 
should or may work, but this is the defined behavior since the dawn of oVirt.

> 
> AFAIK that's correct, we need to be able to shut down an HA VM without it 
> being immediately restarted on a different host. We want to restart an HA VM 
> only if the host where the HA VM is running is non-responsive.

we try to restart it in all other cases other than user initiated shutdown, 
e.g. a QEMU process crash on an otherwise-healthy host

> 
> For instance, on hosted-engine the HA agent, if not in global maintenance 
> mode, will restart the engine VM regardless of who or why it went off.
> 
> Well, HE VM is definitely not a standard HA VM :-)
> 
>  
> ​
> We are aware of a similar issue on specific hw - 
> https://bugzilla.redhat.com/show_bug.cgi?id=1341106 
> 
> 
> >>>
> >>>  If I'm not mistaken, this means that VM was properly shutted down from 
> >>> within itself and in that case it's not restarted automatically. So I'm 
> >>> curious what actions have you made to make host KOM-AD01-VM31 
> >>> non-responsive?
> >>>
> >>>  If you want to test fencing properly, then I suggest you to either block 
> >>> connection between host and engine on host side and forcibly stop 
> >>> ovirtmgmt network interface on host and watch fencing is applied.
> 
> Try above if you want to test fencing. Of course you can always configure a 
> firewall rule to drop all packets between the engine and the host, or unplug the 
> host network cable.
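
A minimal way to simulate that "non-responsive host" condition without touching cables, assuming iptables on the host and the ovirtmgmt bridge named as in this thread (run it from the host console, not over ssh through ovirtmgmt; ENGINE_IP is a placeholder):

  # drop all traffic to/from the engine so the host appears non-responsive
  iptables -I INPUT -s ENGINE_IP -j DROP
  iptables -I OUTPUT -d ENGINE_IP -j DROP
  # or take the management bridge down entirely
  ip link set ovirtmgmt down
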
> 
> >>>
> >>>  Martin
> >>>
> >>>  On Thu, Sep 15, 2016 at 1:16 PM,  >>> > wrote:
>   engine.log for this period.
> 
>   15.09.2016, 14:01, "Martin Perina"   >:
> >  On Thu, Sep 15, 2016 at 12:47 PM,  > > wrote:
> >>  Hi Martin.
> >>  I have a stupid question. Use Watchdog device mandatory to 
> >> automatically start a virtual machine in host Fencing process?
> >
> >  AFAIK it's not, but I'm not an expert; adding Arik.
> >
> >  You need correct power management setup for the hosts and VM has to be 
> > marked as highly available​ for sure.
> >
> >>  15.09.2016, 13:43, "Martin Perina"  >> >:
> >>>  Hi,
> 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Martin Perina
On Fri, Sep 16, 2016 at 1:54 PM, Simone Tiraboschi 
wrote:

>
>
> On Fri, Sep 16, 2016 at 12:50 PM, Martin Perina 
> wrote:
>
>>
>>
>> On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek <
>> michal.skriva...@redhat.com> wrote:
>>
>>>
>>> > On 16 Sep 2016, at 08:29, aleksey.maksi...@it-kb.ru wrote:
>>> >
>>> > There are more ideas?
>>> >
>>> > 15.09.2016, 14:40, "aleksey.maksi...@it-kb.ru" <
>>> aleksey.maksi...@it-kb.ru>:
>>> >> Martin, I physically turned off the server through the iLO2. See
>>> screenshots.
>>> >> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.
>>> >> The virtual machine has been turned on at the time when the host shut
>>> down.
>>> >>
>>> >> 15.09.2016, 14:27, "Martin Perina" :
>>> >>>  Hi,
>>> >>>
>>> >>>  I found out this in the log:
>>> >>>
>>> >>>  2016-09-15 12:02:04,661 INFO  
>>> >>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>>> (ForkJoinPool-1-worker-6) [] VM 
>>> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02)
>>> moved from 'Up' --> 'Down'
>>> >>>  2016-09-15 12:02:04,788 INFO  [org.ovirt.engine.core.dal.dbb
>>> roker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-6) []
>>> Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM
>>> KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
>>>
>>> since it shut down cleanly, can you please check the guest's logs to see
>>> what triggered the shutdown? In such cases it is considered a user
>>> requested shutdown and such VMs are not restarted automatically
>>>
>>
>> That's exactly what I meant by my response. From the log it's obvious
>> that the VM was shut down properly, so the engine will not restart it on a
>> different host. Also, on most modern hosts, if you execute a power management
>> off action, a signal is sent to the OS to execute a regular shutdown, so VMs
>> are also shut down properly.
>>
>
> I understand the reason, but is it really what the user expects?
>
> I mean, if I set HA mode on a VM I'd expect that the engine takes care to
> keep it up or restart it if needed, regardless of the shutdown reason.
>

AFAIK that's correct, we need to be able to shut down an HA VM without it
being immediately restarted on a different host. We want to restart an HA VM
only if the host where the HA VM is running is non-responsive.

For instance, on hosted-engine the HA agent, if not in global maintenance
> mode, will restart the engine VM regardless of who or why it went off.
>

Well, HE VM is definitely not a standard HA VM :-)


>
>
>
>> ​
>>
>>> We are aware of a similar issue on specific hw -
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>>>
>>> >>>
>>> >>>  If I'm not mistaken, this means that VM was properly shutted down
>>> from within itself and in that case it's not restarted automatically. So
>>> I'm curious what actions have you made to make host KOM-AD01-VM31
>>> non-responsive?
>>> >>>
>>> >>>  If you want to test fencing properly, then I suggest you to either
>>> block connection between host and engine on host side and forcibly stop
>>> ovirtmgmt network interface on host and watch fencing is applied.
>>>
>>
>> Try above if you want to test fencing. Of course you can always
>> configure a firewall rule to drop all packets between the engine and the host,
>> or unplug the host network cable.
>>
>> >>>
>>> >>>  Martin
>>> >>>
>>> >>>  On Thu, Sep 15, 2016 at 1:16 PM,  wrote:
>>>   engine.log for this period.
>>> 
>>>   15.09.2016, 14:01, "Martin Perina" :
>>> >  On Thu, Sep 15, 2016 at 12:47 PM, 
>>> wrote:
>>> >>  Hi Martin.
>>> >>  I have a stupid question. Use Watchdog device mandatory to
>>> automatically start a virtual machine in host Fencing process?
>>> >
>>> >  AFAIK it's not, but I'm not an expert; adding Arik.
>>> >
>>> >  You need correct power management setup for the hosts and VM has
>>> to be marked as highly available​ for sure.
>>> >
>>> >>  15.09.2016, 13:43, "Martin Perina" :
>>> >>>  Hi,
>>> >>>
>>> >>>  could you please share whole engine.log?
>>> >>>
>>> >>>  Thanks
>>> >>>
>>> >>>  Martin Perina
>>> >>>
>>> >>>  On Thu, Sep 15, 2016 at 12:01 PM, 
>>> wrote:
>>>   Hello oVirt guru`s !
>>> 
>>>   I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS
>>> 7.2 hosts (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
>>> 
>>>   1. I configured Power Management for the Hosts (successfully
>>> added Fencing Agent for iLO2 from my hosts)
>>> 
>>>   2. I created new VM (KOM-AD01-PBX02) and installed Guest OS
>>> (Ubuntu Server 16.04 LTS) and oVirt Guest Agent
>>>   (As described herein https://blog.it-kb.ru/2016/09/
>>> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-log
>>> 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Simone Tiraboschi
On Fri, Sep 16, 2016 at 12:50 PM, Martin Perina  wrote:

>
>
> On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
>
>>
>> > On 16 Sep 2016, at 08:29, aleksey.maksi...@it-kb.ru wrote:
>> >
>> > There are more ideas?
>> >
>> > 15.09.2016, 14:40, "aleksey.maksi...@it-kb.ru" <
>> aleksey.maksi...@it-kb.ru>:
>> >> Martin, I physically turned off the server through the iLO2. See
>> screenshots.
>> >> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.
>> >> The virtual machine has been turned on at the time when the host shut
>> down.
>> >>
>> >> 15.09.2016, 14:27, "Martin Perina" :
>> >>>  Hi,
>> >>>
>> >>>  I found out this in the log:
>> >>>
>> >>>  2016-09-15 12:02:04,661 INFO  
>> >>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>> (ForkJoinPool-1-worker-6) [] VM 
>> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02)
>> moved from 'Up' --> 'Down'
>> >>>  2016-09-15 12:02:04,788 INFO  [org.ovirt.engine.core.dal.dbb
>> roker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-6) []
>> Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM
>> KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
>>
>> since it shut down cleanly, can you please check the guest's logs to see
>> what triggered the shutdown? In such cases it is considered a user
>> requested shutdown and such VMs are not restarted automatically
>>
>
> That's exactly what I meant by my response. From the log it's obvious
> that the VM was shut down properly, so the engine will not restart it on a
> different host. Also, on most modern hosts, if you execute a power management
> off action, a signal is sent to the OS to execute a regular shutdown, so VMs
> are also shut down properly.
>

I understand the reason, but is it really what the user expects?

I mean, if I set HA mode on a VM I'd expect that the engine takes care to
keep it up or restart it if needed, regardless of the shutdown reason.
For instance, on hosted-engine the HA agent, if not in global maintenance
mode, will restart the engine VM regardless of who or why it went off.



> ​
>
>> We are aware of a similar issue on specific hw -
>> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>>
>> >>>
>> >>>  If I'm not mistaken, this means that VM was properly shutted down
>> from within itself and in that case it's not restarted automatically. So
>> I'm curious what actions have you made to make host KOM-AD01-VM31
>> non-responsive?
>> >>>
>> >>>  If you want to test fencing properly, then I suggest you to either
>> block connection between host and engine on host side and forcibly stop
>> ovirtmgmt network interface on host and watch fencing is applied.
>>
>
> ​Try above if you want to test fencing. Of course you can always configure
> firewall rule to drop all packets between engine and host or unplug host
> network cable​.
>
> >>>
>> >>>  Martin
>> >>>
>> >>>  On Thu, Sep 15, 2016 at 1:16 PM,  wrote:
>>   engine.log for this period.
>> 
>>   15.09.2016, 14:01, "Martin Perina" :
>> >  On Thu, Sep 15, 2016 at 12:47 PM, 
>> wrote:
>> >>  Hi Martin.
>> >>  I have a stupid question. Use Watchdog device mandatory to
>> automatically start a virtual machine in host Fencing process?
>> >
>> >  ​AFAIK it's not, but I'm not na expert, adding Arik.
>> >
>> >  You need correct power management setup for the hosts and VM has
>> to be marked as highly available​ for sure.
>> >
>> >>  15.09.2016, 13:43, "Martin Perina" :
>> >>>  Hi,
>> >>>
>> >>>  could you please share whole engine.log?
>> >>>
>> >>>  Thanks
>> >>>
>> >>>  Martin Perina
>> >>>
>> >>>  On Thu, Sep 15, 2016 at 12:01 PM, 
>> wrote:
>>   Hello oVirt guru`s !
>> 
>>   I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2
>> hosts (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
>> 
>>   1. I configured Power Management for the Hosts (successfully
>> added Fencing Agent for iLO2 from my hosts)
>> 
>>   2. I created new VM (KOM-AD01-PBX02) and installed Guest OS
>> (Ubuntu Server 16.04 LTS) and oVirt Guest Agent
>>   (As described herein https://blog.it-kb.ru/2016/09/
>> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-log
>> ical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>>  In VM settings on "High Availability" I turned on the option
>> "Highly Available" and change "Priority" to "High"
>> 
>>   3. Now I'm trying to check Hard-Fencing and power off my first
>> host (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
>> 
>>   Fencing successfully works and server is automatically turned
>> on, but my HA VM not started on second host (KOM-AD01-VM32).
>> 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Martin Perina
On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> > On 16 Sep 2016, at 08:29, aleksey.maksi...@it-kb.ru wrote:
> >
> > There are more ideas?
> >
> > 15.09.2016, 14:40, "aleksey.maksi...@it-kb.ru" <
> aleksey.maksi...@it-kb.ru>:
> >> Martin, I physically turned off the server through the iLO2. See
> screenshots.
> >> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.
> >> The virtual machine has been turned on at the time when the host shut
> down.
> >>
> >> 15.09.2016, 14:27, "Martin Perina" :
> >>>  Hi,
> >>>
> >>>  I found out this in the log:
> >>>
> >>>  2016-09-15 12:02:04,661 INFO  [org.ovirt.engine.core.
> vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] VM
> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up'
> --> 'Down'
> >>>  2016-09-15 12:02:04,788 INFO  [org.ovirt.engine.core.dal.
> dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-6) []
> Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM
> KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
>
> since it shut down cleanly, can you please check the guest's logs to see
> what triggered the shutdown? In such cases it is considered a user
> requested shutdown and such VMs are not restarted automatically
>

​That's exactly what I meant by my response. From the log it's obvious that
the VM was shut down properly, so the engine will not restart it on a
different host. Also, on most modern hosts, if you execute a power management
"off" action, a signal is sent to the OS to perform a regular shutdown, so
the VMs are also shut down properly.

> We are aware of a similar issue on specific hw -
> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>
> >>>
> >>>  If I'm not mistaken, this means that VM was properly shutted down
> from within itself and in that case it's not restarted automatically. So
> I'm curious what actions have you made to make host KOM-AD01-VM31
> non-responsive?
> >>>
> >>>  If you want to test fencing properly, then I suggest you to either
> block connection between host and engine on host side and forcibly stop
> ovirtmgmt network interface on host and watch fencing is applied.
>

​Try above if you want to test fencing. Of course you can always configure
a firewall rule to drop all packets between the engine and the host, or
unplug the host's network cable.
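
For example, something along these lines on the host should be enough (the
engine address below is just a placeholder, adjust it to your environment):

    # on the host: drop all traffic to/from the engine (ENGINE_IP is a placeholder)
    ENGINE_IP=192.0.2.10
    iptables -I INPUT  -s "$ENGINE_IP" -j DROP
    iptables -I OUTPUT -d "$ENGINE_IP" -j DROP

    # remove the rules again after the test
    iptables -D INPUT  -s "$ENGINE_IP" -j DROP
    iptables -D OUTPUT -d "$ENGINE_IP" -j DROP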

>>>
> >>>  Martin
> >>>
> >>>  On Thu, Sep 15, 2016 at 1:16 PM,  wrote:
>   engine.log for this period.
> 
>   15.09.2016, 14:01, "Martin Perina" :
> >  On Thu, Sep 15, 2016 at 12:47 PM, 
> wrote:
> >>  Hi Martin.
> >>  I have a stupid question. Use Watchdog device mandatory to
> automatically start a virtual machine in host Fencing process?
> >
> >  ​AFAIK it's not, but I'm not na expert, adding Arik.
> >
> >  You need correct power management setup for the hosts and VM has to
> be marked as highly available​ for sure.
> >
> >>  15.09.2016, 13:43, "Martin Perina" :
> >>>  Hi,
> >>>
> >>>  could you please share whole engine.log?
> >>>
> >>>  Thanks
> >>>
> >>>  Martin Perina
> >>>
> >>>  On Thu, Sep 15, 2016 at 12:01 PM, 
> wrote:
>   Hello oVirt guru`s !
> 
>   I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2
> hosts (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
> 
>   1. I configured Power Management for the Hosts (successfully
> added Fencing Agent for iLO2 from my hosts)
> 
>   2. I created new VM (KOM-AD01-PBX02) and installed Guest OS
> (Ubuntu Server 16.04 LTS) and oVirt Guest Agent
>   (As described herein https://blog.it-kb.ru/2016/09/
> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-
> logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>  In VM settings on "High Availability" I turned on the option
> "Highly Available" and change "Priority" to "High"
> 
>   3. Now I'm trying to check Hard-Fencing and power off my first
> host (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
> 
>   Fencing successfully works and server is automatically turned
> on, but my HA VM not started on second host (KOM-AD01-VM32).
> 
>   These events I see in the oVirt web console:
> 
>   Sep 15, 2016 12:08:13 PMHost KOM-AD01-VM31 power
> management was verified successfully.
>   Sep 15, 2016 12:08:13 PMStatus of host KOM-AD01-VM31 was
> set to Up.
>   Sep 15, 2016 12:08:05 PMExecuting power management
> status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent
> ilo:KOM-AD01-ILO31.holding.com.
>   Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 is rebooting.
>   Sep 15, 2016 12:05:48 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread Michal Skrivanek

> On 16 Sep 2016, at 08:29, aleksey.maksi...@it-kb.ru wrote:
> 
> There are more ideas?
> 
> 15.09.2016, 14:40, "aleksey.maksi...@it-kb.ru" :
>> Martin, I physically turned off the server through the iLO2. See screenshots.
>> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.
>> The virtual machine has been turned on at the time when the host shut down.
>> 
>> 15.09.2016, 14:27, "Martin Perina" :
>>>  Hi,
>>> 
>>>  I found out this in the log:
>>> 
>>>  2016-09-15 12:02:04,661 INFO  
>>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] 
>>> (ForkJoinPool-1-worker-6) [] VM 
>>> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' --> 
>>> 'Down'
>>>  2016-09-15 12:02:04,788 INFO  
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
>>> (ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom 
>>> Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut 
>>> down from within the guest

since it shut down cleanly, can you please check the guest's logs to see what
triggered the shutdown? In such cases it is considered a user-requested
shutdown, and such VMs are not restarted automatically.

We are aware of a similar issue on specific hardware -
https://bugzilla.redhat.com/show_bug.cgi?id=1341106
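
To see what actually triggered it inside the guest, something like this on
the VM should show the end of the previous boot (assuming journald keeps a
persistent journal there):

    # on the guest: jump to the end of the previous boot's journal
    journalctl -b -1 -e

    # or search that boot for shutdown/power-related entries
    journalctl -b -1 | grep -iE 'shutdown|power|acpi|logind'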

>>> 
>>>  If I'm not mistaken, this means that VM was properly shutted down from 
>>> within itself and in that case it's not restarted automatically. So I'm 
>>> curious what actions have you made to make host KOM-AD01-VM31 
>>> non-responsive?
>>> 
>>>  If you want to test fencing properly, then I suggest you to either block 
>>> connection between host and engine on host side and forcibly stop ovirtmgmt 
>>> network interface on host and watch fencing is applied.
>>> 
>>>  Martin
>>> 
>>>  On Thu, Sep 15, 2016 at 1:16 PM,  wrote:
  engine.log for this period.
 
  15.09.2016, 14:01, "Martin Perina" :
>  On Thu, Sep 15, 2016 at 12:47 PM,  wrote:
>>  Hi Martin.
>>  I have a stupid question. Use Watchdog device mandatory to 
>> automatically start a virtual machine in host Fencing process?
> 
>  ​AFAIK it's not, but I'm not na expert, adding Arik.
> 
>  You need correct power management setup for the hosts and VM has to be 
> marked as highly available​ for sure.
> 
>>  15.09.2016, 13:43, "Martin Perina" :
>>>  Hi,
>>> 
>>>  could you please share whole engine.log?
>>> 
>>>  Thanks
>>> 
>>>  Martin Perina
>>> 
>>>  On Thu, Sep 15, 2016 at 12:01 PM,  wrote:
  Hello oVirt guru`s !
 
  I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts 
 (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
 
  1. I configured Power Management for the Hosts (successfully added 
 Fencing Agent for iLO2 from my hosts)
 
  2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu 
 Server 16.04 LTS) and oVirt Guest Agent
  (As described herein 
 https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
 In VM settings on "High Availability" I turned on the option 
 "Highly Available" and change "Priority" to "High"
 
  3. Now I'm trying to check Hard-Fencing and power off my first host 
 (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
 
  Fencing successfully works and server is automatically turned on, but 
 my HA VM not started on second host (KOM-AD01-VM32).
 
  These events I see in the oVirt web console:
 
  Sep 15, 2016 12:08:13 PMHost KOM-AD01-VM31 power management 
 was verified successfully.
  Sep 15, 2016 12:08:13 PMStatus of host KOM-AD01-VM31 was set 
 to Up.
  Sep 15, 2016 12:08:05 PMExecuting power management status on 
 Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
 ilo:KOM-AD01-ILO31.holding.com.
  Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 is rebooting.
  Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 was started by 
 SYSTEM.
  Sep 15, 2016 12:05:48 PMPower management start of Host 
 KOM-AD01-VM31 succeeded.
  Sep 15, 2016 12:05:41 PMExecuting power management status on 
 Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
 ilo:KOM-AD01-ILO31.holding.com.
  Sep 15, 2016 12:05:19 PMExecuting power management start on 
 Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
 ilo:KOM-AD01-ILO31.holding.com.
 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-16 Thread aleksey . maksimov
Are there any more ideas?

15.09.2016, 14:40, "aleksey.maksi...@it-kb.ru" :
> Martin, I physically turned off the server through the iLO2. See screenshots.
> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.
> The virtual machine has been turned on at the time when the host shut down.
>
> 15.09.2016, 14:27, "Martin Perina" :
>>  Hi,
>>
>>  I found out this in the log:
>>
>>  2016-09-15 12:02:04,661 INFO  
>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] 
>> (ForkJoinPool-1-worker-6) [] VM 
>> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' --> 
>> 'Down'
>>  2016-09-15 12:02:04,788 INFO  
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
>> (ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom 
>> Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut 
>> down from within the guest
>>
>>  If I'm not mistaken, this means that VM was properly shutted down from 
>> within itself and in that case it's not restarted automatically. So I'm 
>> curious what actions have you made to make host KOM-AD01-VM31 non-responsive?
>>
>>  If you want to test fencing properly, then I suggest you to either block 
>> connection between host and engine on host side and forcibly stop ovirtmgmt 
>> network interface on host and watch fencing is applied.
>>
>>  Martin
>>
>>  On Thu, Sep 15, 2016 at 1:16 PM,  wrote:
>>>  engine.log for this period.
>>>
>>>  15.09.2016, 14:01, "Martin Perina" :
  On Thu, Sep 15, 2016 at 12:47 PM,  wrote:
>  Hi Martin.
>  I have a stupid question. Use Watchdog device mandatory to automatically 
> start a virtual machine in host Fencing process?

  ​AFAIK it's not, but I'm not na expert, adding Arik.

  You need correct power management setup for the hosts and VM has to be 
 marked as highly available​ for sure.

>  15.09.2016, 13:43, "Martin Perina" :
>>  Hi,
>>
>>  could you please share whole engine.log?
>>
>>  Thanks
>>
>>  Martin Perina
>>
>>  On Thu, Sep 15, 2016 at 12:01 PM,  wrote:
>>>  Hello oVirt guru`s !
>>>
>>>  I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts 
>>> (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
>>>
>>>  1. I configured Power Management for the Hosts (successfully added 
>>> Fencing Agent for iLO2 from my hosts)
>>>
>>>  2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu 
>>> Server 16.04 LTS) and oVirt Guest Agent
>>>  (As described herein 
>>> https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>>>     In VM settings on "High Availability" I turned on the option 
>>> "Highly Available" and change "Priority" to "High"
>>>
>>>  3. Now I'm trying to check Hard-Fencing and power off my first host 
>>> (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
>>>
>>>  Fencing successfully works and server is automatically turned on, but 
>>> my HA VM not started on second host (KOM-AD01-VM32).
>>>
>>>  These events I see in the oVirt web console:
>>>
>>>  Sep 15, 2016 12:08:13 PM        Host KOM-AD01-VM31 power management 
>>> was verified successfully.
>>>  Sep 15, 2016 12:08:13 PM        Status of host KOM-AD01-VM31 was set 
>>> to Up.
>>>  Sep 15, 2016 12:08:05 PM        Executing power management status on 
>>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>>> ilo:KOM-AD01-ILO31.holding.com.
>>>  Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 is rebooting.
>>>  Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 was started by 
>>> SYSTEM.
>>>  Sep 15, 2016 12:05:48 PM        Power management start of Host 
>>> KOM-AD01-VM31 succeeded.
>>>  Sep 15, 2016 12:05:41 PM        Executing power management status on 
>>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>>> ilo:KOM-AD01-ILO31.holding.com.
>>>  Sep 15, 2016 12:05:19 PM        Executing power management start on 
>>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>>> ilo:KOM-AD01-ILO31.holding.com.
>>>  Sep 15, 2016 12:05:19 PM        Power management start of Host 
>>> KOM-AD01-VM31 initiated.
>>>  Sep 15, 2016 12:05:19 PM        Auto fence for host KOM-AD01-VM31 was 
>>> started.
>>>  Sep 15, 2016 12:05:11 PM        Executing power management status on 
>>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>>> ilo:KOM-AD01-ILO31.holding.com.
>>>  Sep 15, 2016 12:05:04 PM        Executing power management status on 
>>> Host KOM-AD01-VM31 using 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-15 Thread aleksey . maksimov
Martin, I physically turned off the server through the iLO2. See the screenshots.
I did not touch the virtual machine (KOM-AD01-PBX02) at the same time.
The virtual machine was powered on at the moment the host shut down.

15.09.2016, 14:27, "Martin Perina" :
> Hi,
>
> I found out this in the log:
>
> 2016-09-15 12:02:04,661 INFO  
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] 
> (ForkJoinPool-1-worker-6) [] VM 
> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' --> 
> 'Down'
> 2016-09-15 12:02:04,788 INFO  
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
> (ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom 
> Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut 
> down from within the guest
>
> If I'm not mistaken, this means that VM was properly shutted down from within 
> itself and in that case it's not restarted automatically. So I'm curious what 
> actions have you made to make host KOM-AD01-VM31 non-responsive?
>
> If you want to test fencing properly, then I suggest you to either block 
> connection between host and engine on host side and forcibly stop ovirtmgmt 
> network interface on host and watch fencing is applied.
>
> Martin
>
> On Thu, Sep 15, 2016 at 1:16 PM,  wrote:
>> engine.log for this period.
>>
>> 15.09.2016, 14:01, "Martin Perina" :
>>> On Thu, Sep 15, 2016 at 12:47 PM,  wrote:
 Hi Martin.
 I have a stupid question. Use Watchdog device mandatory to automatically 
 start a virtual machine in host Fencing process?
>>>
>>> ​AFAIK it's not, but I'm not na expert, adding Arik.
>>>
>>> You need correct power management setup for the hosts and VM has to be 
>>> marked as highly available​ for sure.
>>>
 15.09.2016, 13:43, "Martin Perina" :
> Hi,
>
> could you please share whole engine.log?
>
> Thanks
>
> Martin Perina
>
> On Thu, Sep 15, 2016 at 12:01 PM,  wrote:
>> Hello oVirt guru`s !
>>
>> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts 
>> (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
>>
>> 1. I configured Power Management for the Hosts (successfully added 
>> Fencing Agent for iLO2 from my hosts)
>>
>> 2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu 
>> Server 16.04 LTS) and oVirt Guest Agent
>> (As described herein 
>> https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>>    In VM settings on "High Availability" I turned on the option "Highly 
>> Available" and change "Priority" to "High"
>>
>> 3. Now I'm trying to check Hard-Fencing and power off my first host 
>> (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
>>
>> Fencing successfully works and server is automatically turned on, but my 
>> HA VM not started on second host (KOM-AD01-VM32).
>>
>> These events I see in the oVirt web console:
>>
>> Sep 15, 2016 12:08:13 PM        Host KOM-AD01-VM31 power management was 
>> verified successfully.
>> Sep 15, 2016 12:08:13 PM        Status of host KOM-AD01-VM31 was set to 
>> Up.
>> Sep 15, 2016 12:08:05 PM        Executing power management status on 
>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>> ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 is rebooting.
>> Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 was started by SYSTEM.
>> Sep 15, 2016 12:05:48 PM        Power management start of Host 
>> KOM-AD01-VM31 succeeded.
>> Sep 15, 2016 12:05:41 PM        Executing power management status on 
>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>> ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:19 PM        Executing power management start on Host 
>> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>> ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:19 PM        Power management start of Host 
>> KOM-AD01-VM31 initiated.
>> Sep 15, 2016 12:05:19 PM        Auto fence for host KOM-AD01-VM31 was 
>> started.
>> Sep 15, 2016 12:05:11 PM        Executing power management status on 
>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>> ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:04 PM        Executing power management status on 
>> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent 
>> ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:04 PM        Host KOM-AD01-VM31 is non responsive.
>> Sep 15, 2016 12:02:32 PM        Host KOM-AD01-VM31 is not responding. It 
>> will stay in 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-15 Thread Martin Perina
Hi,

I found out this in the log:

2016-09-15 12:02:04,661 INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(ForkJoinPool-1-worker-6) [] VM
'660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' -->
'Down'
2016-09-15 12:02:04,788 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom
Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut
down from within the guest

If I'm not mistaken, this means that the VM was properly shut down from
within itself, and in that case it is not restarted automatically. So I'm
curious what actions you took to make host KOM-AD01-VM31 non-responsive?
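
For reference, those records can be pulled straight out of the engine log
with something like this (assuming the default log location):

    # on the engine machine: find why the engine thinks the VM went down
    grep 'KOM-AD01-PBX02 is down' /var/log/ovirt-engine/engine.log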

If you want to test fencing properly, then I suggest you either block the
connection between the host and the engine on the host side, or forcibly stop
the ovirtmgmt network interface on the host, and watch that fencing is applied.
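
For example, from the host's local console or iLO remote console (not over an
ssh session that itself runs on ovirtmgmt), roughly:

    # take the management network down so the engine loses contact with the host
    ip link set ovirtmgmt down

    # bring it back once you have verified that fencing kicked in
    ip link set ovirtmgmt up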


Martin


On Thu, Sep 15, 2016 at 1:16 PM,  wrote:

> engine.log for this period.
>
> 15.09.2016, 14:01, "Martin Perina" :
> > On Thu, Sep 15, 2016 at 12:47 PM,  wrote:
> >> Hi Martin.
> >> I have a stupid question. Use Watchdog device mandatory to
> automatically start a virtual machine in host Fencing process?
> >
> > ​AFAIK it's not, but I'm not na expert, adding Arik.
> >
> > You need correct power management setup for the hosts and VM has to be
> marked as highly available​ for sure.
> >
> >> 15.09.2016, 13:43, "Martin Perina" :
> >>> Hi,
> >>>
> >>> could you please share whole engine.log?
> >>>
> >>> Thanks
> >>>
> >>> Martin Perina
> >>>
> >>> On Thu, Sep 15, 2016 at 12:01 PM,  wrote:
>  Hello oVirt guru`s !
> 
>  I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts
> (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
> 
>  1. I configured Power Management for the Hosts (successfully added
> Fencing Agent for iLO2 from my hosts)
> 
>  2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu
> Server 16.04 LTS) and oVirt Guest Agent
>  (As described herein https://blog.it-kb.ru/2016/09/
> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-
> logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
> In VM settings on "High Availability" I turned on the option
> "Highly Available" and change "Priority" to "High"
> 
>  3. Now I'm trying to check Hard-Fencing and power off my first host
> (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
> 
>  Fencing successfully works and server is automatically turned on, but
> my HA VM not started on second host (KOM-AD01-VM32).
> 
>  These events I see in the oVirt web console:
> 
>  Sep 15, 2016 12:08:13 PMHost KOM-AD01-VM31 power management
> was verified successfully.
>  Sep 15, 2016 12:08:13 PMStatus of host KOM-AD01-VM31 was set
> to Up.
>  Sep 15, 2016 12:08:05 PMExecuting power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
>  Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 is rebooting.
>  Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 was started by
> SYSTEM.
>  Sep 15, 2016 12:05:48 PMPower management start of Host
> KOM-AD01-VM31 succeeded.
>  Sep 15, 2016 12:05:41 PMExecuting power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
>  Sep 15, 2016 12:05:19 PMExecuting power management start on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
>  Sep 15, 2016 12:05:19 PMPower management start of Host
> KOM-AD01-VM31 initiated.
>  Sep 15, 2016 12:05:19 PMAuto fence for host KOM-AD01-VM31 was
> started.
>  Sep 15, 2016 12:05:11 PMExecuting power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
>  Sep 15, 2016 12:05:04 PMExecuting power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
>  Sep 15, 2016 12:05:04 PMHost KOM-AD01-VM31 is non responsive.
>  Sep 15, 2016 12:02:32 PMHost KOM-AD01-VM31 is not responding.
> It will stay in Connecting state for a grace period of 60 seconds and after
> that an attempt to fence the host will be issued.
>  Sep 15, 2016 12:02:32 PMVDSM KOM-AD01-VM31 command failed:
> Heartbeat exeeded
>  Sep 15, 2016 12:02:04 PMVM KOM-AD01-PBX02 is down. Exit
> message: User shut down from within the guest
> 
>  What am I doing wrong? Why HA VM not start on a second host?
>  ___
>  Users 

Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-15 Thread Martin Perina
On Thu, Sep 15, 2016 at 12:47 PM,  wrote:

> Hi Martin.
> I have a stupid question. Use Watchdog device mandatory to automatically
> start a virtual machine in host Fencing process?
>

AFAIK it's not, but I'm not an expert; adding Arik.

You need a correct power management setup for the hosts, and the VM has to be
marked as highly available, for sure.
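
If you want to set or verify that outside the UI, something along these lines
against the REST API should work (engine URL and credentials below are
placeholders; the VM id is the one from your log):

    # mark the VM as highly available with priority "High" via the REST API (sketch)
    curl -k -u 'admin@internal:password' \
         -H 'Content-Type: application/xml' -X PUT \
         -d '<vm><high_availability><enabled>true</enabled><priority>100</priority></high_availability></vm>' \
         'https://engine.example.com/ovirt-engine/api/vms/660bafca-e9c3-4191-99b4-295ff8553488'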


> 15.09.2016, 13:43, "Martin Perina" :
>
> Hi,
>
> could you please share whole engine.log?
>
> Thanks
>
> Martin Perina
>
>
> On Thu, Sep 15, 2016 at 12:01 PM,  wrote:
>
> Hello oVirt guru`s !
>
> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts (HP
> ProLiant DL 360 G5) connected to shared FC SAN Storage.
>
> 1. I configured Power Management for the Hosts (successfully added Fencing
> Agent for iLO2 from my hosts)
>
> 2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu Server
> 16.04 LTS) and oVirt Guest Agent
> (As described herein https://blog.it-kb.ru/2016/09/
> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-
> logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>In VM settings on "High Availability" I turned on the option "Highly
> Available" and change "Priority" to "High"
>
> 3. Now I'm trying to check Hard-Fencing and power off my first host
> (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
>
> Fencing successfully works and server is automatically turned on, but my
> HA VM not started on second host (KOM-AD01-VM32).
>
> These events I see in the oVirt web console:
>
> Sep 15, 2016 12:08:13 PMHost KOM-AD01-VM31 power management was
> verified successfully.
> Sep 15, 2016 12:08:13 PMStatus of host KOM-AD01-VM31 was set to Up.
> Sep 15, 2016 12:08:05 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 is rebooting.
> Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 was started by SYSTEM.
> Sep 15, 2016 12:05:48 PMPower management start of Host
> KOM-AD01-VM31 succeeded.
> Sep 15, 2016 12:05:41 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:19 PMExecuting power management start on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:19 PMPower management start of Host
> KOM-AD01-VM31 initiated.
> Sep 15, 2016 12:05:19 PMAuto fence for host KOM-AD01-VM31 was
> started.
> Sep 15, 2016 12:05:11 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:04 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:04 PMHost KOM-AD01-VM31 is non responsive.
> Sep 15, 2016 12:02:32 PMHost KOM-AD01-VM31 is not responding. It
> will stay in Connecting state for a grace period of 60 seconds and after
> that an attempt to fence the host will be issued.
> Sep 15, 2016 12:02:32 PMVDSM KOM-AD01-VM31 command failed:
> Heartbeat exeeded
> Sep 15, 2016 12:02:04 PMVM KOM-AD01-PBX02 is down. Exit message:
> User shut down from within the guest
>
>
> What am I doing wrong? Why HA VM not start on a second host?
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-15 Thread aleksey . maksimov
Hi Martin.
I have a stupid question. Use Watchdog device mandatory to automatically start a virtual machine in host Fencing process?

15.09.2016, 13:43, "Martin Perina" :
> Hi,
>
> could you please share whole engine.log?
>
> Thanks
>
> Martin Perina
>
> On Thu, Sep 15, 2016 at 12:01 PM,  wrote:
>> Hello oVirt guru`s !
>>
>> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
>>
>> 1. I configured Power Management for the Hosts (successfully added Fencing Agent for iLO2 from my hosts)
>>
>> 2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu Server 16.04 LTS) and oVirt Guest Agent
>> (As described herein https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>> In VM settings on "High Availability" I turned on the option "Highly Available" and change "Priority" to "High"
>>
>> 3. Now I'm trying to check Hard-Fencing and power off my first host (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
>>
>> Fencing successfully works and server is automatically turned on, but my HA VM not started on second host (KOM-AD01-VM32).
>>
>> These events I see in the oVirt web console:
>>
>> Sep 15, 2016 12:08:13 PM        Host KOM-AD01-VM31 power management was verified successfully.
>> Sep 15, 2016 12:08:13 PM        Status of host KOM-AD01-VM31 was set to Up.
>> Sep 15, 2016 12:08:05 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 is rebooting.
>> Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 was started by SYSTEM.
>> Sep 15, 2016 12:05:48 PM        Power management start of Host KOM-AD01-VM31 succeeded.
>> Sep 15, 2016 12:05:41 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:19 PM        Executing power management start on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:19 PM        Power management start of Host KOM-AD01-VM31 initiated.
>> Sep 15, 2016 12:05:19 PM        Auto fence for host KOM-AD01-VM31 was started.
>> Sep 15, 2016 12:05:11 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:04 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
>> Sep 15, 2016 12:05:04 PM        Host KOM-AD01-VM31 is non responsive.
>> Sep 15, 2016 12:02:32 PM        Host KOM-AD01-VM31 is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued.
>> Sep 15, 2016 12:02:32 PM        VDSM KOM-AD01-VM31 command failed: Heartbeat exeeded
>> Sep 15, 2016 12:02:04 PM        VM KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
>>
>> What am I doing wrong? Why HA VM not start on a second host?
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

2016-09-15 Thread Martin Perina
Hi,

could you please share whole engine.log?

Thanks

Martin Perina


On Thu, Sep 15, 2016 at 12:01 PM,  wrote:

> Hello oVirt guru`s !
>
> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts (HP
> ProLiant DL 360 G5) connected to shared FC SAN Storage.
>
> 1. I configured Power Management for the Hosts (successfully added Fencing
> Agent for iLO2 from my hosts)
>
> 2. I created new VM (KOM-AD01-PBX02) and installed Guest OS (Ubuntu Server
> 16.04 LTS) and oVirt Guest Agent
> (As described herein https://blog.it-kb.ru/2016/09/
> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-
> logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
>In VM settings on "High Availability" I turned on the option "Highly
> Available" and change "Priority" to "High"
>
> 3. Now I'm trying to check Hard-Fencing and power off my first host
> (KOM-AD01-VM31) from his iLO (KOM-AD01-ILO31).
>
> Fencing successfully works and server is automatically turned on, but my
> HA VM not started on second host (KOM-AD01-VM32).
>
> These events I see in the oVirt web console:
>
> Sep 15, 2016 12:08:13 PMHost KOM-AD01-VM31 power management was
> verified successfully.
> Sep 15, 2016 12:08:13 PMStatus of host KOM-AD01-VM31 was set to Up.
> Sep 15, 2016 12:08:05 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 is rebooting.
> Sep 15, 2016 12:05:48 PMHost KOM-AD01-VM31 was started by SYSTEM.
> Sep 15, 2016 12:05:48 PMPower management start of Host
> KOM-AD01-VM31 succeeded.
> Sep 15, 2016 12:05:41 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:19 PMExecuting power management start on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:19 PMPower management start of Host
> KOM-AD01-VM31 initiated.
> Sep 15, 2016 12:05:19 PMAuto fence for host KOM-AD01-VM31 was
> started.
> Sep 15, 2016 12:05:11 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:04 PMExecuting power management status on Host
> KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> Sep 15, 2016 12:05:04 PMHost KOM-AD01-VM31 is non responsive.
> Sep 15, 2016 12:02:32 PMHost KOM-AD01-VM31 is not responding. It
> will stay in Connecting state for a grace period of 60 seconds and after
> that an attempt to fence the host will be issued.
> Sep 15, 2016 12:02:32 PMVDSM KOM-AD01-VM31 command failed:
> Heartbeat exeeded
> Sep 15, 2016 12:02:04 PMVM KOM-AD01-PBX02 is down. Exit message:
> User shut down from within the guest
>
>
> What am I doing wrong? Why HA VM not start on a second host?
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users