Re: [ovirt-users] VM switched off when migration for host maintenance fails

2017-03-16 Thread Eduardo Mayoral
Hi, Michal, thank you for your interest.

> Is it reproducible?

No. This is actually the first time that I have seen this behaviour, and a
colleague of mine is developing a script for automatic patching of the hosts,
so we have been putting them into maintenance very frequently over the last
few days.

> It sounds similar to https://bugzilla.redhat.com/show_bug.cgi?id=1426727 but
> it’s not exactly that
> Can you confirm you’re not using QoS/Quota?

Yes, I can confirm we are not using QoS / Quota. Quota is disabled for the
datacenter and there are no QoS policies defined.

> Does the VM fail the same way when you migrate it manually?

No, I just did it several times and it migrated just fine.

> If not, can you please reproduce with Move to Maintenance flow and narrow it
> down - timeframe, exact VM, corresponding engine and vdsm and qemu logs?

I will try to do this, but it will take some time. I will get back with the
results next week.

Best regards!

Eduardo Mayoral Jimeno (emayo...@arsys.es)
Systems administrator. Platforms Department. Arsys internet.
+34 941 620 145 ext. 5153


Re: [ovirt-users] VM switched off when migration for host maintenance fails

2017-03-16 Thread Michal Skrivanek
Hi,
Is it reproducible?
It sounds similar to https://bugzilla.redhat.com/show_bug.cgi?id=1426727 but 
it’s not exactly that
Can you confirm you’re not using QoS/Quota?
Does the VM fail the same way when you migrate it manually?
If not, can you please reproduce with Move to Maintenance flow and narrow it 
down - timeframe, exact VM, corresponding engine and vdsm and qemu logs?
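
As a rough illustration of narrowing a log down to a timeframe, here is a minimal Python sketch. It assumes that every record starts with a timestamp in the form seen in the engine.log excerpt in this thread (2017-03-16 09:56:23,324Z) and that lines without such a timestamp belong to the preceding record. The function names, the shortened sample and the chosen window are made up for the example; vdsm and qemu logs may use a different time zone or layout, so treat it as a starting point only.

```python
from datetime import datetime

# Length of the assumed timestamp prefix, e.g. "2017-03-16 09:56:23,324".
STAMP_LEN = len("2017-03-16 09:56:23,324")


def record_time(line):
    """Return the leading timestamp of a line, or None for continuation lines."""
    try:
        return datetime.strptime(line[:STAMP_LEN], "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None


def window(lines, start, end):
    """Yield every line of the records whose timestamp lies within [start, end]."""
    keep = False
    for line in lines:
        when = record_time(line)
        if when is not None:
            # A new record begins: decide once for all of its lines.
            keep = start <= when <= end
        if keep:
            yield line


# Shortened excerpts of three records from this thread, wrapped as in the mail.
SAMPLE = """\
2017-03-16 09:56:23,324Z INFO
[org.ovirt.engine.core.bll.MigrateVmCommand] (default task-60)
[61a52216] Lock Acquired to object
2017-03-16 09:56:54,755Z INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler3) [54fa9a1b-fddb-4812-b5d2-06cf92834709] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
moved from 'MigratingFrom' --> 'Down'
2017-03-16 09:56:59,193Z INFO
[org.ovirt.engine.core.bll.MigrateVmCommand]
(org.ovirt.thread.pool-7-thread-27) [14d40a88] Lock freed to object
""".splitlines()

start = datetime(2017, 3, 16, 9, 56, 50)
end = datetime(2017, 3, 16, 9, 56, 58)
selected = list(window(SAMPLE, start, end))
for line in selected:
    print(line)
```

The same window can then be applied to the logs of both hosts, after adjusting for any time zone difference between the files.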

Thanks,
michal

> On 16 Mar 2017, at 12:17, Eduardo Mayoral  wrote:
> 
> OK, then,
> 
> Please find the vdsm logs for both source and destination attached.
> 
> Eduardo Mayoral Jimeno (emayo...@arsys.es)
> Administrador de sistemas. Departamento de Plataformas. Arsys internet.
> +34 941 620 145 ext. 5153
> 
> On 16/03/17 12:13, Yaniv Kaul wrote:
>> Please share on the mailing list - I might not get to look at them.
>> It's not too big.
>> Y.
>> 
>> On Thu, Mar 16, 2017 at 1:06 PM, Eduardo Mayoral wrote:
>> Sure! Please find the logs attached. I do not mind sharing them on the
>> mailing list, but I feel they are probably too big.
>>
>> Eduardo Mayoral Jimeno (emayo...@arsys.es)
>> Administrador de sistemas. Departamento de Plataformas. Arsys internet.
>> +34 941 620 145 ext. 5153

Re: [ovirt-users] VM switched off when migration for host maintenance fails

2017-03-16 Thread Yaniv Kaul
On Thu, Mar 16, 2017 at 12:56 PM, Eduardo Mayoral  wrote:

> Hi,
>
> An interesting thing just happened on my oVirt deployment.
>
> While setting a host for maintenance, one of the VMs running on that
> host failed to migrate. Then ovirt-engine for some reason turned the VM
> off. Here are the relevant log lines from engine.log:
>
>
>
>
> 2017-03-16 09:56:23,324Z INFO
> [org.ovirt.engine.core.bll.MigrateVmCommand] (default task-60)
> [61a52216] Lock Acquired to object
> 'EngineLock:{exclusiveLocks='[422663a9-d712-4992-7c81-165b2976073e=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName
> entorno127.arsysdesarrollo.lan>]', sharedLocks='null'}'
> 2017-03-16 09:56:23,957Z INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (default task-60) [61a52216] EVENT_ID:
> VM_MIGRATION_START_SYSTEM_INITIATED(67), Correlation ID: 61a52216, Job
> ID: fbc5e0d7-4618-4ca8-b7a7-0fd9a43f490f, Call Stack: null, Custom Event
> ID: -1, Message: Migration initiated by system (VM:
> entorno127.arsysdesarrollo.lan, Source: llkk594.arsyslan.es,
> Destination: llkk593.arsyslan.es, Reason: Host preparing for maintenance).
> 2017-03-16 09:56:26,818Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (DefaultQuartzScheduler10) [7c6c403b-0a82-47aa-aeb4-57fbe06e20e1] VM
> '422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
> was unexpectedly detected as 'MigratingTo' on VDS
> '43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es) (expected on
> '11a82467-afa4-4e4e-bd92-3383082d0a5e')
> 2017-03-16 09:56:42,149Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (DefaultQuartzScheduler10) [fac8ade9-a867-4ff7-aac5-912f78cc3bb5] VM
> '422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
> was unexpectedly detected as 'MigratingTo' on VDS
> '43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es) (expected on
> '11a82467-afa4-4e4e-bd92-3383082d0a5e')
> 2017-03-16 09:56:54,755Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (DefaultQuartzScheduler3) [54fa9a1b-fddb-4812-b5d2-06cf92834709] VM
> '422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
> moved from 'MigratingFrom' --> 'Down'
> 2017-03-16 09:56:54,755Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (DefaultQuartzScheduler3) [54fa9a1b-fddb-4812-b5d2-06cf92834709] Handing
> over VM
> '422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
> to Host '43f43ec5-e51d-400d-8569-261c98382e3a'. Setting VM to status
> 'MigratingTo'
> 2017-03-16 09:56:58,160Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (DefaultQuartzScheduler4) [dd4ea7bb-85bc-410e-9a8f-357c28d85d1c] VM
> '422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
> is running in db and not running on VDS
> '43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es)
> 2017-03-16 09:56:58,161Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (DefaultQuartzScheduler4) [dd4ea7bb-85bc-410e-9a8f-357c28d85d1c] add VM
> '422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
> to rerun treatment
>   turn to maintenance while VMs are still running on it.(VM:
> entorno127.arsysdesarrollo.lan, Source: llkk594.arsyslan.es,
> Destination: llkk593.arsyslan.es).
> 2017-03-16 09:56:59,193Z INFO
> [org.ovirt.engine.core.bll.MigrateVmCommand]
> (org.ovirt.thread.pool-7-thread-27) [14d40a88] Lock freed to object
> 'EngineLock:{exclusiveLocks='[422663a9-d712-4992-7c81-165b2976073e=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName
> entorno127.arsysdesarrollo.lan>]', sharedLocks='null'}'
>
> Now, I understand a VM migration may fail for a number of reasons, but
> in that case, shouldn't the VM keep running on the source host? I do not
> quite understand what happened here or how to avoid it in the future.
>

Can you share vdsm logs from both source and destination?
Y.


>
> Best regards,
>
> --
> Eduardo Mayoral Jimeno (emayo...@arsys.es)
> Systems administrator. Platforms Department. Arsys internet.
> +34 941 620 145 ext. 5153
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] VM switched off when migration for host maintenance fails

2017-03-16 Thread Eduardo Mayoral
Hi,

An interesting thing just happened on my oVirt deployment.

While I was putting a host into maintenance, one of the VMs running on that
host failed to migrate. Then, for some reason, ovirt-engine turned the VM
off. Here are the relevant log lines from engine.log:




2017-03-16 09:56:23,324Z INFO 
[org.ovirt.engine.core.bll.MigrateVmCommand] (default task-60)
[61a52216] Lock Acquired to object
'EngineLock:{exclusiveLocks='[422663a9-d712-4992-7c81-165b2976073e=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName
entorno127.arsysdesarrollo.lan>]', sharedLocks='null'}'
2017-03-16 09:56:23,957Z INFO 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-60) [61a52216] EVENT_ID:
VM_MIGRATION_START_SYSTEM_INITIATED(67), Correlation ID: 61a52216, Job
ID: fbc5e0d7-4618-4ca8-b7a7-0fd9a43f490f, Call Stack: null, Custom Event
ID: -1, Message: Migration initiated by system (VM:
entorno127.arsysdesarrollo.lan, Source: llkk594.arsyslan.es,
Destination: llkk593.arsyslan.es, Reason: Host preparing for maintenance).
2017-03-16 09:56:26,818Z INFO 
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler10) [7c6c403b-0a82-47aa-aeb4-57fbe06e20e1] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
was unexpectedly detected as 'MigratingTo' on VDS
'43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es) (expected on
'11a82467-afa4-4e4e-bd92-3383082d0a5e')
2017-03-16 09:56:42,149Z INFO 
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler10) [fac8ade9-a867-4ff7-aac5-912f78cc3bb5] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
was unexpectedly detected as 'MigratingTo' on VDS
'43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es) (expected on
'11a82467-afa4-4e4e-bd92-3383082d0a5e')
2017-03-16 09:56:54,755Z INFO 
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler3) [54fa9a1b-fddb-4812-b5d2-06cf92834709] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
moved from 'MigratingFrom' --> 'Down'
2017-03-16 09:56:54,755Z INFO 
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler3) [54fa9a1b-fddb-4812-b5d2-06cf92834709] Handing
over VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
to Host '43f43ec5-e51d-400d-8569-261c98382e3a'. Setting VM to status
'MigratingTo'
2017-03-16 09:56:58,160Z INFO 
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler4) [dd4ea7bb-85bc-410e-9a8f-357c28d85d1c] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
is running in db and not running on VDS
'43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es)
2017-03-16 09:56:58,161Z INFO 
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler4) [dd4ea7bb-85bc-410e-9a8f-357c28d85d1c] add VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
to rerun treatment
  turn to maintenance while VMs are still running on it.(VM:
entorno127.arsysdesarrollo.lan, Source: llkk594.arsyslan.es,
Destination: llkk593.arsyslan.es).
2017-03-16 09:56:59,193Z INFO 
[org.ovirt.engine.core.bll.MigrateVmCommand]
(org.ovirt.thread.pool-7-thread-27) [14d40a88] Lock freed to object
'EngineLock:{exclusiveLocks='[422663a9-d712-4992-7c81-165b2976073e=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName
entorno127.arsysdesarrollo.lan>]', sharedLocks='null'}'

Now, I understand a VM migration may fail for a number of reasons, but
in that case, shouldn't the VM keep running on the source host? I do not
quite understand what happened here or how to avoid it in the future.
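
To make the sequence in the log lines easier to follow, here is a minimal Python sketch that rebuilds a per-VM timeline from engine.log text and picks out the status changes reported by VmAnalyzer. It assumes the record layout seen in the excerpt (timestamp, level, [logger], (thread), [correlation id], message) and tolerates the line wrapping introduced by the mail client. All function names are hypothetical, and the two sample records are copied from the excerpt; it only reorganises what the log already says and does not explain why the VM went down.

```python
import re
from datetime import datetime

# A record starts with a timestamp such as "2017-03-16 09:56:23,324Z" and a level.
RECORD_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})Z\s+(\w+)")
# Assumed remainder of a record: [logger] (thread) [correlation id] message
RECORD_BODY = re.compile(r"\[([^\]]+)\]\s+\(([^)]+)\)\s+\[([^\]]*)\]\s+(.*)")
# VmAnalyzer status change, e.g. "moved from 'MigratingFrom' --> 'Down'"
TRANSITION = re.compile(r"moved from '(\w+)' --> '(\w+)'")


def join_records(lines):
    """Merge wrapped lines so that each list item is one complete log record."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if RECORD_START.match(line):
            records.append(line)
        elif records:
            records[-1] += " " + line
    return records


def vm_timeline(lines, vm_id):
    """Return (timestamp, logger, message) for every record that mentions vm_id."""
    timeline = []
    for record in join_records(lines):
        if vm_id not in record:
            continue
        head = RECORD_START.match(record)
        body = RECORD_BODY.search(record, head.end())
        if body is None:
            continue  # record does not follow the assumed layout
        when = datetime.strptime(head.group(1), "%Y-%m-%d %H:%M:%S,%f")
        logger = body.group(1).rsplit(".", 1)[-1]
        timeline.append((when, logger, body.group(4)))
    return timeline


def status_changes(timeline):
    """Extract (timestamp, old_status, new_status) from 'moved from' messages."""
    changes = []
    for when, _logger, message in timeline:
        found = TRANSITION.search(message)
        if found:
            changes.append((when, found.group(1), found.group(2)))
    return changes


SAMPLE = """\
2017-03-16 09:56:26,818Z INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler10) [7c6c403b-0a82-47aa-aeb4-57fbe06e20e1] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
was unexpectedly detected as 'MigratingTo' on VDS
'43f43ec5-e51d-400d-8569-261c98382e3a'(llkk593.arsyslan.es) (expected on
'11a82467-afa4-4e4e-bd92-3383082d0a5e')
2017-03-16 09:56:54,755Z INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(DefaultQuartzScheduler3) [54fa9a1b-fddb-4812-b5d2-06cf92834709] VM
'422663a9-d712-4992-7c81-165b2976073e'(entorno127.arsysdesarrollo.lan)
moved from 'MigratingFrom' --> 'Down'
""".splitlines()

VM_ID = "422663a9-d712-4992-7c81-165b2976073e"
timeline = vm_timeline(SAMPLE, VM_ID)
for when, logger, message in timeline:
    print(when.time(), logger, message[:60])
changes = status_changes(timeline)
```

Running the same functions over the full engine.log would list every record for this VM in order, which makes it easier to line the engine's view up against the vdsm logs from the source and destination hosts.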

Best regards,

-- 
Eduardo Mayoral Jimeno (emayo...@arsys.es)
Systems administrator. Platforms Department. Arsys internet.
+34 941 620 145 ext. 5153
