Hi,

Ok, I have posted the reverting patch: https://gerrit.ovirt.org/#/c/99845/

I'm still investigating what the problem is. Sorry for the delay, we had a
public holiday yesterday.


Andrej

On Thu, 9 May 2019 at 11:20, Dafna Ron <[email protected]> wrote:

> Hi,
>
> I have not heard back on this issue and ovirt-engine has been broken for
> the past 3 days.
>
> As this does not seem to be a simple debug and fix, I suggest reverting
> the patch and investigating later.
>
> thanks,
> Dafna
>
>
>
> On Wed, May 8, 2019 at 9:42 AM Dafna Ron <[email protected]> wrote:
>
>> Any news?
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron <[email protected]> wrote:
>>
>>> Thanks for the quick reply and investigation.
>>> Please update me if I can help any further, and let me know if you
>>> find the cause and have a patch.
>>> Note that the ovirt-engine project is broken, and if we cannot find
>>> the cause relatively fast we should consider reverting the patch to
>>> allow a new package to be built in CQ with the other changes that
>>> were submitted.
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir <[email protected]>
>>> wrote:
>>>
>>>> After running a few OSTs manually, it seems that the patch is the
>>>> cause. Investigating...
>>>>
>>>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> The issue is probably not caused by the patch.
>>>>>
>>>>> This log line means that the VM does not exist in the DB:
>>>>>
>>>>> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>>>
>>>>> I will investigate further why the VM is missing.
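>>>>>
>>>>> As an illustration, here is a minimal hypothetical sketch of how such
>>>>> a validation typically looks (the method shape, DAO call and parameter
>>>>> names are my assumptions, not the actual MigrateMultipleVmsCommand
>>>>> code): the command resolves the requested VM ids against the DB and
>>>>> fails validation when any of them has no row.
>>>>>
>>>>> // Hypothetical sketch, not the real engine code.
>>>>> @Override
>>>>> protected boolean validate() {
>>>>>     // resolve the requested ids against the DB
>>>>>     List<VM> vms = vmDao.getVmsByIds(getParameters().getVmIds());
>>>>>     if (vms.size() < getParameters().getVmIds().size()) {
>>>>>         // at least one id has no row in the DB -> the warning seen above
>>>>>         return failValidation(EngineMessage.ACTION_TYPE_FAILED_VMS_NOT_FOUND);
>>>>>     }
>>>>>     return true;
>>>>> }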
>>>>>
>>>>> On Tue, 7 May 2019 at 14:07, Dafna Ron <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are failing test upgrade_hosts on
>>>>>> upgrade-from-release-suite-master.
>>>>>> From the logs I can see that we are calling migrate VM when we have
>>>>>> only one host, and the VM seems to have been shut down before the
>>>>>> maintenance call is issued.
>>>>>>
>>>>>> Can you please look into this?
>>>>>>
>>>>>> The suspected patch reported as root cause by CQ is:
>>>>>>
>>>>>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>>>>>> command and use it for host maintenance
>>>>>>
>>>>>>
>>>>>> logs are found here:
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>>>>>
>>>>>>
>>>>>> I can see the issue is VM migration when putting the host into maintenance:
>>>>>>
>>>>>>
>>>>>> 2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>>>>>> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>>>> 2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
>>>>>> 2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>>>>>> 2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
>>>>>> 2019-05-07 06:02:04,242-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running command: ActivateVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
>>>>>> 2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before acquiring lock in order to prevent monitoring for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>>>>> 2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock acquired, from now a monitoring of host will be skipped for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>>>>> 2019-05-07 06:02:04,252-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', status='Unassigned', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2c8aa211
>>>>>> 2019-05-07 06:02:04,256-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
>>>>>> 2019-05-07 06:02:04,261-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate host finished. Lock released. Monitoring can run now for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>>>>> 2019-05-07 06:02:04,265-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVENT_ID: VDS_ACTIVATE(16), Activation of host lago-upgrade-from-release-suite-master-host-0 initiated by admin@internal-authz.
>>>>>> 2019-05-07 06:02:04,266-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
>>>>>> 2019-05-07 06:02:04,484-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [05592db2-f859-487b-b779-4b32eec5bab3] Host 'lago-upgrade-from-release-suite-master-host-0' failed to move to maintenance mode. Upgrade process is terminated.
>>>>>>
>>>>>> I can see there was only one VM running:
>>>>>>
>>>>>>
>>>>>> drwxrwxr-x. 2 dron dron 1024 May 7 11:49 qemu
>>>>>> [dron@dron post-004_basic_sanity.py]$ ls -l
>>>>>> lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
>>>>>> total 6
>>>>>> -rw-rw-r--. 1 dron dron 4466 May 7 10:12 vm-with-iface.log
>>>>>>
>>>>>> and I can see that there was an attempt to destroy it, with an error
>>>>>> that it does not exist (a short sketch of the ignoreNoVm handling
>>>>>> follows the log):
>>>>>>
>>>>>>
>>>>>> stroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 24278e9b
>>>>>> 2019-05-07 06:01:41,082-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 78bba2f8
>>>>>> 2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVDSCommand, return: , log id: 78bba2f8
>>>>>> 2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVmVDSCommand, return: , log id: 24278e9b
>>>>>> 2019-05-07 06:01:42,094-04 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-4) [] VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' was reported as Down on VDS '38e1379b-c3b6-4a2e-91df-d1f346e414a9'(lago-upgrade-from-release-suite-master-host-0)
>>>>>> 2019-05-07 06:01:42,096-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dbd31eb
>>>>>> 2019-05-07 06:01:42,114-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] Failed to destroy VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' because VM does not exist, ignoring
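>>>>>>
>>>>>> For clarity, a hypothetical sketch of the ignoreNoVm behaviour visible
>>>>>> in the last two records (assumed shape and helper names, not the
>>>>>> actual DestroyVDSCommand code): when the host reports that the VM does
>>>>>> not exist, the error is logged and swallowed if ignoreNoVm is set,
>>>>>> otherwise it would surface as a command failure.
>>>>>>
>>>>>> // Hypothetical sketch, not the real broker code.
>>>>>> if (isNoSuchVmError(returnStatus)) {          // assumed helper
>>>>>>     if (getParameters().isIgnoreNoVm()) {
>>>>>>         // as in the monitoring-triggered destroy above: logged, then ignored
>>>>>>         log.info("Failed to destroy VM '{}' because VM does not exist, ignoring",
>>>>>>                 getParameters().getVmId());
>>>>>>     } else {
>>>>>>         // without the flag, a missing VM would fail the command
>>>>>>         throw new VDSErrorException(returnStatus);
>>>>>>     }
>>>>>> }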
>>>>>>
>>>>>>
>>>>>>
>>>>>>