Hi,

OK, I have posted the reverting patch: https://gerrit.ovirt.org/#/c/99845/

I'm still investigating what the problem is. Sorry for the delay, we had a
public holiday yesterday.

Andrej

On Thu, 9 May 2019 at 11:20, Dafna Ron <[email protected]> wrote:
> Hi,
>
> I have not heard back on this issue and ovirt-engine has been broken for
> the past 3 days.
>
> As this does not seem a simple debug and fix, I suggest reverting the
> patch and investigating later.
>
> Thanks,
> Dafna
>
> On Wed, May 8, 2019 at 9:42 AM Dafna Ron <[email protected]> wrote:
>> Any news?
>>
>> Thanks,
>> Dafna
>>
>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron <[email protected]> wrote:
>>> Thanks for the quick reply and investigation.
>>> Please update me if I can help any further, and if you find the cause
>>> and have a patch, let me know.
>>> Note that the ovirt-engine project is broken, and if we cannot find
>>> the cause relatively fast we should consider reverting the patch to
>>> allow a new package to be built in CQ with the other changes that were
>>> submitted.
>>>
>>> Thanks,
>>> Dafna
>>>
>>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir <[email protected]>
>>> wrote:
>>>> After running a few OSTs manually, it seems that the patch is the
>>>> cause. Investigating...
>>>>
>>>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir <[email protected]>
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> The issue is probably not caused by the patch.
>>>>>
>>>>> This log line means that the VM does not exist in the DB:
>>>>>
>>>>> 2019-05-07 06:02:04,215-04 WARN
>>>>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140]
>>>>> Validation of action 'MigrateMultipleVms' failed for user
>>>>> admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>>>
>>>>> I will investigate more why the VM is missing.
>>>>>
>>>>> On Tue, 7 May 2019 at 14:07, Dafna Ron <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We are failing test upgrade_hosts on
>>>>>> upgrade-from-release-suite-master.
>>>>>> From the logs I can see that we are calling migrate vm when we have
>>>>>> only one host, and the vm seems to have been shut down before the
>>>>>> maintenance call is issued.
>>>>>>
>>>>>> Can you please look into this?
>>>>>>
>>>>>> The suspected patch reported as root cause by CQ is:
>>>>>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>>>>>> command and use it for host maintenance
>>>>>>
>>>>>> Logs are found here:
>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>>>>>
>>>>>> I can see the issue is vm migration when putting the host in
>>>>>> maintenance:
>>>>>>
>>>>>> 2019-05-07 06:02:04,170-04 INFO
>>>>>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
>>>>>> [05592db2-f859-487b-b779-4b32eec5bab3] Running command:
>>>>>> MaintenanceVdsCommand internal: true. Entities affected : ID:
>>>>>> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>>>>>> 2019-05-07 06:02:04,215-04 WARN
>>>>>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140]
>>>>>> Validation of action 'MigrateMultipleVms' failed for user
>>>>>> admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>>>> 2019-05-07 06:02:04,221-04 ERROR
>>>>>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140]
>>>>>> Failed to migrate one or more VMs.
>>>>>> 2019-05-07 06:02:04,227-04 ERROR
>>>>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140]
>>>>>> EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
>>>>>> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>>>>>> 2019-05-07 06:02:04,239-04 INFO
>>>>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>>>>>> Acquired to object
>>>>>> 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
>>>>>> sharedLocks=''}'
>>>>>> 2019-05-07 06:02:04,242-04 INFO
>>>>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477]
>>>>>> Running command: ActivateVdsCommand internal: true. Entities
>>>>>> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction
>>>>>> group MANIPULATE_HOST with role type ADMIN
>>>>>> 2019-05-07 06:02:04,243-04 INFO
>>>>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477]
>>>>>> Before acquiring lock in order to prevent monitoring for host
>>>>>> 'lago-upgrade-from-release-suite-master-host-0' from data-center
>>>>>> 'test-dc'
>>>>>> 2019-05-07 06:02:04,243-04 INFO
>>>>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>>>>>> acquired, from now a monitoring of host will be skipped for host
>>>>>> 'lago-upgrade-from-release-suite-master-host-0' from data-center
>>>>>> 'test-dc'
>>>>>> 2019-05-07 06:02:04,252-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477]
>>>>>> START, SetVdsStatusVDSCommand(HostName =
>>>>>> lago-upgrade-from-release-suite-master-host-0,
>>>>>> SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
>>>>>> status='Unassigned', nonOperationalReason='NONE',
>>>>>> stopSpmFailureLogged='false', maintenanceReason='null'}), log id:
>>>>>> 2c8aa211
>>>>>> 2019-05-07 06:02:04,256-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477]
>>>>>> FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
>>>>>> 2019-05-07 06:02:04,261-04 INFO
>>>>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477]
>>>>>> Activate host finished. Lock released. Monitoring can run now for
>>>>>> host 'lago-upgrade-from-release-suite-master-host-0' from
>>>>>> data-center 'test-dc'
>>>>>> 2019-05-07 06:02:04,265-04 INFO
>>>>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477]
>>>>>> EVENT_ID: VDS_ACTIVATE(16), Activation of host
>>>>>> lago-upgrade-from-release-suite-master-host-0 initiated by
>>>>>> admin@internal-authz.
>>>>>> 2019-05-07 06:02:04,266-04 INFO
>>>>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>>>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>>>>>> freed to object
>>>>>> 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
>>>>>> sharedLocks=''}'
>>>>>> 2019-05-07 06:02:04,484-04 ERROR
>>>>>> [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback]
>>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-96)
>>>>>> [05592db2-f859-487b-b779-4b32eec5bab3] Host
>>>>>> 'lago-upgrade-from-release-suite-master-host-0' failed to move to
>>>>>> maintenance mode. Upgrade process is terminated.
>>>>>>
>>>>>> I can see there was only one vm running:
>>>>>>
>>>>>> [dron@dron post-004_basic_sanity.py]$ ls -l
>>>>>> lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
>>>>>> total 6
>>>>>> drwxrwxr-x. 2 dron dron 1024 May  7 11:49 qemu
>>>>>> -rw-rw-r--. 1 dron dron 4466 May  7 10:12 vm-with-iface.log
>>>>>>
>>>>>> and I can see that there was an attempt to terminate it with an
>>>>>> error that it does not exist:
>>>>>>
>>>>>> stroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
>>>>>> vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0',
>>>>>> gracefully='false', reason='', ignoreNoVm='false'}), log id: 24278e9b
>>>>>> 2019-05-07 06:01:41,082-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>>>>>> (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] START,
>>>>>> DestroyVDSCommand(HostName =
>>>>>> lago-upgrade-from-release-suite-master-host-0,
>>>>>> DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
>>>>>> vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0',
>>>>>> gracefully='false', reason='', ignoreNoVm='false'}), log id: 78bba2f8
>>>>>> 2019-05-07 06:01:42,090-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>>>>>> (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH,
>>>>>> DestroyVDSCommand, return: , log id: 78bba2f8
>>>>>> 2019-05-07 06:01:42,090-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default
>>>>>> task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH,
>>>>>> DestroyVmVDSCommand, return: , log id: 24278e9b
>>>>>> 2019-05-07 06:01:42,094-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>>>>>> (ForkJoinPool-1-worker-4) [] VM
>>>>>> 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' was reported as Down on VDS
>>>>>> '38e1379b-c3b6-4a2e-91df-d1f346e414a9'(lago-upgrade-from-release-suite-master-host-0)
>>>>>> 2019-05-07 06:01:42,096-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>>>>>> (ForkJoinPool-1-worker-4) [] START, DestroyVDSCommand(HostName =
>>>>>> lago-upgrade-from-release-suite-master-host-0,
>>>>>> DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
>>>>>> vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0',
>>>>>> gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dbd31eb
>>>>>> 2019-05-07 06:01:42,114-04 INFO
>>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
>>>>>> (ForkJoinPool-1-worker-4) [] Failed to destroy VM
>>>>>> 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' because VM does not exist,
>>>>>> ignoring
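[Editor's note] The failure mode the thread converges on — host maintenance trying to migrate a VM that was already shut down and removed from the DB — can be sketched as follows. This is plain Python; all names (`Database`, `migrate_multiple_vms`, `maintenance_vds`) are illustrative stand-ins, not the actual ovirt-engine API:

```python
# Minimal sketch of the race suggested by the logs above.
# All names are illustrative; this is not the real ovirt-engine code.

class Database:
    """Stand-in for the engine DB's view of which VMs exist."""
    def __init__(self):
        self.vms = {}  # vm_id -> host_id

    def remove_vm(self, vm_id):
        self.vms.pop(vm_id, None)

    def vm_exists(self, vm_id):
        return vm_id in self.vms


def migrate_multiple_vms(db, vm_ids):
    # Validation step: every VM in the request must still exist in the DB,
    # mirroring the ACTION_TYPE_FAILED_VMS_NOT_FOUND check in the logs.
    missing = [v for v in vm_ids if not db.vm_exists(v)]
    if missing:
        return "ACTION_TYPE_FAILED_VMS_NOT_FOUND"
    return "OK"


def maintenance_vds(db, host_id, running_vm_ids):
    # Maintenance migrates the VMs it *believes* are running on the host.
    # If the list is stale, validation fails and maintenance aborts.
    result = migrate_multiple_vms(db, running_vm_ids)
    if result != "OK":
        return "VDS_MAINTENANCE_FAILED: " + result
    return "host moved to maintenance"


db = Database()
db.vms["dfbd75e2"] = "host-0"

# The test shuts the VM down (DestroyVDSCommand) before maintenance runs...
db.remove_vm("dfbd75e2")

# ...but maintenance still holds the stale VM list, so validation fails.
print(maintenance_vds(db, "host-0", ["dfbd75e2"]))
# VDS_MAINTENANCE_FAILED: ACTION_TYPE_FAILED_VMS_NOT_FOUND
```

Under this reading, the bug is not in the migration itself but in building the to-migrate list before (or without rechecking after) the VM was destroyed.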
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/3HWQQCNHCKQVHGSWPPXEFM56MLCJTNYM/
