Hello all,

Lately I have seen multiple failures of the add_master_storage_domain test that were related neither to the changes under test nor to any infra issue. One example can be found here [1]. After investigating with a lot of help from Milan, the issue appears to be that the host suddenly drops from the Up state to some other, non-Up state.
1. add_storage_domain picks a random host that is in the Up state.

2. In the meantime, the engine starts a fence action for that host, so something has probably gone wrong with the host. The fence action fails with:

   *[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [6692895f] Can not run fence action on host 'lago-basic-suite-master-host-0', no suitable proxy host was found.*

3. The test fails because it cannot attach the domain to a host that is not Up:

   *[org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Cannot add storage server connection when Host status is not up]*

For easier orientation in the failed job's [1] engine log [2], here are the relevant timestamps:

  :46:12,842-04   the fence action for the host fails
  ~ :46:13,753    the add_master_storage_domain test itself starts (according to the lago log)
  :46:16,105-04   the engine learns it cannot connect storage to the host

Could you please check this?

Thanks

[1] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/15829
[2] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/15829/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log
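
P.S. To make the ordering of the three timestamps above easier to see, here is a minimal Python sketch that sorts them and prints each one as an offset from the earliest. It is only an illustration: the names EVENTS, to_ms and timeline are made up for this message, the hour is left out because it is elided in the timestamps quoted above, and the sketch assumes the lago log and engine.log clocks are comparable (the "-04" zone suffix is dropped).

```python
# Timestamps copied from this message. The hour is elided there, so only
# minute:second,millisecond is parsed. The "-04" zone suffix is dropped on
# the assumption that both logs use the same clock.
EVENTS = [
    ("fence action for the host fails (engine.log)", ":46:12,842"),
    ("engine cannot connect storage to the host (engine.log)", ":46:16,105"),
    ("add_master_storage_domain starts (lago log, approximate)", ":46:13,753"),
]


def to_ms(stamp):
    """Convert ':MM:SS,mmm' into milliseconds since the start of the hour."""
    minutes, rest = stamp.lstrip(":").split(":")
    seconds, millis = rest.split(",")
    return (int(minutes) * 60 + int(seconds)) * 1000 + int(millis)


def timeline(events):
    """Return (offset_ms, label) pairs sorted by time, relative to the first event."""
    ordered = sorted(events, key=lambda event: to_ms(event[1]))
    first = to_ms(ordered[0][1])
    return [(to_ms(stamp) - first, label) for label, stamp in ordered]


if __name__ == "__main__":
    for offset, label in timeline(EVENTS):
        print(f"+{offset:5d} ms  {label}")
```

If the clocks are indeed comparable, this puts the failed fence action a little under a second before the test starts, which would mean the host was already in trouble by the time the test picked it.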
