On Sun, Mar 7, 2021 at 4:06 PM Yedidyah Bar David <[email protected]> wrote:
>
> On Sun, Mar 7, 2021 at 5:12 AM <[email protected]> wrote:
> >
> > Project: 
> > https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
> > Build: 
> > https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1943/
>
> Failed during 
> -basic-suite-master.test-scenarios.test_004_basic_sanity.test_ha_recovery
> due to timeout. Can't tell exactly why. Now rebuilt, to see if it
> happens again. Basic suite did pass this point with (mostly) the same
> code, and there shouldn't be anything HE-specific in this test. FYI.

It did fail again:

https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1944/

Again, engine.log [1] has no 'VM_DOWN_ERROR'. Any clue?

[1] 
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1944/artifact/exported-artifacts/test_logs/he-basic-suite-master/lago-he-basic-suite-master-engine/_var_log/ovirt-engine/engine.log

>
> Some more details:
>
> The function is:
>
> @order_by(_TEST_LIST)
> def test_ha_recovery(engine_api, get_ansible_host_for_vm):
>     engine = engine_api.system_service()
>     with engine_utils.wait_for_event(engine, [119, 9602, 506]):
>         # VM_DOWN_ERROR event(119)
>         # HA_VM_FAILED event event(9602)
>         # VDS_INITIATED_RUN_VM event(506)
>         ansible_host = get_ansible_host_for_vm(VM2_NAME)
>         pid = ansible_host.shell('pgrep -f qemu.*guest=vm2')['stdout'].strip()
>         ansible_host.shell('kill -KILL {}'.format(pid))
>
>     vm_service = test_utils.get_vm_service(engine, VM2_NAME)
>     assertions.assert_true_within_long(
>         lambda:
>         vm_service.get().status == types.VmStatus.UP
>     )
>     with engine_utils.wait_for_event(engine, 33): # USER_STOP_VM event
>         vm_service.stop()
>
> It failed inside the 'with' statement. Can't find any of the expected
> events in engine.log. In last successful I see all three of them.
>
> We do not actively log anything about running the pgrep command, but
> it's logged by ansible on the host, to /var/log/messages. I see it in
> last successful, but not in above failed run, in any of the two hosts.
> So most likely it didn't even get to run pgrep.
>
> IMO we probably want some more information logged in such a case, to
> allow further investigation.
>
> Best regards,
>
> > Build Number: 1943
> > Build Status:  Still Failing
> > Triggered By: Started by timer
> >
> > -------------------------------------
> > Changes Since Last Success:
> > -------------------------------------
> > Changes for Build #1939
> > [hbraha] ost: Check if user custom repos are used on VM
> >
> >
> > Changes for Build #1940
> > [hbraha] network: add check_repos fixture
> >
> >
> > Changes for Build #1941
> > [Andrej Cernek] network, pylint: Simplify pylintrc file
> >
> >
> > Changes for Build #1942
> > [Andrej Cernek] network, pylint: Simplify pylintrc file
> >
> >
> > Changes for Build #1943
> > [Andrej Cernek] network, pylint: Simplify pylintrc file
> >
> >
> >
> >
> > -----------------
> > Failed Tests:
> > -----------------
> > 1 tests failed.
> > FAILED:  
> > he-basic-suite-master.test-scenarios.test_004_basic_sanity.test_ha_recovery
> >
> > Error Message:
> > AssertionError: False != True after 600 seconds
> >
> > Stack Trace:
> > engine_api = <ovirtsdk4.Connection object at 0x7f408114c1d0>
> > get_ansible_host_for_vm = <function 
> > get_ansible_host_for_vm.<locals>.ansible_host_for at 0x7f4081171950>
> >
> >     @order_by(_TEST_LIST)
> >     def test_ha_recovery(engine_api, get_ansible_host_for_vm):
> >         engine = engine_api.system_service()
> >         with engine_utils.wait_for_event(engine, [119, 9602, 506]):
> >             # VM_DOWN_ERROR event(119)
> >             # HA_VM_FAILED event event(9602)
> >             # VDS_INITIATED_RUN_VM event(506)
> >             ansible_host = get_ansible_host_for_vm(VM2_NAME)
> >             pid = ansible_host.shell('pgrep -f 
> > qemu.*guest=vm2')['stdout'].strip()
> > >           ansible_host.shell('kill -KILL {}'.format(pid))
> >
> > ../he-basic-suite-master/test-scenarios/test_004_basic_sanity.py:1070:
> > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> > _ _
> > /usr/lib64/python3.6/contextlib.py:88: in __exit__
> >     next(self.gen)
> > ../ost_utils/ost_utils/engine_utils.py:42: in wait_for_event
> >     lambda:
> > ../ost_utils/ost_utils/assertions.py:96: in assert_true_within_long
> >     assert_equals_within_long(func, True, allowed_exceptions)
> > ../ost_utils/ost_utils/assertions.py:81: in assert_equals_within_long
> >     func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions
> > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> > _ _
> >
> > func = <function wait_for_event.<locals>.<lambda> at 0x7f40810d5bf8>
> > value = True, timeout = 600, allowed_exceptions = [], initial_wait = 0
> > error_message = 'False != True after 600 seconds'
> >
> >     def assert_equals_within(
> >         func, value, timeout, allowed_exceptions=None, initial_wait=10,
> >         error_message=None
> >     ):
> >         allowed_exceptions = allowed_exceptions or []
> >         res = '<no-result-obtained>'
> >         with _EggTimer(timeout) as timer:
> >             while not timer.elapsed():
> >                 try:
> >                     res = func()
> >                     if res == value:
> >                         return
> >                 except Exception as exc:
> >                     if _instance_of_any(exc, allowed_exceptions):
> >                         time.sleep(3)
> >                         continue
> >
> >                     LOGGER.exception("Unhandled exception in %s", func)
> >                     raise
> >
> >                 if initial_wait == 0:
> >                     time.sleep(3)
> >                 else:
> >                     time.sleep(initial_wait)
> >                     initial_wait = 0
> >         try:
> >             if error_message is None:
> >                 error_message = '%s != %s after %s seconds' % (res, value, 
> > timeout)
> > >           raise AssertionError(error_message)
> > E           AssertionError: False != True after 600 seconds
> >
> > ../ost_utils/ost_utils/assertions.py:59: AssertionError
>
>
>
> --
> Didi



-- 
Didi
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/BXUP5RXXT2LHQ6EKMOLPRQS55UDLX2QG/

Reply via email to