On Mon, Mar 25, 2019 at 12:20 PM Greg Sheremeta <[email protected]> wrote:
> Not related to ui-extensions (which is very frontend only, Javascript
> project)
>

I know it's not, but this is just to surface that these failures need to be
investigated in depth by a relevant team in development to find the root
cause, something that isn't being done enough today IMHO, which leads to OST
being broken for longer periods.

>
> <testcase classname="002_bootstrap" name="add_master_storage_domain"
> time="0.514">
> <error type="exceptions.RuntimeError" *message="Could not find hosts that
> are up in DC test-dc*
>
> -------------------- >> begin captured logging <<
> -------------------- lago.ssh: DEBUG: start
> task:ffc0094a-1134-4072-b8d7-3f2ea75eea7f:Get ssh client for
> lago-basic-suite-4-3-engine: lago.ssh: DEBUG: end
> task:ffc0094a-1134-4072-b8d7-3f2ea75eea7f:Get ssh client for
> lago-basic-suite-4-3-engine: lago.ssh: DEBUG: Ru
>
>
> On Mon, Mar 25, 2019, 5:15 AM Eyal Edri <[email protected]> wrote:
>
>> Still fails, now on a different component. ( ovirt-web-ui-extentions )
>>
>> https://jenkins.ovirt.org/job/ovirt-4.3_change-queue-tester/339/
>>
>> On Fri, Mar 22, 2019 at 3:59 PM Dan Kenigsberg <[email protected]> wrote:
>>
>>>
>>>
>>> On Fri, Mar 22, 2019 at 3:21 PM Marcin Sobczyk <[email protected]>
>>> wrote:
>>>
>>>> Dafna,
>>>>
>>>> in 'verify_add_hosts' we specifically wait for single host to be up
>>>> with a timeout:
>>>>
>>>> 144     up_hosts = hosts_service.list(search='datacenter={} AND status=up'.format(DC_NAME))
>>>> 145     if len(up_hosts):
>>>> 146         return True
>>>>
>>>> The log files say, that it took ~50 secs for one of the hosts to be up
>>>> (seems reasonable) and no timeout is being reported.
>>>> Just after running 'verify_add_hosts', we run
>>>> 'add_master_storage_domain', which calls '_hosts_in_dc' function.
>>>> That function does the exact same check, but it fails:
>>>>
>>>> 113     hosts = hosts_service.list(search='datacenter={} AND status=up'.format(dc_name))
>>>> 114     if hosts:
>>>> 115         if random_host:
>>>> 116             return random.choice(hosts)
>>>>
>>> I don't think it is relevant to our current failure; but I consider
>>> random_host=True as a bad practice. As if we do not have enough moving
>>> parts, we are adding intentional randomness. Reproducibility is far more
>>> important than coverage - particularly for a shared system test like OST.
>>>
>>>> 117         else:
>>>> 118             return sorted(hosts, key=lambda host: host.name)
>>>> 119     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>>>>
>>>>
>>>> I'm also not able to reproduce this issue locally on my server. The
>>>> investigation continues...
>>>>
>>>
>>> I think that it would be fair to take the filtering by host state out of
>>> Engine and into the test, where we can easily log the current status of
>>> each host. Then we'd have better understanding on the next failure.
>>>
>>> On 3/22/19 1:17 PM, Marcin Sobczyk wrote:
>>>>
>>>> Hi,
>>>>
>>>> sure, I'm on it - it's weird though, I did ran 4.3 basic suite for this
>>>> patch manually and everything was ok.
>>>> On 3/22/19 1:05 PM, Dafna Ron wrote:
>>>>
>>>> Hi,
>>>>
>>>> We are failing branch 4.3 for test:
>>>> 002_bootstrap.add_master_storage_domain
>>>>
>>>> It seems that in one of the hosts, the vdsm is not starting
>>>> there is nothing in vdsm.log or in supervdsm.log
>>>>
>>>> CQ identified this patch as the suspected root cause:
>>>>
>>>> https://gerrit.ovirt.org/#/c/98748/ - vdsm: client: Add support for
>>>> flow id
>>>>
>>>> Milan, Marcin, can you please have a look?
>>>>
>>>> full logs:
>>>>
>>>>
>>>> http://jenkins.ovirt.org/job/ovirt-4.3_change-queue-tester/326/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-4.3/post-002_bootstrap.py/
>>>>
>>>> the only error I can see is about host not being up (makes sense as
>>>> vdsm is not running)
>>>>
>>>> Stacktrace
>>>>
>>>> File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
>>>>   testMethod()
>>>> File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
>>>>   self.test(*self.arg)
>>>> File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 142, in wrapped_test
>>>>   test()
>>>> File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 60, in wrapper
>>>>   return func(get_test_prefix(), *args, **kwargs)
>>>> File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 417, in add_master_storage_domain
>>>>   add_iscsi_storage_domain(prefix)
>>>> File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 561, in add_iscsi_storage_domain
>>>>   host=_random_host_from_dc(api, DC_NAME),
>>>> File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 122, in _random_host_from_dc
>>>>   return _hosts_in_dc(api, dc_name, True)
>>>> File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 119, in _hosts_in_dc
>>>>   raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
>>>> 'Could not find hosts that are up in DC test-dc\n-------------------- >>
>>>> begin captured logging << --------------------\nlago.ssh: DEBUG: start
>>>> task:937bdea7-a2a3-47ad-9383-36647ea37ddf:Get ssh client for
>>>> lago-basic-suite-4-3-engine:\nlago.ssh: DEBUG: end
>>>> task:937bdea7-a2a3-47ad-9383-36647ea37ddf:Get ssh client for
>>>> lago-basic-suite-4-3-engine:\nlago.ssh: DEBUG: Running c07b5ee2 on
>>>> lago-basic-suite-4-3-engine: cat /root/multipath.txt\nlago.ssh: DEBUG:
>>>> Command c07b5ee2 on lago-basic-suite-4-3-engine returned with 0\nlago.ssh:
>>>> DEBUG: Command c07b5ee2 on lago-basic-suite-4-3-engine output:\n
>>>> 3600140516f88cafa71243648ea218995\n360014053e28f60001764fed9978ec4b3\n360014059edc777770114a6484891dcf1\n36001405d93d8585a50d43a4ad0bd8d19\n36001405e31361631de14bcf87d43e55a\n\n-----------
>>>>
>>>> _______________________________________________
>>>> Devel mailing list -- [email protected]
>>>> To unsubscribe send an email to [email protected]
>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>> oVirt Code of Conduct:
>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>> https://lists.ovirt.org/archives/list/[email protected]/message/J4NCHXTK5ZYLXWW36DZKAUL5DN7WBNW4/
>>>>
>>> _______________________________________________
>>> Devel mailing list -- [email protected]
>>> To unsubscribe send an email to [email protected]
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/[email protected]/message/ULS4OKU2YZFDQT5EDFYKLW5GFA52YZ7U/
>>>
>>
>>
>> --
>>
>> Eyal edri
>>
>>
>> MANAGER
>>
>> RHV/CNV DevOps
>>
>> EMEA VIRTUALIZATION R&D
>>
>>
>> Red Hat EMEA <https://www.redhat.com/>
>> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
>> phone: +972-9-7692018
>> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>> _______________________________________________
>> Devel mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/[email protected]/message/EM7QDDNG523F5LLEOGNHBK35BUY6J2ES/
>>
>

--

Eyal edri

MANAGER

RHV/CNV DevOps

EMEA VIRTUALIZATION R&D

Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
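
For reference, Dan's suggestion quoted above (filter by host state in the test
instead of in the Engine search, and log the status of every host) could look
roughly like the sketch below. It is only an illustration under stated
assumptions: `up_hosts_in_dc`, `FakeHost` and the plain-string statuses are
made-up names for this example and are not code from 002_bootstrap.py or from
the SDK. The selection is also deterministic (sorted by name), in line with
the remark about random_host.

```python
import logging
from collections import namedtuple

LOGGER = logging.getLogger(__name__)

# Hypothetical stand-in for the host objects returned by the SDK; only the
# two attributes used below are modelled, and status is a plain string here.
FakeHost = namedtuple('FakeHost', ['name', 'status'])


def up_hosts_in_dc(hosts, dc_name, up_status='up'):
    """Return the hosts that are up, sorted by name.

    `hosts` is assumed to be the full host list of the data center, fetched
    WITHOUT a status filter, so that hosts in any other state show up in the
    log and in the error message.
    """
    for host in hosts:
        LOGGER.debug('DC %s: host %s has status %s',
                     dc_name, host.name, host.status)

    # Filter on the test side and keep the order deterministic.
    up = sorted((h for h in hosts if h.status == up_status),
                key=lambda h: h.name)
    if not up:
        summary = ', '.join(
            '%s=%s' % (h.name, h.status) for h in hosts) or 'no hosts'
        raise RuntimeError(
            'Could not find hosts that are up in DC %s (%s)'
            % (dc_name, summary))
    return up
```

In the real test the list would presumably come from something like
hosts_service.list(search='datacenter={}'.format(dc_name)), with the status
compared against whatever value the SDK actually returns rather than the
string 'up' assumed here. The point is that the next failure would report
each host's state instead of just "no hosts are up".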
