On Tue, Dec 17, 2019 at 10:11 AM Yedidyah Bar David <[email protected]> wrote:
>
> Hi all,
>
> $subject. [1] has
> ovirt-engine-4.4.0-0.0.master.20191204120550.git04d5d05.el7.noarch .
>
> Tried to look around, and I have a few notes/questions:
>
> 1. Last successful run of [2] is 3 days old, but apparently it wasn't
> published. Any idea why?
>
> 2. Failed runs of [2] are reported to infra, with emails such as:
>
> [CQ]: 105472, 5 (ovirt-engine) failed "ovirt-master" system tests, but
> isn't the failure root cause
>
> Is anyone monitoring these?
>
> Is this the only alerting that CI generates on such failures?
>
> If first is No and second is Yes, then we need someone/something to
> start monitoring. This was discussed a lot, but I do not see any
> change. Ideally, such alerts should be To'ed or Cc'ed to the author
> and reviewers of the patch that CI found to be guilty (which might be
> wrong, that's not the point). Do we plan to have something like this?
> Any idea when it will be ready?
>
> 3. I looked at a few recent failures of [2], specifically [3][4]. Both
> seem to have been killed after a timeout, while running
> 'engine-config'. For [3] that's clear, see [5]:
>
> 2019-12-16 17:11:44,766::log_utils.py::__exit__::611::lago.ssh::DEBUG::end
> task:fb6611dc-55bb-4251-aeda-2578b2ec83a2:Get ssh client for
> lago-basic-suite-master-engine:
> 2019-12-16 17:11:44,931::ssh.py::ssh::58::lago.ssh::DEBUG::Running
> 22e2b6b6 on lago-basic-suite-master-engine: engine-config --set
> VdsmUseNmstate=true
> 2019-12-16 19:55:21,965::cmd.py::exit_handler::921::cli::DEBUG::signal
> 15 was caught
>
> Can't find stdout/stderr of engine-config, so it's hard to tell if it
> outputted anything helpful to understand why it was stuck.
>
> It's hard to tell that about [4], because it has very few artifacts
> collected, no idea why, notably no lago.log, but [6] does show:
>
> # initialize_engine: Success (in 0:04:00)
> # engine_config:
> * Collect artifacts:
> - [Thread-34] lago-basic-suite-master-engine:
> ERROR (in 0:00:04)
> * Collect artifacts: ERROR (in 0:00:04)
> # engine_config: ERROR (in 2:42:57)
> /bin/bash: line 31: 5225 Killed
> ${_STDCI_TIMEOUT_CMD} "3h" "$script_path" < /dev/null
>
> If I run 'engine-config --set VdsmUseNmstate=true' on my
> 20191204120550.git04d5d05 engine, it returns quickly.
>
> Tried also adding a repo pointing at last successful run of [7], which
> is currently [8], and it prompts me to input a version, probably as a
> result of [9]. Ales/Martin, can you please have a look? Thanks.

Something like this might be enough, please take over:

https://gerrit.ovirt.org/105784

But the main point of my mail was the first points.

>
> [1] https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/
> [2] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/
> [3] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17768/
> [4] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17761/
> [5] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17768/artifact/basic-suite.el7.x86_64/lago_logs/lago.log
> [6] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17761/artifact/basic-suite.el7.x86_64/mock_logs/script/stdout_stderr.log
> [7] https://jenkins.ovirt.org/job/ovirt-engine_standard-on-merge/
> [8] https://jenkins.ovirt.org/job/ovirt-engine_standard-on-merge/384/
> [9] https://gerrit.ovirt.org/105440
> --
> Didi

--
Didi
_______________________________________________
Infra mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/4TEQYJOB67NCPO7MNV2JEKDXRV5KZTVU/
