On Tue, Dec 17, 2019 at 10:11 AM Yedidyah Bar David <[email protected]> wrote:
>
> Hi all,
>
> $subject. [1] has
> ovirt-engine-4.4.0-0.0.master.20191204120550.git04d5d05.el7.noarch .
>
> Tried to look around, and I have a few notes/questions:
>
> 1. Last successful run of [2] is 3 days old, but apparently it wasn't
> published. Any idea why?
>
> 2. Failed runs of [2] are reported to infra, with emails such as:
>
> [CQ]: 105472, 5 (ovirt-engine) failed "ovirt-master" system tests, but
> isn't the failure root cause
>
> Is anyone monitoring these?
>
> Is this the only alerting that CI generates on such failures?
>
> If the answer to the first is No and to the second is Yes, then we need
> someone/something to start monitoring. This was discussed a lot, but I
> do not see any
> change. Ideally, such alerts should be To'ed or Cc'ed to the author
> and reviewers of the patch that CI found to be guilty (which might be
> wrong, that's not the point). Do we plan to have something like this?
> Any idea when it will be ready?
>
> 3. I looked at a few recent failures of [2], specifically [3][4]. Both
> seem to have been killed after a timeout, while running
> 'engine-config'. For [3] that's clear, see [5]:
>
> 2019-12-16 17:11:44,766::log_utils.py::__exit__::611::lago.ssh::DEBUG::end
> task:fb6611dc-55bb-4251-aeda-2578b2ec83a2:Get ssh client for
> lago-basic-suite-master-engine:
> 2019-12-16 17:11:44,931::ssh.py::ssh::58::lago.ssh::DEBUG::Running
> 22e2b6b6 on lago-basic-suite-master-engine: engine-config --set
> VdsmUseNmstate=true
> 2019-12-16 19:55:21,965::cmd.py::exit_handler::921::cli::DEBUG::signal
> 15 was caught
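
The excerpt above shows a long silence between the engine-config line and the signal. A minimal sketch for spotting such silences in a lago.log automatically, assuming every entry starts with a timestamp in the same "YYYY-MM-DD HH:MM:SS,mmm::" form as in the excerpt (find_gaps and the 10-minute threshold are illustrative names/values, not part of lago or OST):

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as it appears in the excerpt, e.g. "2019-12-16 17:11:44,766::"
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})::")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (gap, line_before, line_after) for every pair of consecutive
    timestamped lines that are more than `threshold` apart."""
    gaps = []
    prev_ts = None
    prev_line = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = TS_RE.match(line)
        if match is None:
            # Continuation line without a timestamp (e.g. a traceback): skip it.
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        ts += timedelta(milliseconds=int(match.group(2)))
        if prev_ts is not None and ts - prev_ts > threshold:
            gaps.append((ts - prev_ts, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps


if __name__ == "__main__":
    # The three log entries quoted above, unwrapped to one line each.
    sample = [
        "2019-12-16 17:11:44,766::log_utils.py::__exit__::611::lago.ssh::DEBUG::end task:fb6611dc-55bb-4251-aeda-2578b2ec83a2:Get ssh client for lago-basic-suite-master-engine:",
        "2019-12-16 17:11:44,931::ssh.py::ssh::58::lago.ssh::DEBUG::Running 22e2b6b6 on lago-basic-suite-master-engine: engine-config --set VdsmUseNmstate=true",
        "2019-12-16 19:55:21,965::cmd.py::exit_handler::921::cli::DEBUG::signal 15 was caught",
    ]
    for gap, before, after in find_gaps(sample):
        print("silent for", gap)
        print("  last line before:", before)
        print("  first line after:", after)
```

On a real log, the same function can be fed an open file object; the "last line before" of the largest gap is the place to start looking.
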
>
> I can't find the stdout/stderr of engine-config, so it's hard to tell
> whether it printed anything that would explain why it was stuck.
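
One way to make sure a stuck command leaves evidence behind is to give it its own, shorter timeout and keep whatever it printed before being killed. A minimal sketch using only the Python standard library; it runs a local process, whereas the suite runs engine-config over ssh through lago, so run_with_timeout and the child command below are illustrative assumptions, not existing OST or lago code:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run `cmd` and return (returncode, stdout, stderr, timed_out).

    stdin is /dev/null, so a command that unexpectedly prompts for input
    gets EOF instead of waiting forever. On timeout the child is killed and
    the output it produced up to that point is still returned.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, err = proc.communicate(timeout=timeout_s)
        return proc.returncode, out, err, False
    except subprocess.TimeoutExpired:
        proc.kill()
        # Second communicate() collects the partial output after the kill.
        out, err = proc.communicate()
        return proc.returncode, out, err, True


if __name__ == "__main__":
    # Stand-in for a command that prints a prompt and then hangs.
    child = [
        sys.executable,
        "-c",
        "import time; print('waiting for input...', flush=True); time.sleep(60)",
    ]
    rc, out, err, timed_out = run_with_timeout(child, timeout_s=3)
    print("timed out:", timed_out)
    print("stdout so far:", out.strip())
```

With a per-command timeout well below the 3h job limit, the log would show both that the command was stuck and what it last printed.
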
>
> It's harder to tell for [4], because very few artifacts were collected
> (no idea why), notably no lago.log, but [6] does show:
>
>   # initialize_engine: Success (in 0:04:00)
>   # engine_config:
>     * Collect artifacts:
>       - [Thread-34] lago-basic-suite-master-engine: ERROR (in 0:00:04)
>     * Collect artifacts: ERROR (in 0:00:04)
>   # engine_config: ERROR (in 2:42:57)
> /bin/bash: line 31:  5225 Killed
> ${_STDCI_TIMEOUT_CMD} "3h" "$script_path" < /dev/null
>
> If I run 'engine-config --set VdsmUseNmstate=true' on my
> 20191204120550.git04d5d05 engine, it returns quickly.
>
> I also tried adding a repo pointing at the last successful run of [7],
> which is currently [8]. With that build, engine-config prompts me to
> input a version, probably as a result of [9]. Ales/Martin, can you
> please have a look? Thanks.

Something like this might be enough, please take over:

https://gerrit.ovirt.org/105784

But the main point of my mail was the first two points.

>
> [1] https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/
> [2] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/
> [3] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17768/
> [4] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17761/
> [5] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17768/artifact/basic-suite.el7.x86_64/lago_logs/lago.log
> [6] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17761/artifact/basic-suite.el7.x86_64/mock_logs/script/stdout_stderr.log
> [7] https://jenkins.ovirt.org/job/ovirt-engine_standard-on-merge/
> [8] https://jenkins.ovirt.org/job/ovirt-engine_standard-on-merge/384/
> [9] https://gerrit.ovirt.org/105440
> --
> Didi



-- 
Didi
_______________________________________________
Infra mailing list -- [email protected]
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/4TEQYJOB67NCPO7MNV2JEKDXRV5KZTVU/
