On 1/18/21 9:58 AM, Yedidyah Bar David wrote:
On Mon, Jan 18, 2021 at 10:53 AM Martin Perina wrote:
On Mon, Jan 18, 2021 at 9:08 AM Yedidyah Bar David wrote:
On Sun, Jan 17, 2021 at 3:11 PM Yedidyah Bar David wrote:
On Thu, Jan 14, 2021 at 1:41 PM Yedidyah Bar David wrote:
On Thu, Jan 14, 2021 at 8:35 AM Yedidyah Bar David wrote:
On Wed, Jan 13, 2021 at 5:34 PM Yedidyah Bar David wrote:
On Wed, Jan 13, 2021 at 2:48 PM Yedidyah Bar David wrote:
On Wed, Jan 13, 2021 at 1:57 PM Marcin Sobczyk wrote:
Hi,
my guess is it's selinux-related.
Unfortunately I can't find any meaningful errors in audit.log in a
scenario where host deployment fails.
However switching selinux to permissive mode before adding hosts makes
the problem go away, so it's probably not an error somewhere in logic.
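For anyone who wants to re-check audit.log, here is a minimal Python sketch that pulls AVC denials out of audit log lines. The sample line, its contexts and the function name are made up for illustration and are not taken from a failing run. Note also that an empty result does not rule out SELinux: denials covered by dontaudit rules are not logged unless those rules are temporarily disabled (as far as I know, `semodule -DB` does that and `semodule -B` restores them).

```python
import re

# Matches the interesting parts of an AVC denial record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}"
    r".*?scontext=(?P<scontext>\S+)"
    r"\s+tcontext=(?P<tcontext>\S+)"
    r"\s+tclass=(?P<tclass>\S+)"
)


def find_avc_denials(lines):
    """Return one dict per AVC denial found in the given audit log lines."""
    denials = []
    for line in lines:
        if "type=AVC" not in line:
            continue
        match = AVC_RE.search(line)
        if match:
            denials.append({
                "perms": match["perms"].split(),
                "scontext": match["scontext"],
                "tcontext": match["tcontext"],
                "tclass": match["tclass"],
            })
    return denials


# Hypothetical sample record, invented for this example only.
SAMPLE = (
    'type=AVC msg=audit(1610536615.997:123): avc:  denied  { write } for  '
    'pid=1234 comm="example" name="requests" dev="dm-0" ino=42 '
    'scontext=system_u:system_r:example_t:s0 '
    'tcontext=system_u:object_r:etc_t:s0 tclass=dir permissive=0'
)

if __name__ == "__main__":
    for denial in find_avc_denials([SAMPLE]):
        print(denial)
```

On a real system the lines would come from /var/log/audit/audit.log on the engine, read as root.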
It's getting weirder: Under strace, it succeeds:
https://gerrit.ovirt.org/c/ovirt-system-tests/+/112948
(Can't see the actual log, as I didn't add '-A', so it was overwritten
on restart...)
After updating it to use '-A' it indeed shows that it worked:
43664 14:16:55.997639 access("/etc/pki/ovirt-engine/requests", W_OK <unfinished ...>
43664 14:16:55.997695 <... access resumed>) = 0
Weird.
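Since the strace log splits interrupted syscalls into an "<unfinished ...>" line and a later "<... resumed>" line, a small Python sketch like the following can stitch them back together to make the log easier to grep. It assumes one record per line, with the pid and timestamp prefixes seen above; the function name is my own and the regexes cover only this simple layout.

```python
import re

# "<pid> <timestamp> call(args <unfinished ...>"
UNFINISHED_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<ts>\S+)\s+(?P<call>\w+)\((?P<args>.*?)\s*<unfinished \.\.\.>$"
)
# "<pid> <timestamp> <... call resumed>rest"
RESUMED_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<ts>\S+)\s+<\.\.\. (?P<call>\w+) resumed>\s?(?P<rest>.*)$"
)


def merge_strace(lines):
    """Join unfinished/resumed pairs into single records, keeping other lines."""
    pending = {}  # (pid, call) -> (start timestamp, args seen so far)
    merged = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = UNFINISHED_RE.match(line)
        if match:
            pending[(match["pid"], match["call"])] = (match["ts"], match["args"])
            continue
        match = RESUMED_RE.match(line)
        if match:
            key = (match["pid"], match["call"])
            if key in pending:
                start_ts, args = pending.pop(key)
                merged.append(
                    f'{match["pid"]} {start_ts} {match["call"]}({args}{match["rest"]}'
                )
                continue
        # Anything else (or a resume without a matching start) passes through.
        merged.append(line)
    return merged


if __name__ == "__main__":
    sample = [
        '43664 14:16:55.997639 access("/etc/pki/ovirt-engine/requests", W_OK <unfinished ...>',
        "43664 14:16:55.997695 <... access resumed>) = 0",
    ]
    for record in merge_strace(sample):
        print(record)
```

The merged record keeps the timestamp of the call's start; the resume timestamp is dropped, which is a simplification.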
Now ran 'ci test' in parallel for this patch and for another one from
master, for comparison:
Again, the same:
https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14916/
With strace, passed,
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1883/
Without strace, failed.
Last nightly run that passed [1] used:
ost-images-el8-host-installed-1-202101100446.x86_64
ovirt-engine-appliance-4.4-20210109182828.1.el8.x86_64
Trying now with these - not sure it's possible to put specific versions inside
automation/*packages, let's see:
https://gerrit.ovirt.org/c/ovirt-system-tests/+/112977
Indeed, with ost-images pinned to a fixed version and updates removed, it
passes. The network suite failed, but he-basic passed:
https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14920/artifact/ci_build_summary.html
So I am quite certain this is an OS issue. Not sure why we do not see
this in basic-suite.
Perhaps it's related to nested-kvm, or to load/slowness caused by that? Weird.
When this fails, we do not collect all of the engine's /var/log, only
messages and ovirt-engine/.
So it's not easy to get a list of the packages that were updated.
Pushed now:
https://github.com/oVirt/ovirt-ansible-collection/pull/202
to get all of engine's /var/log, and ran manual HE job with it:
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7680/
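Once package lists from a passing and a failing run are available (for example saved `rpm -qa` output), a Python sketch along these lines could list what was updated between them. It assumes the default name-version-release.arch format without an epoch, and it keeps only one entry per package name, so multi-version packages such as kernel would need extra care. The function names and the "after" version in the example are hypothetical.

```python
def split_nvra(pkg):
    """Split 'name-version-release.arch' into (name, version, release, arch)."""
    # The name may itself contain hyphens, so split from the right.
    name, version, release_arch = pkg.rsplit("-", 2)
    release, arch = release_arch.rsplit(".", 1)
    return name, version, release, arch


def updated_packages(before, after):
    """Map package name -> (old, new) version-release for packages that changed."""
    old = {}
    for pkg in before:
        name, version, release, _arch = split_nvra(pkg)
        old[name] = f"{version}-{release}"
    changes = {}
    for pkg in after:
        name, version, release, _arch = split_nvra(pkg)
        new = f"{version}-{release}"
        if name in old and old[name] != new:
            changes[name] = (old[name], new)
    return changes


if __name__ == "__main__":
    passing = [
        "ost-images-el8-host-installed-1-202101100446.x86_64",
        "ovirt-engine-appliance-4.4-20210109182828.1.el8.x86_64",
    ]
    # Hypothetical later build, invented for illustration only.
    failing = [
        "ost-images-el8-host-installed-1-202101100446.x86_64",
        "ovirt-engine-appliance-4.4-20210110000000.1.el8.x86_64",
    ]
    for name, (was, now) in updated_packages(passing, failing).items():
        print(f"{name}: {was} -> {now}")
```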
This one I accidentally ran with the wrong repo, then ran another one
with the correct repo [1],
But:
1. The repo wasn't used. Emailed about this in a separate thread: "manual
job does not use custom repo"
2. It passed! Since this seems to be a heisenbug, I understand why it
behaves differently when you run it under strace. But merely intending to
collect more logs also makes it behave differently? :-) This does not mean
"problem solved": the latest nightly run [2] did fail with the same error.
Status:
1. he-basic-suite is still failing.
2. Patch to collect all of /var/log from the engine merged.
Dana, can you please update? Did you have any progress?
IMO it's an OS bug. If Marcin says it's an selinux issue, I do not argue :-).
So, how do we continue?
Switching to CentOS Stream development/testing is a big effort, I'm not sure we
can do this and still deliver all the RFEs/bugs planned for 4.4.5 ...
+1
IMO we should now revert appliance and node to CentOS 8.3, and then
continue the discussion.
Having he-basic-suite broken for a week is too much.
+1 The testing infrastructure for Stream is here, but if it doesn't work
yet then let's stick to the plan and focus on 8.3.
[1] https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7681/
[2] https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1887/
[1] https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1879/
--
Didi
--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.
_______________________________________________
Devel mailing list -- [email protected]
List Archives:
https://lists.ovirt.org/archives/list/[email protected]/message/MHE2YPXUYGDX6IBS265BPXEXLGCEOZWI/