[ https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=38040#comment-38040 ]

Petr Kotas commented on OVIRT-2498:
-----------------------------------

[[email protected]] I would like to see the machine resources, that is the 
CPU and RAM usage, to understand how the tests behave live. I am not sure 
whether this works in Blue Ocean.

WRT the docker, as I already pointed out in the logs, the issue occurs well 
before our setup even kicks in.
[Here|https://jenkins.ovirt.org/blue/organizations/jenkins/kubevirt_kubevirt_standard-check-pr/detail/kubevirt_kubevirt_standard-check-pr/1916/pipeline/122]
 is the direct link to that.
It seems that the Jenkins project_setup.sh fails somehow. Again, this is not 
our code; it is part of the standard CI, located 
[here|https://gerrit.ovirt.org/gitweb?p=jenkins.git;a=blob;f=jobs/confs/shell-scripts/project_setup.sh;h=7009f8e4b04a41d5817014acff6589f7bb978c8a;hb=HEAD].
It seems that the project setup was doing its job and then randomly failed due 
to a networking issue. I have no idea why.

Also, I do not think the issue is caused by the proxy, as the failures occur 
totally at random and on random tests.
So I am guessing something more hidden is failing.

> Failing KubeVirt CI
> -------------------
>
>                 Key: OVIRT-2498
>                 URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2498
>             Project: oVirt - virtualization made easy
>          Issue Type: By-EMAIL
>            Reporter: Petr Kotas
>            Assignee: infra
>
> Hi,
> I am working on fixing the issues on the KubeVirt e2e test suites. This
> task is directly related to unstable CI, due to unknown errors.
> The progress is reported in the CNV trello:
> https://trello.com/c/HNXcMEQu/161-epic-improve-ci
> I am creating this issue since KubeVirt experiences random timeouts on
> random tests most of the time the test suites run.
> From the outside, the issue shows up as timeouts in different parts of the
> tests. Sometimes the tests fail in the set-up phase, again due to a random
> timeout. The example in the link below timed out on a network connection
> to localhost.
> [check-patch.k8s-1.11.0-dev.el7.x86_64]
> requests.exceptions.ReadTimeout:
> UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
> (read timeout=60)
> Example of failing test suites is here
> https://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/1916/consoleText
> The list of errors related to the failing CI can be found in my notes
> https://docs.google.com/document/d/1_ll1DOMHgCRHn_Df9i4uvtRFyMK-bDCHEeGfJFTjvjU/edit#heading=h.vcfoo8hi48ul
> I am not sure whether KubeVirt has already shared its resource requirements,
> so I provide a short summary:
> *Resources for KubeVirt e2e tests:*
>    - at least 12GB of RAM - we start 3 nodes (3 docker images), each
>    requiring 4GB of RAM
>    - exposed /dev/kvm to enable native virtualization
>    - cached images, since these are used to build the test cluster:
>       - kubevirtci/os-3.10.0-crio:latest
>       - kubevirtci/os-3.10.0-multus:latest
>       - kubevirtci/os-3.10.0:latest
>       - kubevirtci/k8s-1.10.4:latest
>       - kubevirtci/k8s-multus-1.11.1:latest
>       - kubevirtci/k8s-1.11.0:latest
> How can we overcome this? Can we work together to define suitable
> requirements for running the tests, so that they pass every time?
> Kind regards,
> Petr Kotas
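
The requirements quoted above could be checked on a CI worker with a small 
pre-flight script along these lines. The RAM threshold and the image list 
come from the issue text; the script itself is only a sketch (it prints the 
pull commands as a dry run rather than executing them), not part of the 
standard-CI code:

```shell
#!/bin/sh
# Pre-flight sketch for a KubeVirt e2e worker, based on the requirements
# quoted above. Thresholds and image tags come from the issue; the checks
# themselves are assumptions, not existing standard-CI code.

# At least 12 GB of RAM: 3 nodes (3 docker images) x 4 GB each.
required_kb=$((12 * 1024 * 1024))
avail_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo 2>/dev/null)
if [ "${avail_kb:-0}" -lt "$required_kb" ]; then
    echo "WARN: less than 12GB of RAM available"
fi

# /dev/kvm must be exposed so the nodes can use native virtualization.
[ -e /dev/kvm ] || echo "WARN: /dev/kvm not exposed"

# Images that should be cached before the suite starts (dry run: print
# the pull commands instead of running them).
for img in \
    kubevirtci/os-3.10.0-crio:latest \
    kubevirtci/os-3.10.0-multus:latest \
    kubevirtci/os-3.10.0:latest \
    kubevirtci/k8s-1.10.4:latest \
    kubevirtci/k8s-multus-1.11.1:latest \
    kubevirtci/k8s-1.11.0:latest; do
    echo "docker pull $img"
done
```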



--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100092)
_______________________________________________
Infra mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/EIFI72MSNXJVZA7ENEKXXBPNQLPO2IU7/
