[JIRA] (OVIRT-2498) Failing KubeVirt CI

Barak Korren (oVirt JIRA) Mon, 17 Sep 2018 01:32:51 -0700

    [ 
https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=38037#comment-38037
 ]


Barak Korren commented on OVIRT-2498:
-------------------------------------

[~pkotas] Please be more specific about what you mean by monitoring access. You 
should already be able to monitor the job as it runs in multiple ways: via Blue 
Ocean, the old Jenkins UI, or the STDCI UI.

WRT Docker connection issues: where is the Docker command being launched? If 
it's launched from inside the containers that Kubevirt-CI creates, then it's 
really not something I can control (AFAIK it sets up its own networking inside 
a special "dnsmasq" container).

If the thing that fails tries to connect directly from the STDCI environment to 
some service, it may have to do with the fact that we have a proxy configured 
in the environment. You need to make sure the connections you're trying to set 
up are not being routed via the proxy, by either setting the 'no_proxy' env var 
or unsetting the 'http_proxy' env var. I added code to do this in Kubevirt's 
test.sh a while ago, but maybe someone removed it.
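
As a rough illustration of the two options above (this is not the code that 
was in test.sh; the helper name env_without_proxy and the host list are 
made up for the example), a wrapper could prepare the environment before 
launching anything that talks to a local service:

```python
import os

# Common spellings of the proxy variables; tools differ on which they read.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def env_without_proxy(env, bypass_hosts=(), drop_proxy=False):
    """Return a copy of env in which bypass_hosts are not sent via the proxy.

    Hypothetical helper for illustration only.
    drop_proxy=True  -> remove the proxy variables entirely.
    drop_proxy=False -> keep the proxy, but extend no_proxy with bypass_hosts.
    """
    result = dict(env)
    if drop_proxy:
        for name in PROXY_VARS:
            result.pop(name, None)
        return result

    # Preserve whatever was already excluded from proxying.
    current = result.get("no_proxy") or result.get("NO_PROXY") or ""
    hosts = [h.strip() for h in current.split(",") if h.strip()]
    for host in bypass_hosts:
        if host not in hosts:
            hosts.append(host)

    merged = ",".join(hosts)
    # Set both spellings, since not every tool honours the same one.
    result["no_proxy"] = merged
    result["NO_PROXY"] = merged
    return result


if __name__ == "__main__":
    # Example host list only; the real list depends on what the job contacts.
    env = env_without_proxy(os.environ, ["localhost", "127.0.0.1"])
    print(env["no_proxy"])
```

The returned mapping can be passed as env= to subprocess.run() so that only 
the child process is affected. Which hosts need to be listed depends on what 
the failing step actually connects to.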

> Failing KubeVirt CI
> -------------------
>
>                 Key: OVIRT-2498
>                 URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2498
>             Project: oVirt - virtualization made easy
>          Issue Type: By-EMAIL
>            Reporter: Petr Kotas
>            Assignee: infra
>
> Hi,
> I am working on fixing the issues in the KubeVirt e2e test suites. This
> task is directly related to the unstable CI, which fails due to unknown errors.
> The progress is reported in the CNV trello:
> https://trello.com/c/HNXcMEQu/161-epic-improve-ci
> I am creating this issue since KubeVirt experiences random timeouts on
> random tests most of the time when the test suites run.
> From the outside, the issue shows up as timeouts in different parts of the
> tests. Sometimes the tests fail in the setup phase, again due to a random
> timeout. The example in the link below timed out on a network connection to
> localhost.
> [check-patch.k8s-1.11.0-dev.el7.x86_64]
> requests.exceptions.ReadTimeout:
> UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
> (read timeout=60)
> An example of a failing test suite is here:
> https://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/1916/consoleText
> The list of errors related to the failing CI can be found in my notes:
> https://docs.google.com/document/d/1_ll1DOMHgCRHn_Df9i4uvtRFyMK-bDCHEeGfJFTjvjU/edit#heading=h.vcfoo8hi48ul
> I am not sure whether KubeVirt already shared the resource requirements, so
> I provide a short summary:
> *Resources for KubeVirt e2e tests:*
>    - at least 12GB of RAM - we start 3 nodes (3 docker images), each requiring
>    4GB of RAM
>    - exposed /dev/kvm to enable native virtualization
>    - cached images, since these are used to build the test cluster:
>       - kubevirtci/os-3.10.0-crio:latest
>       - kubevirtci/os-3.10.0-multus:latest
>       - kubevirtci/os-3.10.0:latest
>       - kubevirtci/k8s-1.10.4:latest
>       - kubevirtci/k8s-multus-1.11.1:latest
>       - kubevirtci/k8s-1.11.0:latest
> How can we overcome this? Can we work together to define suitable
> requirements for running the tests so that they pass each time?
> Kind regards,
> Petr Kotas



--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100092)
_______________________________________________
Infra mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/E72ZF7U3VGZZ3UQVNWULVR64LSQBCSXR/
