On 11/6/18 6:00 PM, Jamo Luhrsen wrote:


On 11/5/18 6:31 PM, Thanh Ha wrote:
Adding integration-dev as I don't think this is an infra issue. I did some 
triaging and found the following:

1. Robot vm is still running and ssh is still accessible when failure occurs
2. CPU / RAM / Storage all sufficient during failure

What is happening, however, is that something is causing the Jenkins Java SSH connection to close at exactly 14 minutes into the job every time, which leads Jenkins to believe the VM is no longer reachable.

I'm suspicious that something is happening in the robot run that is breaking the Jenkins SSH connection. Have any other projects seen this same failure in their CSIT jobs?
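
For anyone checking other CSIT jobs for the same failure, here is a minimal sketch that searches saved console output for the disconnect message quoted further down in this thread. The function name and the sample text are hypothetical; the sample is assembled from the fragments quoted in this thread and is not a real log excerpt.

```python
import re

# Matches the disconnect message quoted in this thread; the agent name
# is the text between the single quotes.
DISCONNECT_RE = re.compile(r"Backing channel '([^']+)' is disconnected")


def find_disconnected_agents(console_text):
    """Return the agent names reported as disconnected, in order of appearance."""
    return DISCONNECT_RE.findall(console_text)


# Hypothetical sample stitched together from the lines quoted in this thread.
sample = (
    "Request Test Traffic    FATAL: command execution failed\n"
    "java.io.EOFException\n"
    "Caused: java.io.IOException: Backing channel "
    "'prd-centos7-robot-2c-8g-42785' is disconnected.\n"
)

print(find_disconnected_agents(sample))
```

An empty list means the message was not found in that console output, which by itself does not rule out other kinds of failure.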


Yeah, I saw this in a job and dismissed it as infra instability and
didn't look any deeper. I think it was a 3node netvirt csit job.

I will keep an eye out for it happening more.

Here it is in a 1node controller CSIT job:
https://jenkins.opendaylight.org/releng/job/controller-csit-1node-benchmark-all-fluorine/213/console

JamO


JamO



Regards,
Thanh


On Tue, Nov 6, 2018 at 8:53 AM Thanh Ha <[email protected]> wrote:

    Hi Lori,

    Sounds like a problem that might be difficult to sort out but I'll poke at this today and see if I can find some clues.

    Regards,
    Thanh

    On Mon, Nov 5, 2018 at 4:51 PM Lori Jakab <[email protected]> wrote:

        [adding helpdesk, not sure who is monitoring infrastructure@]

        On Fri, Nov 2, 2018 at 2:02 PM Lori Jakab <[email protected]> wrote:
         >
         > Hi,
         >
         > For a while the lispflowmapping performance tests on Jenkins have been
         > failing, first intermittently, but now the Neon and Oxygen tests fail
         > almost always:
         >
         > https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-neon/
         > https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-fluorine/
         > https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-oxygen/
         >
         > This is the error message that I found most likely to be useful:
         > "Caused: java.io.IOException: Backing channel
         > 'prd-centos7-robot-2c-8g-42785' is disconnected." see the bottom of
         > the full console log:
         >
         > https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-neon/90/console
         >
         > Performance jobs use two VMs for the tests, and it looks like during
         > the tests the connection from the main VM to the slave is broken. I
         > couldn't find any clues for the root of the problem in these logs.

        Request Test Traffic                                     FATAL: command execution failed
        java.io.EOFException

         >
         > Any ideas on how to fix this? Unless the problem is fixed, these tests
         > just waste infra resources, so the sensible thing to do would be to
         > disable them, which is not the outcome I would prefer. The only other
         > project that seems to still have performance tests is SXP; their tests
         > at least finish, but not without failures, so I don't know how much
         > they are affected by this issue. MDSAL used to have performance tests
         > too, but I can't find them anymore.
         >
         > -Lori
        _______________________________________________
        infrastructure mailing list
        [email protected]
        https://lists.opendaylight.org/mailman/listinfo/infrastructure


_______________________________________________
integration-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/integration-dev
