On Fri, Jul 20, 2018 at 10:36 AM Thanh Ha <thanh...@linuxfoundation.org>
wrote:

> On Fri, Jul 20, 2018 at 10:01 AM Tom Pantelis <tompante...@gmail.com>
> wrote:
>
>> On Fri, Jul 20, 2018 at 4:48 AM, Anil Belur <abe...@linuxfoundation.org>
>> wrote:
>>
>>> On Fri, Jul 20, 2018 at 11:12 AM Jenkins <
>>> jenkins-dontre...@opendaylight.org> wrote:
>>>
>>>> Attention controller-devs,
>>>>
>>>> Autorelease oxygen failed to build sal-cluster-admin-impl from
>>>> controller in build 359. Attached is a snippet of the error message
>>>> related to the failure that we were able to automatically parse as
>>>> well as console logs.
>>>>
>>>> Console Logs:
>>>>
>>>> https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/autorelease-release-oxygen/359
>>>>
>>>> Jenkins Build:
>>>>
>>>> https://jenkins.opendaylight.org/releng/job/autorelease-release-oxygen/359/
>>>>
>>>> Please review and provide an ETA on when a fix will be available.
>>>>
>>>> Thanks,
>>>> ODL releng/autorelease team
>>>>
>>> Hello controller-dev:
>>>
>>> Please look into these failed tests.
>>>
>>> Failed tests:
>>>
>>> ClusterAdminRpcServiceTest.testFlipMemberVotingStates:976->lambda$testFlipMemberVotingStates$8:978
>>> Expected leader member-1. Actual:
>>> member-1-shard-cars-oper_testFlipMemberVotingStates
>>>
>>> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0
>>>
>>
>>
>> I ran it successfully 500 times locally. But looking at the code and the
>> test output from Jenkins, I can see why it failed: just the right timing
>> sequence, coupled with a random thread execution delay and a test deadline
>> timeout that was a tad too low for that delay. I'll push a patch. This is
>> another case where, occasionally, there seems to be just enough of a delay
>> or slowdown in the Jenkins environment to throw off timing and cause a
>> test failure.
>>
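The failure mode described above (a test deadline slightly too short for an occasional scheduling delay) is often handled by polling a condition until a generous deadline instead of asserting against a tight one. A minimal Python sketch of that idea follows. `wait_until` and its parameters are hypothetical names chosen for illustration, and this is not the patch pushed for ClusterAdminRpcServiceTest.

```python
import time


def wait_until(condition, timeout_s=5.0, interval_s=0.05):
    """Poll `condition` until it returns a truthy value or `timeout_s` elapses.

    Returns True if the condition held before the deadline, False otherwise.
    The condition is always checked at least once.
    """
    # A monotonic clock is not affected by wall-clock adjustments.
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Usage: a stand-in "leader elected" check that only becomes true after 0.2 s.
start = time.monotonic()
leader_elected = lambda: time.monotonic() - start > 0.2
assert wait_until(leader_elected, timeout_s=2.0)
```

A passing run returns as soon as the condition holds, so a generous timeout costs time only when the test is actually going to fail.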
>
> Hi Tom,
>
> I'm curious: when you said you ran it successfully 500 times locally, did
> you perform a full build during that time or test the single test case in
> isolation?
>
> While troubleshooting the bgpcep issue in the bgp-bmp-mock thread [0], I
> found that I had to run a full bgpcep build in order to reproduce the
> issue on my own laptop. I have a script that I'm testing now and making
> more generic, and I will share it with this list later. It will allow us
> to run builds continuously, whether autorelease or project-specific, over
> and over and capture the Maven output plus the surefire logs, which I hope
> will help folks reproduce intermittent issues locally.
>
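As a rough illustration of the kind of repeated-build loop described above, here is a minimal Python sketch. It is not the script mentioned in this thread. The function name `run_builds`, the log file naming, and the `mvn clean install` invocation in the comment are all assumptions made for the example.

```python
import subprocess
from pathlib import Path


def run_builds(command, log_dir, max_runs=None):
    """Run `command` repeatedly, saving each run's combined output to a log file.

    Stops at the first non-zero exit code and returns (run_number, log_path).
    Returns None if `max_runs` runs all succeed. With max_runs=None the loop
    continues until a run fails.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    run = 0
    while max_runs is None or run < max_runs:
        run += 1
        log_path = log_dir / f"build-{run:04d}.log"
        with open(log_path, "w") as log:
            # Send stderr to the same file so the log reads like the console.
            result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            return run, log_path
    return None


# Assumed usage, run from a project root until the first failing build:
#   failure = run_builds(["mvn", "clean", "install"], "build-logs")
```

To keep the surefire logs as well, the loop could copy each module's `target/surefire-reports` directory (the Surefire plugin's default report location) next to the failing run's log before the next build cleans it.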
> I feel like blaming infrastructure for being "slow" is too easy an excuse
> for issues. If the software were run in a customer production environment,
> I suspect that telling the customer their hardware is too slow and is not
> the same as the developer's laptop would not be a solution the customer
> would be happy with.
>

+1000

Our code and tests need to be robust enough to handle diverse
infrastructure. Bugs like this might be highlighted by infra variability,
but they are still bugs in code/tests.

Not picking on TomP or Controller here; this is a general ODL culture
problem of blaming the infra first, until Thanh/Jamo/et al. prove
otherwise.


> I'm not sure what we can do to give more confidence in the infrastructure
> so that it's not the first thing that gets blamed every time there's a
> build issue, but we do run on build flavors in vexxhost that provide
> dedicated CPUs and RAM to our builders. Once I have some more validation
> of the infinite build script, maybe I can run it for a while on every
> autorelease-managed project, on my 2 laptops plus a few vexxhost
> instances, and report the script output to the projects.
>

Thanks for this work, Thanh. It seems like it will be very helpful for
ironing out intermittent failures.

Daniel


>
> Regards,
> Thanh
>
> [0] https://lists.opendaylight.org/pipermail/release/2018-July/015594.html
>
> _______________________________________________
> release mailing list
> rele...@lists.opendaylight.org
> https://lists.opendaylight.org/mailman/listinfo/release
>
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev
