Re: Should we have a predictable test run order?

Ismaël Mejía Wed, 31 Jan 2018 12:32:32 -0800

Is the conclusion of this thread that we should make the test execution
order random? Remember that it currently uses the default order, which is
filesystem-based as Dan mentioned, and that produces some minor
inconsistencies between Mac and Linux.
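
To make the "random, but with a published seed" idea from the quoted
replies concrete, here is a minimal sketch in Python (it is not Beam or
surefire code). It shuffles a list of test names with a seed, reports the
seed, and accepts the seed back to replay the same order. The function name
shuffled_order and the test names are made up for illustration, and I am
not claiming that surefire exposes its seed this way.

```python
import random
import time


def shuffled_order(test_names, seed=None):
    """Return (seed, order): a shuffled copy of test_names and the seed used."""
    if seed is None:
        # No seed supplied: pick one and return it so the run can be replayed.
        seed = int(time.time() * 1000) % (2**32)
    rng = random.Random(seed)
    # Sort first so the result depends only on the seed, not on the
    # filesystem order in which the names were discovered.
    order = sorted(test_names)
    rng.shuffle(order)
    return seed, order


if __name__ == "__main__":
    # Hypothetical test class names, for illustration only.
    tests = ["BTest", "ATest", "DTest", "CTest"]
    seed, order = shuffled_order(tests)
    print(f"test order seed: {seed}")
    print(order)
    # Replaying with the printed seed reproduces the same order, even if
    # the names were discovered in a different order this time.
    _, replay = shuffled_order(reversed(tests), seed=seed)
    assert replay == order
```

Sorting before shuffling is a deliberate choice in this sketch: it keeps the
mac/linux discovery order from leaking into the "random" order.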


It is going to be interesting to see how much extra flakiness we find
just by defaulting to -Dsurefire.runOrder=random. Any volunteers to
open Pandora's box?
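
If random order does surface new failures, the bisection that Romain
describes in the quoted text below could be sketched like this in Python.
It is only an illustration under two assumptions: exactly one earlier test
causes the failure, and the failure is deterministic for a given order.
find_polluter and the victim_fails_after callback are hypothetical names; a
real callback would have to run the build with a fixed list of tests.

```python
def find_polluter(preceding, victim_fails_after):
    """Return the single test in `preceding` that makes the victim fail.

    preceding: tests that ran before the victim, in the captured order.
    victim_fails_after: callable taking a list of tests; it runs them, then
        the victim, and returns True if the victim fails.
    Assumes exactly one polluter and a deterministic failure.
    """
    candidates = list(preceding)
    if not candidates or not victim_fails_after(candidates):
        raise ValueError("victim does not fail after the captured tests")
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still triggers the failure.
        candidates = first if victim_fails_after(first) else second
    return candidates[0]


if __name__ == "__main__":
    # Hypothetical names; the lambda stands in for a real test runner.
    captured = ["ATest", "BTest", "LeakyTest", "CTest", "DTest"]
    print(find_polluter(captured, lambda tests: "LeakyTest" in tests))
```

Each round halves the candidate list, so the number of reruns grows with
the logarithm of the number of preceding tests.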



On Tue, Jan 30, 2018 at 7:30 PM, Reuven Lax <[email protected]> wrote:
> To expand on what Robert says, many other things in our test framework are
> randomized. e.g. PCollection elements are shuffled randomly, bundle sizes
> are determined randomly, etc. All of this should be repeatable if there's a
> failure. The test should print the seed used to generate the random numbers,
> and you should be able to pass that seed back into the run to recreate those
> exact conditions.
>
> On Tue, Jan 30, 2018 at 10:27 AM, Robert Bradshaw <[email protected]>
> wrote:
>>
>> Agreed, any leakage of state between tests is a bug, and giving things
>> a deterministic order just hides these bugs. I'd be in favor of
>> enforcing random ordering (with a published seed for reproducibility, of
>> course).
>>
>> On Tue, Jan 30, 2018 at 9:21 AM, Lukasz Cwik <[email protected]> wrote:
>> > The order should be random to ferret out issues, but the test order seed
>> > should be printed and configurable so that a test run can be replayed
>> > by specifying the order in which it should execute.
>> >
>> > I don't like having a strict order since it hides poorly written tests,
>> > and people have a tendency to just work around the poorly written test
>> > instead of fixing it.
>> >
>> > On Tue, Jan 30, 2018 at 9:13 AM, Kenneth Knowles <[email protected]> wrote:
>> >>
>> >> What was the problem in this case?
>> >>
>> >> On Tue, Jan 30, 2018 at 9:12 AM, Romain Manni-Bucau
>> >> <[email protected]> wrote:
>> >>>
>> >>> What I used to do is capture the output when I identified one of
>> >>> these cases. Once it is reproduced, I grep the "Running" lines from
>> >>> surefire. This gives me a reproducible order. Then, with a kind of
>> >>> bisection, you can find the "previous" test that makes your test fail,
>> >>> and you can configure this sequence in IDEA.
>> >>>
>> >>> Not perfect, but probably better than hiding the issue.
>> >>>
>> >>> Also, running "clean" forces inodes to change and increases the
>> >>> probability of reproducing it on Linux.
>> >>>
>> >>>
>> >>> Romain Manni-Bucau
>> >>>
>> >>> 2018-01-30 18:03 GMT+01:00 Daniel Kulp <[email protected]>:
>> >>>>
>> >>>> The biggest problem with random is that if a test fails due to an
>> >>>> interaction, you have no way to reproduce it. You could re-run with
>> >>>> random 10 times and it might not fail again. Thus, what good did it
>> >>>> do to even flag the failure? At least with alphabetical and reverse
>> >>>> alphabetical, if a test fails, you can rerun and actually have a
>> >>>> chance to diagnose the failure. A test that randomly fails once out
>> >>>> of every 20 times it runs tends to get @Ignored, not fixed. I’ve seen
>> >>>> that way too often.  :(
>> >>>>
>> >>>> Dan
>> >>>>
>> >>>>
>> >>>> > On Jan 30, 2018, at 11:38 AM, Romain Manni-Bucau
>> >>>> > <[email protected]> wrote:
>> >>>> >
>> >>>> > Hi Daniel,
>> >>>> >
>> >>>> > As a quick fix it sounds good, but doesn't it hide a leak or issue
>> >>>> > (in test setup or in main code)? Long story short: using a random
>> >>>> > order can help find bugs faster, instead of hiding them and then
>> >>>> > discovering them by chance when adding a new test.
>> >>>> >
>> >>>> > That said, good point about making it configurable with a -D or -P
>> >>>> > so that this flag can be tested quickly.
>> >>>> >
>> >>>> >
>> >>>> > On Jan 30, 2018 at 17:33, "Daniel Kulp" <[email protected]> wrote:
>> >>>> > I spent a couple of hours this morning trying to figure out why two
>> >>>> > of the SQL tests are failing on my machine, but not for Jenkins or
>> >>>> > for JB. Not knowing anything about the SQL stuff, it was very hard
>> >>>> > to debug, and it wouldn’t fail within Eclipse or even if I ran that
>> >>>> > individual test from the command line with -Dtest= . Thus, a real
>> >>>> > pain…
>> >>>> >
>> >>>> > It turns out there is an interaction problem between it and a test
>> >>>> > that runs before it on my machine, but on Jenkins and JB’s machine
>> >>>> > the tests are run in a different order, so the problem doesn’t
>> >>>> > surface. So here’s the question:
>> >>>> >
>> >>>> > Should the surefire configuration specify a “runOrder” so that the
>> >>>> > tests would run the same on all of our machines? By default, the
>> >>>> > runOrder is “filesystem”, so depending on the order in which the
>> >>>> > filesystem returns the test classes to surefire, the tests would
>> >>>> > run in a different order. It looks like my APFS Mac returns them in
>> >>>> > a different order than JB’s Linux. But that also means that if there
>> >>>> > is a Jenkins test failure or similar, I might not be able to
>> >>>> > reproduce it (nor might a Windows person, or even a Linux user using
>> >>>> > a different fs than Jenkins). For most of the projects I work on, we
>> >>>> > generally have “<runOrder>alphabetical</runOrder>” to make things
>> >>>> > completely predictable. That said, making things non-deterministic
>> >>>> > can find issues like this, where tests aren’t cleaning up after
>> >>>> > themselves correctly. We could do a runOrder=hourly to flip back and
>> >>>> > forth between alphabetical and reverse-alphabetical. Predictable,
>> >>>> > but it changes enough to detect issues.
>> >>>> >
>> >>>> > Thoughts?
>> >>>> >
>> >>>> >
>> >>>> > --
>> >>>> > Daniel Kulp
>> >>>> > [email protected] - http://dankulp.com/blog
>> >>>> > Talend Community Coder - http://coders.talend.com
>> >>>> >
>> >>>>
>> >>>> --
>> >>>> Daniel Kulp
>> >>>> [email protected] - http://dankulp.com/blog
>> >>>> Talend Community Coder - http://coders.talend.com
>> >>>>
>> >>>
>> >>
>> >
>
>
