Re: Should we have a predictable test run order?

Romain Manni-Bucau Wed, 31 Jan 2018 12:35:33 -0800

2018-01-31 21:31 GMT+01:00 Ismaël Mejía <[email protected]>:

> Is the conclusion of this thread that we should make the test
> execution order random? Remember that it currently uses the default
> order, which is filesystem-based as Dan mentioned, and that produces
> some minor inconsistencies between Mac and Linux.
>
> It is going to be interesting to see how much extra flakiness we find
> just by defaulting to -Dsurefire.runOrder=random. Any volunteer to
> open Pandora's box?
>


Hehe, didn't you just get designated as the volunteer? ;)
Anyway, since it currently relies on inode order, it is already quite
random on Unix at least.
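
For reference, the "random order with a published seed" idea from the
quoted replies below can be sketched in a few lines of Java. This is only
an illustration under stated assumptions, not how surefire implements
runOrder=random: the class name, the test class names, and the
test.order.seed property are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of surefire or Beam.
final class SeededTestOrder {

    /** Returns the test names in an order that depends only on the seed. */
    static List<String> shuffled(List<String> testClasses, long seed) {
        List<String> order = new ArrayList<>(testClasses);
        // Sort first so the filesystem listing order cannot leak into the result.
        Collections.sort(order);
        Collections.shuffle(order, new Random(seed));
        return order;
    }

    public static void main(String[] args) {
        // "test.order.seed" is an assumed property name; pass -Dtest.order.seed=<n>
        // to replay an order, otherwise a fresh seed is picked.
        long seed = Long.getLong("test.order.seed", System.nanoTime());
        // Publish the seed so a failing run can be reproduced.
        System.out.println("Test order seed: " + seed);
        List<String> tests = List.of("ATest", "BTest", "CTest", "DTest");
        System.out.println("Run order: " + shuffled(tests, seed));
    }
}
```

Sorting before shuffling is the design choice that matters here: with the
same seed, a Mac and a Linux box get the same order even if their
filesystems list the classes differently.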


>
>
>
> On Tue, Jan 30, 2018 at 7:30 PM, Reuven Lax <[email protected]> wrote:
> > To expand on what Robert says, many other things in our test framework
> > are randomized, e.g. PCollection elements are shuffled randomly, bundle
> > sizes are determined randomly, etc. All of this should be repeatable if
> > there's a failure. The test should print the seed used to generate the
> > random numbers, and you should be able to pass that seed back into the
> > run to recreate those exact conditions.
> >
> > On Tue, Jan 30, 2018 at 10:27 AM, Robert Bradshaw <[email protected]>
> > wrote:
> >>
> >> Agreed, any leakage of state between tests is a bug, and giving things
> >> a deterministic order just hides these bugs. I'd be in favor of
> >> enforcing random ordering (with a published seed for reproducibility,
> >> of course).
> >>
> >> On Tue, Jan 30, 2018 at 9:21 AM, Lukasz Cwik <[email protected]> wrote:
> >> > The order should be random to ferret out issues, but the test order
> >> > seed should be printed and configurable so that a test run can be
> >> > replayed by specifying the order in which it should execute.
> >> >
> >> > I don't like having a strict order since it hides poorly written
> >> > tests, and people have a tendency to just work around the poorly
> >> > written test instead of fixing it.
> >> >
> >> > On Tue, Jan 30, 2018 at 9:13 AM, Kenneth Knowles <[email protected]> wrote:
> >> >>
> >> >> What was the problem in this case?
> >> >>
> >> >> On Tue, Jan 30, 2018 at 9:12 AM, Romain Manni-Bucau
> >> >> <[email protected]> wrote:
> >> >>>
> >> >>> What I used to do is capture the output when I identified some of
> >> >>> these cases. Once it is reproduced, I grep the "Running" lines from
> >> >>> surefire. This gives me a reproducible order. Then, with a kind of
> >> >>> dichotomy (bisection), you can find the "previous" test that makes
> >> >>> your test fail, and you can configure this sequence in IDEA.
> >> >>>
> >> >>> Not perfect but better than hiding the issue probably.
> >> >>>
> >> >>> Also, running "clean" forces the inodes to change and increases the
> >> >>> probability of reproducing it on Linux.
> >> >>>
> >> >>>
> >> >>> Romain Manni-Bucau
> >> >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn
> >> >>>
> >> >>> 2018-01-30 18:03 GMT+01:00 Daniel Kulp <[email protected]>:
> >> >>>>
> >> >>>> The biggest problem with random is that if a test fails due to an
> >> >>>> interaction, you have no way to reproduce it. You could re-run with
> >> >>>> random 10 times and it might not fail again. Thus, what good did it
> >> >>>> do to even flag the failure? At least with alphabetical and reverse
> >> >>>> alphabetical, if a test fails, you can rerun and actually have a
> >> >>>> chance to diagnose the failure. A test that randomly fails once out
> >> >>>> of every 20 times it runs tends to get @Ignored, not fixed. I’ve
> >> >>>> seen that way too often. :(
> >> >>>>
> >> >>>> Dan
> >> >>>>
> >> >>>>
> >> >>>> > On Jan 30, 2018, at 11:38 AM, Romain Manni-Bucau
> >> >>>> > <[email protected]> wrote:
> >> >>>> >
> >> >>>> > Hi Daniel,
> >> >>>> >
> >> >>>> > As a quick fix it sounds good, but doesn't it hide a leak or issue
> >> >>>> > (in test setup or in main code)? Long story short: using a random
> >> >>>> > order can help find bugs faster, instead of hiding them and then
> >> >>>> > discovering them by chance when adding a new test.
> >> >>>> >
> >> >>>> > That said, good point: it should be configurable with a -D or -P
> >> >>>> > so that this flag can be tested quickly.
> >> >>>> >
> >> >>>> >
> >> >>>> > On 30 Jan 2018 17:33, "Daniel Kulp" <[email protected]> wrote:
> >> >>>> > I spent a couple hours this morning trying to figure out why two of
> >> >>>> > the SQL tests are failing on my machine, but not for Jenkins or for
> >> >>>> > JB. Not knowing anything about the SQL stuff, it was very hard to
> >> >>>> > debug, and it wouldn’t fail within Eclipse or even if I ran that
> >> >>>> > individual test from the command line with -Dtest= . Thus, a real
> >> >>>> > pain…
> >> >>>> >
> >> >>>> > It turns out there is an interaction problem between it and a test
> >> >>>> > that runs before it on my machine, but on Jenkins and JB’s machine
> >> >>>> > the tests are run in a different order, so the problem doesn’t
> >> >>>> > surface. So here’s the question:
> >> >>>> >
> >> >>>> > Should the surefire configuration specify a “runOrder” so that the
> >> >>>> > tests would run the same on all of our machines? By default, the
> >> >>>> > runOrder is “filesystem”, so the tests run in whatever order the
> >> >>>> > filesystem returns the test classes to surefire. It looks like my
> >> >>>> > APFS Mac returns them in a different order than JB’s Linux. But
> >> >>>> > that also means that if there is a Jenkins test failure or similar,
> >> >>>> > I might not be able to reproduce it (nor might a Windows person, or
> >> >>>> > even a Linux user using a different fs than Jenkins). For most of
> >> >>>> > the projects I use, we generally have
> >> >>>> > “<runOrder>alphabetical</runOrder>” to make things completely
> >> >>>> > predictable. That said, making things non-deterministic can find
> >> >>>> > issues like this where tests aren’t cleaning themselves up
> >> >>>> > correctly. We could use runOrder=hourly to flip back and forth
> >> >>>> > between alphabetical and reverse-alphabetical: predictable, but it
> >> >>>> > changes often enough to detect issues.
> >> >>>> >
> >> >>>> > Thoughts?
> >> >>>> >
> >> >>>> >
> >> >>>> > --
> >> >>>> > Daniel Kulp
> >> >>>> > [email protected] - http://dankulp.com/blog
> >> >>>> > Talend Community Coder - http://coders.talend.com
> >> >>>> >
> >> >>>>
> >> >>>> --
> >> >>>> Daniel Kulp
> >> >>>> [email protected] - http://dankulp.com/blog
> >> >>>> Talend Community Coder - http://coders.talend.com
> >> >>>>
> >> >>>
> >> >>
> >> >
> >
> >
>
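
The dichotomy approach quoted above (grep the "Running" lines, then keep
halving the list of tests that ran before the failing one) could be
sketched roughly as follows. This assumes a single polluting test; the
class name and the failsAfter hook are hypothetical, and in practice the
hook would launch the given tests followed by the failing test and report
whether it still fails.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of surefire or Beam.
final class PolluterBisect {

    /**
     * @param candidates tests that ran before the failing test, in recorded order
     * @param failsAfter returns true if the failing test still fails when it is
     *                   run after exactly the given tests (assumed hook)
     * @return the one test whose presence makes the victim fail
     */
    static String findPolluter(List<String> candidates, Predicate<List<String>> failsAfter) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidate tests to bisect");
        }
        List<String> suspects = candidates;
        while (suspects.size() > 1) {
            int mid = suspects.size() / 2;
            List<String> firstHalf = suspects.subList(0, mid);
            // Keep whichever half still reproduces the failure.
            suspects = failsAfter.test(firstHalf)
                    ? firstHalf
                    : suspects.subList(mid, suspects.size());
        }
        return suspects.get(0);
    }
}
```

With N tests recorded before the failure, this needs about log2(N) reruns
instead of N, which is what makes the manual procedure bearable.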
