2018-01-31 21:31 GMT+01:00 Ismaël Mejía <[email protected]>:

> Is the conclusion of this thread that we should then make the test
> execution random? Remember that it currently uses the default order,
> which is filesystem-based as Dan mentioned, and that produces some minor
> inconsistencies between mac/linux.
>
> It is going to be interesting to see how much extra flakiness we find
> just by defaulting to -Dsurefire.runOrder=random, any volunteer to
> open pandora's box?
>
Hehe, didn't you just get designated volunteer? ;) Anyway, since it uses
inodes ATM it is already quite random on UNIX at least.

>
> On Tue, Jan 30, 2018 at 7:30 PM, Reuven Lax <[email protected]> wrote:
> > To expand on what Robert says, many other things in our test framework
> > are randomized, e.g. PCollection elements are shuffled randomly, bundle
> > sizes are determined randomly, etc. All of this should be repeatable if
> > there's a failure. The test should print the seed used to generate the
> > random numbers, and you should be able to pass that seed back into the
> > run to recreate those exact conditions.
> >
> > On Tue, Jan 30, 2018 at 10:27 AM, Robert Bradshaw <[email protected]>
> > wrote:
> >>
> >> Agreed, any leakage of state between tests is a bug, and giving things
> >> a deterministic order just hides these bugs. I'd be in favor of
> >> enforcing random ordering (with a published seed for reproducibility,
> >> of course).
> >>
> >> On Tue, Jan 30, 2018 at 9:21 AM, Lukasz Cwik <[email protected]> wrote:
> >> > The order should be random to ferret out issues, but the test order
> >> > seed should be printed and configurable, so that a test run can be
> >> > replayed by specifying the order in which it should execute.
> >> >
> >> > I don't like having a strict order since it hides poorly written
> >> > tests, and people have a tendency to just work around the poorly
> >> > written test instead of fixing it.
> >> >
> >> > On Tue, Jan 30, 2018 at 9:13 AM, Kenneth Knowles <[email protected]> wrote:
> >> >>
> >> >> What was the problem in this case?
> >> >>
> >> >> On Tue, Jan 30, 2018 at 9:12 AM, Romain Manni-Bucau
> >> >> <[email protected]> wrote:
> >> >>>
> >> >>> What I used to do is capture the output when I identified some of
> >> >>> these cases. Once the failure is reproduced, I grep the "Running"
> >> >>> lines from surefire. This gives me a reproducible order.
> >> >>> Then, with a kind of bisection (dichotomy), you can find the
> >> >>> "previous" test that makes your test fail, and you can configure
> >> >>> this sequence in IDEA.
> >> >>>
> >> >>> Not perfect, but probably better than hiding the issue.
> >> >>>
> >> >>> Also, running "clean" forces inodes to change and increases the
> >> >>> probability of reproducing it on Linux.
> >> >>>
> >> >>>
> >> >>> Romain Manni-Bucau
> >> >>> @rmannibucau | Blog | Old Blog | Github | LinkedIn
> >> >>>
> >> >>> 2018-01-30 18:03 GMT+01:00 Daniel Kulp <[email protected]>:
> >> >>>>
> >> >>>> The biggest problem with random is that if a test fails due to an
> >> >>>> interaction, you have no way to reproduce it. You could re-run with
> >> >>>> random 10 times and it might not fail again. Thus, what good did it
> >> >>>> do to even flag the failure? At least with alphabetical and reverse
> >> >>>> alphabetical, if a test fails, you can rerun and actually have a
> >> >>>> chance to diagnose the failure. A test that randomly fails once out
> >> >>>> of every 20 times it runs tends to get @Ignored, not fixed. I’ve
> >> >>>> seen that way too often. :(
> >> >>>>
> >> >>>> Dan
> >> >>>>
> >> >>>>
> >> >>>> > On Jan 30, 2018, at 11:38 AM, Romain Manni-Bucau
> >> >>>> > <[email protected]> wrote:
> >> >>>> >
> >> >>>> > Hi Daniel,
> >> >>>> >
> >> >>>> > As a quick fix it sounds good, but doesn't it hide a leak or issue
> >> >>>> > (in test setup or in main code)? Long story short: using a random
> >> >>>> > order lets you find bugs faster, instead of hiding them and
> >> >>>> > discovering them by chance when adding a new test.
> >> >>>> >
> >> >>>> > That said, good point to have it configurable with a -D or -P and
> >> >>>> > be able to test this flag quickly.
> >> >>>> >
> >> >>>> >
> >> >>>> > On 30 Jan 2018 17:33, "Daniel Kulp" <[email protected]> wrote:
> >> >>>> > I spent a couple hours this morning trying to figure out why two
> >> >>>> > of the SQL tests are failing on my machine, but not for Jenkins or
> >> >>>> > for JB. Not knowing anything about the SQL stuff, it was very hard
> >> >>>> > to debug, and it wouldn’t fail within Eclipse or even if I ran
> >> >>>> > that individual test from the command line with -Dtest= . Thus, a
> >> >>>> > real pain…
> >> >>>> >
> >> >>>> > It turns out there is an interaction problem between it and a test
> >> >>>> > that is running before it on my machine, but on Jenkins and JB’s
> >> >>>> > machine the tests are run in a different order, so the problem
> >> >>>> > doesn’t surface. So here’s the question:
> >> >>>> >
> >> >>>> > Should the surefire configuration specify a “runOrder” so that the
> >> >>>> > tests would run the same on all of our machines? By default, the
> >> >>>> > runOrder is “filesystem”, so depending on the order in which the
> >> >>>> > filesystem returns the test classes to surefire, the tests would
> >> >>>> > run in a different order. It looks like my APFS Mac returns them
> >> >>>> > in a different order than JB’s Linux. But that also means that if
> >> >>>> > there is a Jenkins test failure or similar, I might not be able to
> >> >>>> > reproduce it. (Or a Windows person, or even a Linux user using a
> >> >>>> > different fs than Jenkins.) For most of the projects I use, we
> >> >>>> > generally have “<runOrder>alphabetical</runOrder>” to make things
> >> >>>> > completely predictable. That said, by making things
> >> >>>> > non-deterministic, it can find issues like this where tests aren’t
> >> >>>> > cleaning themselves up correctly.
> >> >>>> > Could do a runOrder=hourly to flip back and forth between
> >> >>>> > alphabetical and reverse-alphabetical. Predictable, but it still
> >> >>>> > changes, so it can detect issues.
> >> >>>> >
> >> >>>> > Thoughts?
> >> >>>> >
> >> >>>> >
> >> >>>> > --
> >> >>>> > Daniel Kulp
> >> >>>> > [email protected] - http://dankulp.com/blog
> >> >>>> > Talend Community Coder - http://coders.talend.com
> >> >>>> >
> >> >>>>
> >> >>>> --
> >> >>>> Daniel Kulp
> >> >>>> [email protected] - http://dankulp.com/blog
> >> >>>> Talend Community Coder - http://coders.talend.com
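
For illustration, the two ideas discussed in this thread (a random test
order that can be replayed from a printed seed, and bisecting the tests
that ran earlier to find the one that breaks a later test) could be
sketched roughly as below. This is only a minimal sketch in Python, not
how surefire implements anything: shuffled_order, find_polluter and the
fails callback are hypothetical names, and the bisection assumes that a
single earlier test is responsible for the failure.

```python
import random


def shuffled_order(test_names, seed):
    """Return a shuffled copy of test_names; the same seed gives the same order."""
    order = sorted(test_names)          # normalise away filesystem order first
    random.Random(seed).shuffle(order)  # private RNG, global random state untouched
    return order


def find_polluter(order, victim, fails):
    """Bisect the tests that ran before `victim` to find one that makes it fail.

    `fails(sequence)` is a hypothetical callback that runs the given tests in
    order and returns True if the last one (the victim) fails.
    Assumes exactly one earlier test causes the failure.
    """
    candidates = order[: order.index(victim)]
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # Keep whichever half still reproduces the failure.
        candidates = first if fails(first + [victim]) else candidates[half:]
    # Confirm the last remaining candidate really triggers the failure.
    if candidates and fails(candidates + [victim]):
        return candidates[0]
    return None


if __name__ == "__main__":
    tests = ["TestA", "TestB", "TestC", "TestD", "TestE"]
    seed = 12345  # print the seed in a real run so the order can be replayed
    print("seed:", seed, "order:", shuffled_order(tests, seed))
```

Sorting before shuffling means the replayed order depends only on the
seed and the set of test names, not on the order in which the filesystem
happened to return them on a given machine.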
