+1 for separating these out and running them with forkEvery 1. I think they should probably still run as part of precheckin and the nightly builds though. We don't want this to turn into essentially disabling and ignoring these tests.

-Dan

On Tue, Apr 26, 2016 at 10:28 AM, Kirk Lund <[email protected]> wrote:

> Also, I don't think there's much value in continuing to use the "CI" label. If
> a test fails in Jenkins, then run the test to see if it fails consistently.
> If it doesn't, it's flaky. The developer looking at it should try to
> determine the cause of it failing (ie, "it uses thread sleeps or random
> ports with BindExceptions or has short timeouts with probable GC pause")
> and include that info when adding the FlakyTest annotation and filing a
> Jira bug with the Flaky label. If the test fails consistently, then file a
> Jira bug without the Flaky label.
>
> -Kirk
>
>
> On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:
>
>> There are quite a few test classes that have multiple test methods which
>> are annotated with the FlakyTest category.
>>
>> More thoughts:
>>
>> In general, I think that if any given test fails intermittently then it is
>> a FlakyTest. A good test should either pass or fail consistently. After
>> annotating a test method with FlakyTest, the developer should then add the
>> Flaky label to the corresponding Jira ticket. What we then do with the Jira
>> tickets (ie, fix them) is probably more important than deciding if a test
>> is flaky or not.
>>
>> Rather than try to come up with some flaky process for determining if a
>> given test is flaky (ie, "does it have thread sleeps?"), it would be better
>> to have a wiki page that has examples of flakiness and how to fix them ("if
>> the test has thread sleeps, then switch to using Awaitility and do
>> this...").
>>
>> -Kirk
>>
>>
>> On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]> wrote:
>>
>>> Thanks Kirk!
>>>
>>> ~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . | grep -v Binary | wc -l | xargs echo "Flake factor:"
>>> Flake factor: 136
>>>
>>> Anthony
>>>
>>>
>>> > On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]> wrote:
>>> >
>>> > +1
>>> >
>>> > Are we also planning to automate the additional build task somehow?
>>> >
>>> > I'd also suggest creating a wiki page with some stats (like how many
>>> > FlakyTests we currently have) and the idea behind this effort so we can
>>> > keep track and see how it's evolving over time.
>>> >
>>> > On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
>>> >
>>> >> After completing GEODE-1233, all currently known flickering tests are now
>>> >> annotated with our FlakyTest JUnit Category.
>>> >>
>>> >> In an effort to divide our build up into multiple build pipelines that are
>>> >> sequential and dependable, we could consider excluding FlakyTests from the
>>> >> primary integrationTest and distributedTest tasks. An additional build task
>>> >> would then execute all of the FlakyTests separately. This would hopefully
>>> >> help us get to a point where we can depend on our primary testing tasks
>>> >> staying green 100% of the time. We would then prioritize fixing the
>>> >> FlakyTests and one by one removing the FlakyTest category from them.
>>> >>
>>> >> I would also suggest that we execute the FlakyTests with "forkEvery 1" to
>>> >> give each test a clean JVM or set of DistributedTest JVMs. That would
>>> >> hopefully decrease the chance of a GC pause or test pollution causing
>>> >> flickering failures.
>>> >>
>>> >> Having reviewed lots of test code and failure stacks, I believe that the
>>> >> primary causes of FlakyTests are timing sensitivity (thread sleeps with
>>> >> nothing that waits for async activity, timeouts or sleeps that are
>>> >> insufficient on busy CPU or I/O or during a GC pause) and random ports
>>> >> via AvailablePort (instead of using zero for an ephemeral port).
>>> >>
>>> >> Opinions or ideas? Hate it? Love it?
>>> >>
>>> >> -Kirk
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > ~/William
>>>
>>>
>>
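
For reference, the two flakiness causes named at the bottom of the thread (fixed thread sleeps and randomly chosen ports) could be addressed roughly as in the sketch below. This is only an illustration using the plain JDK: the class name FlakyTestSketch and the helpers bindEphemeral and waitUntil are hypothetical, not Geode code, and the hand-rolled polling loop is a stand-in for what Awaitility would provide, not Awaitility's API.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not Geode code: names are invented for illustration.
public class FlakyTestSketch {

    /**
     * Binds to port 0 so the OS assigns a free ephemeral port at bind time.
     * This avoids the window between picking a "free" random port and binding
     * it, which is where BindExceptions tend to come from.
     */
    public static ServerSocket bindEphemeral() throws IOException {
        return new ServerSocket(0);
    }

    /**
     * Polls the condition until it holds or the timeout elapses, instead of
     * sleeping for a fixed interval and hoping the async work has finished.
     * Returns true if the condition held before the deadline.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (deadline - System.nanoTime() > 0) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(10); // short poll interval; total wait is bounded by the timeout
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = bindEphemeral()) {
            // Ask the socket which port it actually got and hand that to clients.
            System.out.println("bound to port " + server.getLocalPort());
        }
        long start = System.currentTimeMillis();
        boolean done = waitUntil(() -> System.currentTimeMillis() - start >= 50, 5_000);
        System.out.println("condition met: " + done);
    }
}
```

With this shape a test can use a generous timeout (so a busy CPU or a GC pause does not fail it) without paying that cost on every run, since the wait returns as soon as the condition holds.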
