Also, I don't think there's much value in continuing to use the "CI" label.
If a test fails in Jenkins, then rerun the test to see if it fails
consistently. If it doesn't, it's flaky. The developer looking at it should
try to determine the cause of the failure (ie, "it uses thread sleeps, or
random ports with BindExceptions, or has short timeouts with probable GC
pauses") and include that info when adding the FlakyTest annotation and
filing a Jira bug with the Flaky label. If the test fails consistently, then
file a Jira bug without the Flaky label.
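For the random-port case, the fix is usually to bind to port 0 and let the OS hand out a free ephemeral port, instead of scanning for one with AvailablePort and racing other JVMs to it. A dependency-free sketch (plain JDK, no Geode APIs; the class name is just for illustration):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortExample {
    public static void main(String[] args) throws IOException {
        // Binding to port 0 asks the OS for any free ephemeral port, so two
        // concurrently running tests can never collide with a BindException.
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort(); // the actual port assigned
            System.out.println("listening on " + port);
            // ...start the component under test against 'port'...
        }
    }
}
```

The test then reads the real port back via getLocalPort() rather than hard-coding or pre-selecting one.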
-Kirk

On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:

> There are quite a few test classes that have multiple test methods which
> are annotated with the FlakyTest category.
>
> More thoughts:
>
> In general, I think that if any given test fails intermittently then it is
> a FlakyTest. A good test should either pass or fail consistently. After
> annotating a test method with FlakyTest, the developer should then add the
> Flaky label to the corresponding Jira ticket. What we then do with the Jira
> tickets (ie, fix them) is probably more important than deciding if a test
> is flaky or not.
>
> Rather than try to come up with some flaky process for determining if a
> given test is flaky (ie, "does it have thread sleeps?"), it would be better
> to have a wiki page that has examples of flakiness and how to fix them ("if
> the test has thread sleeps, then switch to using Awaitility and do
> this...").
>
> -Kirk
>
>
> On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]> wrote:
>
>> Thanks Kirk!
>>
>> ~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . | grep -v
>> Binary | wc -l | xargs echo "Flake factor:"
>> Flake factor: 136
>>
>> Anthony
>>
>>
>> > On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]> wrote:
>> >
>> > +1
>> >
>> > Are we also planning to automate the additional build task somehow?
>> >
>> > I'd also suggest creating a wiki page with some stats (like how many
>> > FlakyTests we currently have) and the idea behind this effort so we can
>> > keep track and see how it's evolving over time.
>> >
>> > On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
>> >
>> >> After completing GEODE-1233, all currently known flickering tests are
>> >> now annotated with our FlakyTest JUnit Category.
>> >>
>> >> In an effort to divide our build up into multiple build pipelines that
>> >> are sequential and dependable, we could consider excluding FlakyTests
>> >> from the primary integrationTest and distributedTest tasks. An
>> >> additional build task would then execute all of the FlakyTests
>> >> separately. This would hopefully help us get to a point where we can
>> >> depend on our primary testing tasks staying green 100% of the time. We
>> >> would then prioritize fixing the FlakyTests and, one by one, remove the
>> >> FlakyTest category from them.
>> >>
>> >> I would also suggest that we execute the FlakyTests with "forkEvery 1"
>> >> to give each test a clean JVM or set of DistributedTest JVMs. That
>> >> would hopefully decrease the chance of a GC pause or test pollution
>> >> causing flickering failures.
>> >>
>> >> Having reviewed lots of test code and failure stacks, I believe that
>> >> the primary causes of FlakyTests are timing sensitivity (thread sleeps,
>> >> nothing that waits for async activity, or timeouts and sleeps that are
>> >> insufficient on a busy CPU, under I/O load, or during a GC pause) and
>> >> random ports via AvailablePort (instead of using zero for an ephemeral
>> >> port).
>> >>
>> >> Opinions or ideas? Hate it? Love it?
>> >>
>> >> -Kirk
>> >>
>> >
>> >
>> >
>> > --
>> >
>> > ~/William
>>
>>
>
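The "switch to Awaitility" fix mentioned above boils down to polling a condition with a deadline instead of a fixed Thread.sleep: the test passes as soon as the async work completes and only fails after a generous timeout. A dependency-free sketch of that pattern (Awaitility itself provides the same idea behind a fluent await().atMost().until() API; the `Await` helper name here is made up):

```java
import java.util.function.BooleanSupplier;

public final class Await {
    private Await() {}

    // Poll 'condition' every 'pollMillis' until it is true, or fail with an
    // AssertionError once 'timeoutMillis' elapses. Unlike a bare sleep, a
    // slow machine or GC pause just makes the wait longer, not the test red.
    public static void awaitTrue(BooleanSupplier condition,
                                 long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() > deadline) {
                throw new AssertionError(
                    "condition not met within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while awaiting", e);
            }
        }
    }
}
```

So instead of `Thread.sleep(5000); assertTrue(regionIsReady());`, a test would call something like `Await.awaitTrue(() -> regionIsReady(), 30_000, 100);` (regionIsReady is a hypothetical condition, not a Geode API).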
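On the build side, the exclude/include split plus "forkEvery 1" might look roughly like this in Gradle, using the Test task's JUnit 4 category support. This is a sketch, not a tested build file: the task names follow the proposal, and the category's fully qualified name is an assumption about where FlakyTest lives in the source tree.

```groovy
// Keep flaky tests out of the primary tasks...
integrationTest {
  useJUnit {
    // assumed package for the FlakyTest category class
    excludeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
  }
}

// ...and run them in their own task, one fresh JVM per test class.
task flakyTest(type: Test) {
  useJUnit {
    includeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
  }
  forkEvery 1
}
```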
