+1 for separating these out and running them with forkEvery 1.

I think they should probably still run as part of precheckin and the
nightly builds, though. We don't want this to effectively become a way of
disabling and ignoring these tests.

-Dan
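
For concreteness, a rough sketch of what the separate task plus the
precheckin hookup might look like in the Gradle build (the category's
package path and the precheckin wiring are illustrative guesses, not the
actual Geode build config):

```groovy
// Hypothetical Gradle task: run only FlakyTest-tagged tests, one fresh
// JVM per test class, so GC pauses and test pollution can't leak between tests.
task flakyTest(type: Test) {
    useJUnit {
        includeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
    }
    forkEvery 1  // new JVM for every test class
}

// Keep the primary tasks green by excluding the flaky set there.
test {
    useJUnit {
        excludeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
    }
}

// Per the point above: still run them in precheckin / nightly builds.
// precheckin.dependsOn flakyTest
```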

On Tue, Apr 26, 2016 at 10:28 AM, Kirk Lund <[email protected]> wrote:
> Also, I don't think there's much value in continuing to use the "CI" label.
> If a test fails in Jenkins, rerun the test to see if it fails consistently.
> If it doesn't, it's flaky. The developer looking at it should try to
> determine the cause of the failure (e.g., "it uses thread sleeps, or random
> ports that hit BindExceptions, or short timeouts vulnerable to GC pauses")
> and include that info when adding the FlakyTest annotation and filing a
> Jira bug with the Flaky label. If the test fails consistently, then file a
> Jira bug without the Flaky label.
>
> -Kirk
>
>
> On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:
>
>> There are quite a few test classes that have multiple test methods which
>> are annotated with the FlakyTest category.
>>
>> More thoughts:
>>
>> In general, I think that if any given test fails intermittently then it is
>> a FlakyTest. A good test should either pass or fail consistently. After
>> annotating a test method with FlakyTest, the developer should then add the
>> Flaky label to the corresponding Jira ticket. What we then do with the Jira
>> tickets (i.e., fix them) is probably more important than deciding whether a
>> test is flaky or not.
>>
>> Rather than try to come up with some flaky process for determining whether
>> a given test is flaky (e.g., "does it have thread sleeps?"), it would be
>> better to have a wiki page that has examples of flakiness and how to fix
>> them ("if the test has thread sleeps, then switch to using Awaitility and
>> do this...").
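
The "switch to Awaitility" advice boils down to polling a condition with a
deadline instead of sleeping a fixed time. A minimal stdlib-only sketch of
that pattern (the class and helper names are made up for illustration; real
Geode tests would use Awaitility itself, e.g.
`Awaitility.await().atMost(30, SECONDS).until(...)`):

```java
import java.util.function.BooleanSupplier;

// Sketch of the poll-until-true pattern that Awaitility packages up.
// A fixed Thread.sleep(5000) either wastes time or is too short on a
// busy machine; polling with a deadline does neither.
public class AwaitSketch {

    /** Polls the condition until it is true or the timeout elapses. */
    public static boolean awaitTrue(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;       // condition met; return immediately
            }
            try {
                Thread.sleep(50);  // short poll interval, not a blind long sleep
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }
}
```

A test would then assert on `awaitTrue(...)` and fail with a clear timeout
rather than flicker depending on machine load.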
>>
>> -Kirk
>>
>>
>> On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]> wrote:
>>
>>> Thanks Kirk!
>>>
>>> ~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . | grep -v Binary | wc -l | xargs echo "Flake factor:"
>>> Flake factor: 136
>>>
>>> Anthony
>>>
>>>
>>> > On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]> wrote:
>>> >
>>> > +1
>>> >
>>> > Are we also planning to automate the additional build task somehow?
>>> >
>>> > I'd also suggest creating a wiki page with some stats (like how many
>>> > FlakyTests we currently have) and the idea behind this effort so we can
>>> > keep track and see how it's evolving over time.
>>> >
>>> > On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
>>> >
>>> >> After completing GEODE-1233, all currently known flickering tests are
>>> >> now annotated with our FlakyTest JUnit Category.
>>> >>
>>> >> In an effort to divide our build up into multiple build pipelines that
>>> >> are sequential and dependable, we could consider excluding FlakyTests
>>> >> from the primary integrationTest and distributedTest tasks. An
>>> >> additional build task would then execute all of the FlakyTests
>>> >> separately. This would hopefully help us get to a point where we can
>>> >> depend on our primary testing tasks staying green 100% of the time. We
>>> >> would then prioritize fixing the FlakyTests and, one by one, removing
>>> >> the FlakyTest category from them.
>>> >>
>>> >> I would also suggest that we execute the FlakyTests with "forkEvery 1"
>>> >> to give each test a clean JVM or set of DistributedTest JVMs. That
>>> >> would hopefully decrease the chance of a GC pause or test pollution
>>> >> causing flickering failures.
>>> >>
>>> >> Having reviewed lots of test code and failure stacks, I believe that
>>> >> the primary causes of FlakyTests are timing sensitivity (thread sleeps
>>> >> or nothing that waits for async activity, and timeouts or sleeps that
>>> >> are insufficient on a busy CPU, under heavy I/O, or during a GC pause)
>>> >> and random ports via AvailablePort (instead of using zero for an
>>> >> ephemeral port).
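
To illustrate the ephemeral-port point: binding to port 0 lets the OS hand
out a free port atomically, avoiding the check-then-bind race that an
AvailablePort-style helper has, where another process can grab the port
between the availability check and the actual bind. The class name below is
made up for the example:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Sketch: ask the kernel for a free ephemeral port by binding to port 0.
public class EphemeralPortSketch {

    /** Binds to port 0 and returns the OS-assigned ephemeral port. */
    public static int bindEphemeral() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort(); // port chosen by the OS
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

In a real test you would keep the bound socket (or hand it to the server
under test) rather than close it and re-bind by number, since releasing the
port reopens a small race window.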
>>> >>
>>> >> Opinions or ideas? Hate it? Love it?
>>> >>
>>> >> -Kirk
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > ~/William
>>>
>>>
>>
