Adding the T&T group to this thread as they will be making progress in step with CI towards conquering flaky tests :)
~J

On Mon, Nov 4, 2013 at 1:36 PM, Paul Larson <[email protected]> wrote:

> Here's a list I started, with a brief rationale for each:
>
> https://docs.google.com/a/canonical.com/spreadsheet/ccc?key=0AjwxZmhDIclsdFdiV2c3dFhGYWpfckhqT1N1ZWpUY1E&usp=sharing
>
> Certainly things could change per image here; this is just to give us an
> easy way to make a list.
>
> On Mon, Nov 4, 2013 at 10:53 AM, Francis Ginther <[email protected]> wrote:
>
>> A possible first-step solution would be to add a very simple retry to
>> our CI test runner scripts: if the retry passes, the test passes. The
>> harder part would be identifying the test cases that failed on the
>> original attempt and making those visible. This could be done by
>> mining the jenkins data itself, and could probably be made smart
>> enough to file bugs when it finds tests that passed on the retry,
>> although it's hard to determine why the tests failed (i.e. maybe it's
>> a flaky test, or maybe unity8 crashed, etc.).
>>
>> Francis
>>
>> On Mon, Nov 4, 2013 at 3:43 AM, Vincent Ladeuil <[email protected]> wrote:
>>
>> >>>>>> Evan Dandrea <[email protected]> writes:
>> >
>> > > Can someone provide the set of tests we're running that are flaky?
>> >
>> > There is a whole spectrum of flaky tests, and identifying them is an
>> > art. That's the theory ;)
>> >
>> > In practice, there are several ways to automatically identify flaky
>> > tests.
>> >
>> > Each failing test could be retried and considered flaky only if it
>> > fails again in the same context.
>> >
>> > Ha, ouch, "same context" is darn hard to guarantee; what if the test
>> > is flaky only when some other tests run before it?
>> >
>> > What if a flaky test always succeeds when run alone?
>> >
>> > What if the flakiness is caused by a specific piece of hardware (and
>> > by that I don't mean a brand or a product, but a unique piece of
>> > hardware)?
>> >
>> > > If we don't have a good way of getting this list, could we wire up
>> > > job retries and log when that occurs?
>> >
>> > While I've encountered the above ones in real life, they are not the
>> > majority, so re-trying the test itself only requires a specific test
>> > runner (having all projects use such a test runner is achievable,
>> > slowly, but who will refuse to be protected against flaky tests?).
>> > We can log such occurrences, but the log will be disconnected from
>> > the test results.
>> >
>> > Alternatively, a job can be re-run with only the failing tests,
>> > which makes it slightly easier to both report and process the flaky
>> > tests. This doesn't require a specific test runner, but it does
>> > require a subunit test result ;) And a specific test loader to
>> > select only the failing tests from a previous run [1].
>> >
>> > Even jenkins should be able to display such results, as long as we
>> > tweak the flaky test names (don't ever try to display the details of
>> > a test failure in jenkins if the test you're interested in is named
>> > like a previous failure; you'll always get the latter ;). And even
>> > the dashboard could be taught to display green (pass), yellow
>> > (flaky), and red (errors) instead of just a percentage.
>> >
>> > Note that unique test names, while not enforced by python's
>> > unittest, are a simple constraint we want for multiple reasons, the
>> > jenkins one mentioned above among them. This is not a bug as far as
>> > I'm concerned; there is just no solution to ambiguous names: what if
>> > I tell you 'test X' is failing and you're looking at the code of
>> > test X (the other one)? A unique name is also needed when you want
>> > to select a single test to run.
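For the "re-run only the failing tests" approach, here's a rough sketch with plain unittest standing in for subunit. As Vincent notes, the selection step only works because each test has a unique id. The `Sample` tests are invented purely to demonstrate a flaky failure:

```python
# Sketch: run everything, then re-run just the failures; tests that
# pass on the retry are flagged flaky. Plain unittest stands in for
# the subunit result/loader machinery mentioned in the thread.
import unittest

def run_tests(tests):
    result = unittest.TestResult()
    for test in tests:
        test(result)  # runs the test, recording failures and errors
    return result

def failing_ids(result):
    return {case.id() for case, _ in result.failures + result.errors}

def run_and_retry_failures(tests):
    """Return (flaky, still_failing) sets of test ids."""
    failed = failing_ids(run_tests(tests))
    # Unique test ids are what make this selection possible.
    retry = [t for t in tests if t.id() in failed]
    still_failing = failing_ids(run_tests(retry))
    return failed - still_failing, still_failing

class Sample(unittest.TestCase):
    runs = 0
    def test_flaky(self):
        Sample.runs += 1
        self.assertGreater(Sample.runs, 1)  # fails first, passes on retry
    def test_solid(self):
        self.assertTrue(True)

tests = list(unittest.TestLoader().loadTestsFromTestCase(Sample))
flaky, still_failing = run_and_retry_failures(tests)
# flaky holds the id of test_flaky; still_failing is empty
```

The `flaky` set is what a tweaked-name report or a green/yellow/red dashboard column could be built from.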
>> >
>> > Now, there are a few risks associated with automatically re-trying
>> > flaky tests:
>> >
>> > - people will care less about them (in the same way any dev starts
>> >   ignoring warnings when too many pile up, making it impossible to
>> >   notice the new ones),
>> >
>> > - the root cause of the flakiness can be in the ci engine, the test
>> >   infrastructure, or the code itself. The right people should be
>> >   involved to fix them, and as of today all causes are inter-mixed,
>> >   so nobody has a clear view of what *he* should fix.
>> >
>> > So there is more to getting rid of flaky tests than automatic
>> > retries: people need to be involved to track and fix them.
>> >
>> > Vincent
>> >
>> > > Thanks!
>> >
>> > > On 31 October 2013 11:23, Julien Funk <[email protected]> wrote:
>> > >> So, it would be great if we could get a list of flaky tests re:
>> > >> the discussion today. I don't mind rerunning the tests
>> > >> automatically when they fail, but I think action should be
>> > >> mandatory on any flaky tests we discover, and we should maintain
>> > >> a list of them somewhere with a process to deal with them.
>> >
>> > Yeah, that's the social part ;) At least as important as the
>> > technical part, and probably far more so ;)
>> >
>> > Vincent
>> >
>> > [1]: Test loaders, filters, results and runners already exist in
>> > lp:selenium-simple-test (and lp:subunit of course). Teaching the
>> > test runner to re-run failing tests could be added easily.
>> >
>> > --
>> > Mailing list: https://launchpad.net/~canonical-ci-engineering
>> > Post to     : [email protected]
>> > Unsubscribe : https://launchpad.net/~canonical-ci-engineering
>> > More help   : https://help.launchpad.net/ListHelp
>>
>> --
>> Francis Ginther
>> Canonical - Ubuntu Engineering - Continuous Integration Team

