On Fri, May 4, 2018 at 10:20 AM, Ben Coman <[email protected]> wrote:

> On 3 May 2018 at 23:35, Guillermo Polito <[email protected]> wrote:
>
>> Hi all,
>>
>> Initially the Pharo build had tests that were failing randomly.
>> To cope with that, we introduced a retry of the tests.
>>
>> Nowadays, this situation is actually very rare. Tests that fail, fail
>> always, and randomly failing tests are not seen so often... This means
>> however, that in the case that a test is persistently failing, we are
>> (uselessly) retrying it, and making jobs take 10-15 minutes longer for
>> nothing.
>
> I'm curious...
> are all tests retried, or only the failing one?

All of them.

>> I propose that we remove the retries.
>>
>> - This will speed up the builds that are green and only penalize those
>> that are not green.
>
> If all tests pass first time there should be no retries and such a green
> build should take the same time regardless of whether retries are enabled
> or not.... ?

True, I don't know what I wrote there ^^. What I meant is that builds that are not green will fail sooner.

>> - Remove stress from our servers (that we use to have a higher ratio
>> builds/hour :))
>> - Randomly failing tests will just need to manually retry the build. But
>> since green builds take now ~15-20 minutes, which is in the same order of
>> magnitude of the retries, we only penalize the one that found the hiccup.
>
> The "really annoying" random failures are just a single failed test.
> Perhaps assume if the count of failed tests is more than ten, then it's not
> a "random" failure and immediately fail that job.

But how can we distinguish between a real failing test and one that is random?

> Rerunning max ten tests shouldn't add much to job time.
> But maybe the benefit isn't worth the added complexity to do it like that.

The thing is also that we have to implement something custom for that. And I'd like to put my effort into other things that add more value in the short term...
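
For reference, the heuristic Ben suggests could look roughly like the sketch below. It is written in Python only for illustration and is not how the Pharo CI is implemented. The function name, the `run_test` callback and the threshold constant are all hypothetical. Note that it does not tell a real failure from a random one: a persistently failing test is still rerun once, and the threshold only bounds how much time that rerun can cost.

```python
from typing import Callable, Iterable, List

# Threshold taken from the suggestion in this thread ("more than ten");
# an assumption for illustration, not a tuned value.
MAX_RANDOM_FAILURES = 10


def run_with_selective_retry(
    tests: Iterable[str],
    run_test: Callable[[str], bool],
    max_random_failures: int = MAX_RANDOM_FAILURES,
) -> List[str]:
    """Return the names of tests that still fail after one selective retry.

    `run_test` is a hypothetical callback that runs a single test by name
    and returns True when it passes.
    """
    # First pass: run everything once and collect the failures.
    failed = [name for name in tests if not run_test(name)]

    # Too many failures: treat the build as genuinely broken, do not retry.
    if len(failed) > max_random_failures:
        return failed

    # Few failures: rerun only those tests, once each.
    return [name for name in failed if not run_test(name)]
```

With this shape a green build pays nothing extra, a badly broken build fails right after the first pass, and only builds with a handful of failures pay for a short rerun.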

> cheers -ben

-- 
Guille Polito
Research Engineer

Centre de Recherche en Informatique, Signal et Automatique de Lille
CRIStAL - UMR 9189
French National Center for Scientific Research - http://www.cnrs.fr

Web: http://guillep.github.io
Phone: +33 06 52 70 66 13
