Agree with Bhavin's arguments 100%. Please don't compromise the stability of CI with flaky tests. Address the root cause of why these tests are failing / non-deterministic, as per proper engineering standards.
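To illustrate one common root cause: many flaky tests draw random inputs without pinning a seed, so a tight tolerance fails on unlucky draws. A minimal sketch of making such a test deterministic (the test itself is hypothetical; mx.random.seed and numpy.random.seed are the standard seeding calls):

    import numpy as np
    import mxnet as mx

    def test_fully_connected_is_deterministic():
        # Pin every RNG the test touches so each CI run sees identical inputs.
        np.random.seed(42)
        mx.random.seed(42)

        data = mx.nd.array(np.random.uniform(-1, 1, (8, 16)))
        weight = mx.nd.array(np.random.uniform(-1, 1, (4, 16)))
        out = mx.nd.FullyConnected(data, weight, no_bias=True, num_hidden=4)

        # With fixed seeds the expected values are reproducible, so a tight
        # tolerance no longer fails at random.
        expected = np.dot(data.asnumpy(), weight.asnumpy().T)
        np.testing.assert_allclose(out.asnumpy(), expected, rtol=1e-5, atol=1e-6)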
Hence, my non-binding vote is:
-1 for proposal #1 for re-enabling flaky tests.
+1 for proposal #2 for setting the standard for adding reliable tests.

Pedro.

On Sun, Jan 14, 2018 at 8:00 PM, Bhavin Thaker <[email protected]> wrote:

> Hi Sheng,
>
> Thank you for your efforts and this proposal to improve the tests. Here
> are my thoughts.
>
> Shouldn't the focus be to _engineer_ each test to be reliable instead of
> compromising and discussing the relative tradeoffs of re-enabling flaky
> tests? Is the test failure probability really 10%?
>
> As you correctly mention, the experience of making these tests reliable
> will then serve as the standard for adding new tests, rather than
> continuing to chase the elusive goal of reliable tests.
>
> Hence, my non-binding vote is:
> -1 for proposal #1 for re-enabling flaky tests.
> +1 for proposal #2 for setting the standard for adding reliable tests.
>
> I suggest we NOT compromise on the quality and reliability of the tests,
> similar to the high bar maintained for the MXNet source code.
>
> If the final vote is to re-enable flaky tests, then I propose that we
> enable them immediately AFTER the next MXNet release instead of during
> the upcoming release.
>
> Bhavin Thaker.
>
> On Sat, Jan 13, 2018 at 2:20 PM, Marco de Abreu <
> [email protected]> wrote:
>
>> Hello Sheng,
>>
>> Thanks a lot for leading this task!
>>
>> +1 for both points. Additionally, I'd propose adding a requirement to
>> specify a reason whenever a new test takes more than X seconds (say 10)
>> or adds an external dependency.
>>
>> Looking forward to getting these tests fixed :)
>>
>> Best regards,
>> Marco
>>
>> On Sat, Jan 13, 2018 at 11:14 PM, Sheng Zha <[email protected]> wrote:
>>
>> > Hi MXNet community,
>> >
>> > Thanks to the efforts of several community members, we identified many
>> > flaky tests. These tests are currently disabled to ensure the smooth
>> > execution of continuous integration (CI). As a result, we lost coverage
>> > of those features. They need to be fixed and re-enabled to ensure the
>> > quality of our releases. I'd like to propose the following:
>> >
>> > 1. Re-enable flaky Python tests with retries if feasible.
>> > Although these tests are unstable, they can still catch breaking
>> > changes. For example, if a test fails randomly with 10% probability,
>> > the probability of three failed retries becomes 0.1%. A breaking
>> > change, on the other hand, would result in 100% failure. Although this
>> > could increase the testing time, it's a compromise that can help avoid
>> > bigger problems.
>> >
>> > 2. Set a standard for new tests.
>> > I think having criteria that new tests must follow can improve not only
>> > the quality of the tests but also the quality of the code. I propose
>> > the following standard for tests:
>> > - Pass reliably with good coverage
>> > - Avoid randomness unless necessary
>> > - Avoid external dependencies unless necessary (e.g., due to licensing)
>> > - Avoid being resource-intensive unless necessary (e.g., scaling tests)
>> >
>> > In addition, I'd like to call for volunteers to help fix the tests. New
>> > members are especially welcome, as this is a good opportunity to
>> > familiarize themselves with MXNet. I'd also like to ask the members who
>> > wrote each feature/test to help, either by fixing it or by helping
>> > others understand the issues.
>> >
>> > The effort to fix these tests is tracked at:
>> > https://github.com/apache/incubator-mxnet/issues/9412
>> >
>> > Best regards,
>> > Sheng
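For reference on proposal #1's retry arithmetic, a minimal sketch of what such a mechanism could look like (the retry decorator below is hypothetical, not an actual MXNet test utility):

    import functools

    def retry(times=3):
        # Hypothetical decorator: re-run a flaky test up to `times` times
        # and pass if any single attempt passes.
        def decorator(test_fn):
            @functools.wraps(test_fn)
            def wrapper(*args, **kwargs):
                for attempt in range(times):
                    try:
                        return test_fn(*args, **kwargs)
                    except AssertionError:
                        if attempt == times - 1:
                            raise  # every attempt failed: report the failure
            return wrapper
        return decorator

    # A test that flakes with 10% probability now breaks CI only when all
    # three attempts fail: 0.1 ** 3 = 0.001, i.e. the 0.1% quoted above.
    # A genuine breaking change still fails every attempt, hence 100%.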
