It would be great if there is a chance to add a few test cases that reflect these principles, so we have a concrete basis for discussion.
Having a seeded random number is good, but usually it is not the cause of non-deterministic errors (most of which are already resolved by having a relaxed tolerance level).

Tianqi

On Mon, Oct 16, 2017 at 9:01 AM Bhavin Thaker <[email protected]> wrote:

> For the randomness argument, I am more concerned about a unit test that
> exhibits different behaviors for different runs. A stochastic test, IMHO, is
> not a good sanity test, because the code-entry quality bar becomes stochastic
> rather than deterministic, causing a lot of churn when diagnosing unit test
> failures for PRs.
>
> There are other places (nightly) to do extensive tests.
>
> PR unit tests are sanity tests and must be quick, reliable and consistent
> for every PR.
>
> Bhavin Thaker.
>
> On Mon, Oct 16, 2017 at 8:51 AM Pedro Larroy <[email protected]> wrote:
>
> > That's not true. random() and similar functions are based on a PRNG. It
> > can be debugged and it's completely deterministic; a good practice is to
> > use a known seed for this.
> >
> > More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator
> >
> > On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta <[email protected]> wrote:
> >
> >> @Chris: Any particular reason for -1? Randomness just prevents writing
> >> tests that you can rely on and/or debug later in case of failure.
> >>
> >> On Oct 16, 2017, at 8:28 AM, Chris Olivier <[email protected]> wrote:
> >>
> >> -1 for "must not use random numbers for input"
> >>
> >> On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker <[email protected]> wrote:
> >>
> >> I agree with Pedro.
> >>
> >> Based on various observations of unit test failures, I would like to
> >> propose a few guidelines to follow for the unit tests. Even though I use
> >> the word "must" for my humble opinions below, please feel free to suggest
> >> alternatives or modifications to these guidelines:
> >>
> >> 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
> >> minutes max.
> >> 1b) The total run time budget for all unit tests must be <= Y minutes.
> >> Say, Y = 60 minutes max.
> >>
> >> 2) All unit tests must have deterministic (not stochastic) behavior. That
> >> is, instead of using the random() function to test a range of input
> >> values, each input test value must be carefully hand-picked to represent
> >> the commonly used input scenarios. The correct place to stochastically
> >> test random input values is the continuously running nightly tests, NOT
> >> the sanity/smoke/unit tests run for each PR.
> >>
> >> 3) All unit tests must be as self-contained and independent of external
> >> components as possible. For example, datasets required for a unit test
> >> must NOT be hosted on an external website which, if unreachable, can
> >> cause test run failures. Instead, all datasets must be available locally.
> >>
> >> 4) It is impossible to test everything in unit tests, so only common
> >> use-cases and code-paths must be tested in unit tests. Less common
> >> scenarios, like integration with 3rd-party products, must be tested in
> >> nightly/weekly tests.
> >>
> >> 5) A unit test must NOT be disabled on a failure unless proven to exhibit
> >> unreliable behavior. The burden of proof for a test failure must be on the
> >> PR submitter, and the PR must NOT be merged without opening a new GitHub
> >> issue explaining the problem.
> >> If the unit test is disabled for some reason, then it must NOT be removed
> >> from the unit test list; instead, the unit test must be modified to add
> >> the following lines at the start of the test:
> >> Print("Unit Test DISABLED; see GitHub issue: NNNN")
> >> Exit(0)
> >>
> >> Please suggest modifications to the above proposal so that we can make
> >> the unit test framework a rock-solid foundation for the active
> >> development of Apache MXNet (Incubating).
> >>
> >> Regards,
> >> Bhavin Thaker.
> >>
> >> On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <[email protected]> wrote:
> >>
> >> Hi
> >>
> >> Some of the unit tests are extremely costly in terms of memory and
> >> compute.
> >>
> >> As an example, in the gluon tests we are loading all the datasets:
> >>
> >> test_gluon_data.test_datasets
> >>
> >> We are also running huge networks like resnets in test_gluon_model_zoo.
> >>
> >> This is ridiculously slow, and straight-up impossible on some embedded /
> >> memory-constrained devices, and in any case it makes the tests run for
> >> longer than needed.
> >>
> >> Unit tests should be small and self-contained, and if possible pure
> >> (avoiding this kind of dataset IO).
> >>
> >> I think it would be better to split them into real unit tests and
> >> extended integration test suites that do the more intensive computation.
> >> This would also help with the feedback time for PRs and the CI
> >> infrastructure.
> >>
> >> Thoughts?
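
To give guideline 5 a concrete, runnable shape, here is a minimal Python sketch of the disable pattern, assuming the suite runs under pytest/nose/unittest; the test name and the issue number NNNN are placeholders, not a real test or issue:

    # Hypothetical sketch of guideline 5: the test stays in the test list,
    # but immediately marks itself as disabled with a pointer to the issue.
    import unittest

    def test_some_operator():
        # NNNN is the placeholder issue number from the proposal above.
        raise unittest.SkipTest("Unit test DISABLED; see GitHub issue: NNNN")

Raising unittest.SkipTest (or using a skip decorator) keeps the test visible as "skipped" in every CI report, which matches the intent of the Print/Exit pseudocode while letting the runner account for it.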

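As a concrete discussion basis for the seeded-randomness point, here is a minimal sketch of a deterministic test that still uses random-looking input and a relaxed tolerance for the comparison; the operator choice (tanh) and the seed value are arbitrary placeholders:

    # Sketch of a deterministic "random" test: fixed seeds make the input
    # reproducible on every run, and an explicit tolerance absorbs harmless
    # floating-point noise instead of requiring exact equality.
    import numpy as np
    import mxnet as mx

    def test_tanh_with_seeded_input():
        np.random.seed(42)   # reproducible input data
        mx.random.seed(42)   # seed MXNet's own generators as well
        x = np.random.uniform(-1.0, 1.0, size=(3, 4)).astype('float32')
        y = mx.nd.tanh(mx.nd.array(x)).asnumpy()
        np.testing.assert_allclose(y, np.tanh(x), rtol=1e-5, atol=1e-7)

A failing run can then be reproduced exactly from the fixed seed, which is the property the PRNG argument above relies on.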