My take on the suggestion of purely deterministic inputs (including deterministic seeding) is:

"I want the same values to be used for all test runs because it is inconvenient when a unit test fails for some edge cases. I prefer that unforeseen edge-case failures occur in the field and not during testing."

Is this the motivation? Seems strange to me.

On Mon, Oct 16, 2017 at 9:09 AM, Pedro Larroy <[email protected]> wrote:

> I think using a properly seeded and initialized (pseudo)random is actually
> beneficial (and deterministic), handpicked examples are usually too
> simplistic and miss corner cases.
>
> Better yet is to use property based testing, which will pick corner cases
> and do fuzzing automatically to check with high degree of confidence that a
> testing condition holds.
>
> Probably it would be good if we use a property based testing library in
> addition to nose to check invariants.
>
> A quick googling yields this one for python for example:
> https://hypothesis.readthedocs.io/en/latest/quickstart.html does anyone
> have experience or can recommend a nice property based testing library for
> python?
>
>
> Regards
>
> On Mon, Oct 16, 2017 at 4:56 PM, Bhavin Thaker <[email protected]>
> wrote:
>
> > I agree with Pedro.
> >
> > Based on various observations on unit test failures, I would like to
> > propose a few guidelines to follow for the unit tests. Even though I use
> > the word, “must” for my humble opinions below, please feel free to suggest
> > alternatives or modifications to these guidelines:
> >
> > 1) 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
> > minutes max.
> > 1b) The total run time budget for all unit tests <= Y minutes. Say, Y = 60
> > minutes max.
> >
> > 2) All Unit tests must have deterministic (not Stochastic) behavior. That
> > is, instead of using the random() function to test a range of input values,
> > each input test value must be carefully hand-picked to represent the
> > commonly used input scenarios.
> > The correct place to stochastically test random input values is to have
> > continuously running nightly tests and NOT the sanity/smoke/unit tests
> > for each PR.
> >
> > 3) All Unit tests must be as much self-contained and independent of
> > external components as possible. For example, datasets required for the
> > unit test must NOT be present on an external website which, if unreachable,
> > can cause test run failures. Instead, all datasets must be available
> > locally.
> >
> > 4) It is impossible to test everything in unit tests and so only common
> > use-cases and code-paths must be tested in unit-tests. Less common
> > scenarios like integration with 3rd party products must be tested in
> > nightly/weekly tests.
> >
> > 5) A unit test must NOT be disabled on a failure unless proven to exhibit
> > unreliable behavior. The burden-of-proof for a test failure must be on the
> > PR submitter and the PR must NOT be merged without opening a new GitHub
> > issue explaining the problem. If the unit test is disabled for some reason,
> > then the unit test must NOT be removed from the unit tests list; instead
> > the unit test must be modified to add the following lines at the start of
> > the test:
> > Print(“Unit Test DISABLED; see GitHub issue: NNNN”)
> > Exit(0)
> >
> > Please suggest modifications to the above proposal such that we can make
> > the unit tests framework to be the rock-solid foundation for the active
> > development of Apache MXNet (Incubating).
> >
> > Regards,
> > Bhavin Thaker.
> >
> >
> > On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <[email protected]>
> > wrote:
> >
> > > Hi
> > >
> > > Some of the unit tests are extremely costly in terms of memory and
> > > compute.
> > >
> > > As an example in the gluon tests we are loading all the datasets.
> > >
> > > test_gluon_data.test_datasets
> > >
> > > Also running huge networks like resnets in test_gluon_model_zoo.
> > >
> > > This is ridiculously slow, and straight impossible on some embedded /
> > > memory constrained devices, and anyway is making tests run for longer
> > > than needed.
> > >
> > > Unit tests should be small, self contained, if possible pure (avoiding
> > > this kind of dataset IO if possible).
> > >
> > > I think it would be better to split them in real unit tests and extended
> > > integration test suites that do more intensive computation. This would
> > > also help with the feedback time with PRs and CI infrastructure.
> > >
> > >
> > > Thoughts?
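
To make the "seeded (pseudo)random" idea from this thread concrete, here is a minimal sketch using only the Python standard library. It is not Hypothesis and not anything that exists in the MXNet test suite: the helper `run_seeded_property`, the generator `gen_int_list`, and the `TEST_SEED` environment variable are all hypothetical names chosen for illustration. The point it tries to show is that a randomized test can still be reproducible, as long as the seed is reported on failure and can be supplied again.

```python
import os
import random


def run_seeded_property(check, gen, trials=200, seed=None):
    """Call check(gen(rng)) `trials` times; on failure, report the seed used.

    All names here are hypothetical; this is a sketch, not an MXNet API.
    """
    if seed is None:
        # TEST_SEED is an assumed convention for replaying a failing run.
        env = os.environ.get("TEST_SEED")
        if env is not None:
            seed = int(env)
        else:
            seed = random.SystemRandom().randrange(2**32)
    rng = random.Random(seed)  # private generator, no global state touched
    for trial in range(trials):
        value = gen(rng)
        try:
            check(value)
        except AssertionError as exc:
            raise AssertionError(
                f"property failed on trial {trial} for input {value!r}; "
                f"rerun with seed={seed}"
            ) from exc
    return seed


def gen_int_list(rng):
    """Draw a list of 0 to 20 integers in [-1000, 1000]."""
    return [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 20))]


def check_sorted_properties(xs):
    """Invariants that sorted() should satisfy for any list of integers."""
    once = sorted(xs)
    assert len(once) == len(xs)                         # nothing lost or added
    assert all(a <= b for a, b in zip(once, once[1:]))  # non-decreasing order
    assert sorted(once) == once                         # sorting is idempotent


if __name__ == "__main__":
    used = run_seeded_property(check_sorted_properties, gen_int_list)
    print(f"all trials passed (seed={used})")
```

With this shape, a per-PR run could pin the seed for fully repeatable behavior, while a nightly run could leave it unset and rely on the reported seed to replay any failure. Whether that split is the right one is exactly the question being debated above.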
