Re: Improving and rationalizing unit tests

Bhavin Thaker Mon, 16 Oct 2017 09:01:45 -0700

For the randomness argument, I am more concerned of a unit test that
exhibits different behaviors for different runs. Stochastic test, IMHO, is
not a good sanity test, because the code entry Quality bar is stochastic
rather than deterministic — causing a lot of churn for diagnosing Unit test
failures for PRs.


There are other places (nightly) to do extensive tests.

PR-unit-tests are sanity tests and must be quick, reliable and consistent
for every PR.

Bhavin Thaker.

On Mon, Oct 16, 2017 at 8:51 AM Pedro Larroy <[email protected]>
wrote:

> That's not true. random() and similar functions are based on a PRNG. It
> can be debugged and it's completely deterministic, a good practice is to
> use a known seed for this.
>
> More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator
>
> On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta <
> [email protected]> wrote:
>
>> @Chris: Any particular reason for -1? Randomness just prevents in writing
>> tests that you can rely on and/or debug later on in case of failure.
>>
>> On Oct 16, 2017, at 8:28 AM, Chris Olivier <[email protected]<mailto:
>> [email protected]>> wrote:
>>
>> -1 for "must not use random numbers for input"
>>
>> On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker <[email protected]
>> <mailto:[email protected]>>
>
>
>> wrote:
>>
>> I agree with Pedro.
>>
>> Based on various observations on unit test failures, I would like to
>> propose a few guidelines to follow for the unit tests. Even though I use
>> the word, “must” for my humble opinions below, please feel free to suggest
>> alternatives or modifications to these guidelines:
>>
>> 1) 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
>> minutes max.
>> 1b) The total run time budget for all unit tests <= Y minutes. Say, Y = 60
>> minutes max.
>>
>> 2) All Unit tests must have deterministic (not Stochastic) behavior. That
>> is, instead of using the random() function to test a range of input
>> values,
>> each input test value must be carefully hand-picked to represent the
>> commonly used input scenarios. The correct place to stochastically test
>> random input values is to have continuously running nightly tests and NOT
>> the sanity/smoke/unit tests for each PR.
>>
>> 3) All Unit tests must be as much self-contained and independent of
>> external components as possible. For example, datasets required for the
>> unit test must NOT be present on external website which, if unreachable,
>> can cause test run failures. Instead, all datasets must be available
>> locally.
>>
>> 4) It is impossible to test everything in unit tests and so only common
>> use-cases and code-paths must be tested in unit-tests. Less common
>> scenarios like integration with 3rd party products must be tested in
>> nightly/weekly tests.
>>
>> 5) A unit test must NOT be disabled on a failure unless proven to exhibit
>> unreliable behavior. The burden-of-proof for a test failure must be on the
>> PR submitter and the PR must NOT be merged without a opening a new github
>> issue explaining the problem. If the unit test is disabled for some
>> reason,
>> then the unit test must NOT be removed from the unit tests list; instead
>> the unit test must be modified to add the following lines at the start of
>> the test:
>>    Print(“Unit Test DISABLED; see GitHub issue: NNNN”)
>>    Exit(0)
>>
>> Please suggest modifications to the above proposal such that we can make
>> the unit tests framework to be the rock-solid foundation for the active
>> development of Apache MXNet (Incubating).
>>
>> Regards,
>> Bhavin Thaker.
>>
>>
>> On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <
>> [email protected]<mailto:[email protected]>
>
>
>>
>> wrote:
>>
>> Hi
>>
>> Some of the unit tests are extremely costly in terms of memory and
>> compute.
>>
>> As an example in the gluon tests we are loading all the datasets.
>>
>> test_gluon_data.test_datasets
>>
>> Also running huge networks like resnets in test_gluon_model_zoo.
>>
>> This is ridiculously slow, and straight impossible on some embedded /
>> memory constrained devices, and anyway is making tests run for longer
>> than
>> needed.
>>
>> Unit tests should be small, self contained, if possible pure (avoiding
>> this
>> kind of dataset IO if possible).
>>
>> I think it would be better to split them in real unit tests and extended
>> integration test suites that do more intensive computation. This would
>> also
>> help with the feedback time with PRs and CI infrastructure.
>>
>>
>> Thoughts?
>>
>>
>>

Re: Improving and rationalizing unit tests

Reply via email to