It would be great if there is a chance to add a few test cases that reflect these principles, so we have a concrete basis for discussion.
Having a seeded random number is good, but usually it is not the cause of non-deterministic errors (most of which are already resolved by having a relaxed tolerance level).

Tianqi

On Mon, Oct 16, 2017 at 9:01 AM Bhavin Thaker <[email protected]> wrote:

> For the randomness argument, I am more concerned about a unit test that
> exhibits different behaviors for different runs. A stochastic test, IMHO, is
> not a good sanity test, because the code-entry quality bar becomes stochastic
> rather than deterministic, causing a lot of churn when diagnosing unit test
> failures for PRs.
>
> There are other places (nightly) to do extensive tests.
>
> PR unit tests are sanity tests and must be quick, reliable and consistent
> for every PR.
>
> Bhavin Thaker.
>
> On Mon, Oct 16, 2017 at 8:51 AM Pedro Larroy <[email protected]> wrote:
>
> > That's not true. random() and similar functions are based on a PRNG. It
> > can be debugged and it's completely deterministic; a good practice is to
> > use a known seed for this.
> >
> > More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator
> >
> > On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta <[email protected]> wrote:
> >
> >> @Chris: Any particular reason for -1? Randomness just prevents writing
> >> tests that you can rely on and/or debug later in case of failure.
> >>
> >> On Oct 16, 2017, at 8:28 AM, Chris Olivier <[email protected]> wrote:
> >>
> >> -1 for "must not use random numbers for input"
> >>
> >> On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker <[email protected]> wrote:
> >>
> >> I agree with Pedro.
> >>
> >> Based on various observations of unit test failures, I would like to
> >> propose a few guidelines to follow for the unit tests. Even though I use
> >> the word "must" for my humble opinions below, please feel free to suggest
> >> alternatives or modifications to these guidelines:
> >>
> >> 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
> >> minutes max.
> >> 1b) The total run time budget for all unit tests must be <= Y minutes.
> >> Say, Y = 60 minutes max.
> >>
> >> 2) All unit tests must have deterministic (not stochastic) behavior. That
> >> is, instead of using the random() function to test a range of input
> >> values, each input test value must be carefully hand-picked to represent
> >> the commonly used input scenarios. The correct place to stochastically
> >> test random input values is the continuously running nightly tests, NOT
> >> the sanity/smoke/unit tests run for each PR.
> >>
> >> 3) All unit tests must be as self-contained and independent of external
> >> components as possible. For example, datasets required for a unit test
> >> must NOT be hosted on an external website which, if unreachable, can
> >> cause test run failures. Instead, all datasets must be available locally.
> >>
> >> 4) It is impossible to test everything in unit tests, so only common
> >> use-cases and code-paths must be tested in unit tests. Less common
> >> scenarios, like integration with 3rd-party products, must be tested in
> >> nightly/weekly tests.
> >>
> >> 5) A unit test must NOT be disabled on a failure unless proven to exhibit
> >> unreliable behavior. The burden of proof for a test failure must be on the
> >> PR submitter, and the PR must NOT be merged without opening a new GitHub
> >> issue explaining the problem.
> >> If the unit test is disabled for some reason, then it must NOT be removed
> >> from the unit test list; instead, the unit test must be modified to add
> >> the following lines at the start of the test:
> >> Print("Unit Test DISABLED; see GitHub issue: NNNN")
> >> Exit(0)
> >>
> >> Please suggest modifications to the above proposal so that we can make
> >> the unit test framework a rock-solid foundation for the active
> >> development of Apache MXNet (Incubating).
> >>
> >> Regards,
> >> Bhavin Thaker.
> >>
> >> On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <[email protected]> wrote:
> >>
> >> Hi
> >>
> >> Some of the unit tests are extremely costly in terms of memory and
> >> compute.
> >>
> >> As an example, in the gluon tests we are loading all the datasets:
> >>
> >> test_gluon_data.test_datasets
> >>
> >> We are also running huge networks like resnets in test_gluon_model_zoo.
> >>
> >> This is ridiculously slow, and straight-up impossible on some embedded /
> >> memory-constrained devices, and in any case it makes the tests run for
> >> longer than needed.
> >>
> >> Unit tests should be small and self-contained, and if possible pure
> >> (avoiding this kind of dataset IO).
> >>
> >> I think it would be better to split them into real unit tests and
> >> extended integration test suites that do the more intensive computation.
> >> This would also help with the feedback time for PRs and the CI
> >> infrastructure.
> >>
> >> Thoughts?
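
To give guideline 5 a concrete, runnable shape, here is a minimal Python sketch of the disable pattern, assuming the suite runs under pytest/nose/unittest; the test name and the issue number NNNN are placeholders, not a real test or issue:

    # Hypothetical sketch of guideline 5: the test stays in the test list,
    # but immediately marks itself as disabled with a pointer to the issue.
    import unittest

    def test_some_operator():
        # NNNN is the placeholder issue number from the proposal above.
        raise unittest.SkipTest("Unit test DISABLED; see GitHub issue: NNNN")

Raising unittest.SkipTest (or using a skip decorator) keeps the test visible as "skipped" in every CI report, which matches the intent of the Print/Exit pseudocode while letting the runner account for it.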

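As a concrete discussion basis for the seeded-randomness point, here is a minimal sketch of a deterministic test that still uses random-looking input and a relaxed tolerance for the comparison; the operator choice (tanh) and the seed value are arbitrary placeholders:

    # Sketch of a deterministic "random" test: fixed seeds make the input
    # reproducible on every run, and an explicit tolerance absorbs harmless
    # floating-point noise instead of requiring exact equality.
    import numpy as np
    import mxnet as mx

    def test_tanh_with_seeded_input():
        np.random.seed(42)   # reproducible input data
        mx.random.seed(42)   # seed MXNet's own generators as well
        x = np.random.uniform(-1.0, 1.0, size=(3, 4)).astype('float32')
        y = mx.nd.tanh(mx.nd.array(x)).asnumpy()
        np.testing.assert_allclose(y, np.tanh(x), rtol=1e-5, atol=1e-7)

A failing run can then be reproduced exactly from the fixed seed, which is the property the PRNG argument above relies on.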