It's always going to be deterministic one way or another unless you draw randomness from the entropy pool, e.g. /dev/random. I don't think it's good practice not to seed properly and to let values depend on execution order, parallelism, time, or whatever, but that's just my opinion. I would want the same values to be used for all test runs, for reproducibility.
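To be concrete about what I mean by seeding properly, here is a minimal sketch. The test and tolerance are hypothetical examples assuming the usual numpy/mxnet APIs, not a test from our suite:

    import numpy as np
    import mxnet as mx

    def test_broadcast_add_reproducible():
        # Seed every RNG the test touches so each run sees identical
        # inputs; a failure is then reproducible instead of depending
        # on execution order, parallelism, or time.
        np.random.seed(1234)
        mx.random.seed(1234)

        a = np.random.uniform(-1.0, 1.0, (3, 4)).astype(np.float32)
        b = np.random.uniform(-1.0, 1.0, (3, 4)).astype(np.float32)
        out = (mx.nd.array(a) + mx.nd.array(b)).asnumpy()
        np.testing.assert_allclose(out, a + b, rtol=1e-5)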
I think your argument goes more towards the previously mentioned "property based testing" approach, which is in the spirit of what you are supporting, if I'm not mistaken. (See the Hypothesis sketch after the quoted thread below.)

On Mon, Oct 16, 2017 at 6:22 PM, Chris Olivier <[email protected]> wrote:

> My take on the suggestion of purely deterministic inputs (including
> deterministic seeding) is:
>
> "I want the same values to be used for all test runs because it is
> inconvenient when a unit test fails for some edge cases. I prefer that
> unforeseen edge case failures occur in the field and not during testing."
>
> Is this the motivation? Seems strange to me.
>
> On Mon, Oct 16, 2017 at 9:09 AM, Pedro Larroy <[email protected]> wrote:
>
> > I think using a properly seeded and initialized (pseudo)random source
> > is actually beneficial (and deterministic); handpicked examples are
> > usually too simplistic and miss corner cases.
> >
> > Better yet is to use property based testing, which picks corner cases
> > and does fuzzing automatically to check with a high degree of
> > confidence that a testing condition holds.
> >
> > It would probably be good if we used a property based testing library
> > in addition to nose to check invariants.
> >
> > A quick googling yields this one for Python, for example:
> > https://hypothesis.readthedocs.io/en/latest/quickstart.html. Does
> > anyone have experience with, or can recommend, a nice property based
> > testing library for Python?
> >
> > Regards
> >
> > On Mon, Oct 16, 2017 at 4:56 PM, Bhavin Thaker <[email protected]> wrote:
> >
> > > I agree with Pedro.
> > >
> > > Based on various observations of unit test failures, I would like to
> > > propose a few guidelines to follow for the unit tests. Even though I
> > > use the word “must” for my humble opinions below, please feel free
> > > to suggest alternatives or modifications to these guidelines:
> > >
> > > 1a) Each unit test must have a run time budget <= X minutes. Say,
> > > X = 2 minutes max.
> > > 1b) The total run time budget for all unit tests must be <= Y
> > > minutes. Say, Y = 60 minutes max.
> > >
> > > 2) All unit tests must have deterministic (not stochastic) behavior.
> > > That is, instead of using the random() function to test a range of
> > > input values, each input test value must be carefully hand-picked to
> > > represent the commonly used input scenarios. The correct place to
> > > stochastically test random input values is in continuously running
> > > nightly tests, NOT in the sanity/smoke/unit tests run for each PR.
> > >
> > > 3) All unit tests must be as self-contained and independent of
> > > external components as possible. For example, datasets required for
> > > a unit test must NOT be hosted on an external website which, if
> > > unreachable, can cause test run failures. Instead, all datasets must
> > > be available locally.
> > >
> > > 4) It is impossible to test everything in unit tests, so only common
> > > use-cases and code-paths must be tested in unit tests. Less common
> > > scenarios, like integration with 3rd-party products, must be tested
> > > in nightly/weekly tests.
> > >
> > > 5) A unit test must NOT be disabled on a failure unless proven to
> > > exhibit unreliable behavior. The burden-of-proof for a test failure
> > > must be on the PR submitter, and the PR must NOT be merged without
> > > opening a new GitHub issue explaining the problem.
> > > If the unit test is disabled for some reason, then the unit test
> > > must NOT be removed from the unit tests list; instead, the unit test
> > > must be modified to add the following lines at the start of the
> > > test:
> > >
> > > Print(“Unit Test DISABLED; see GitHub issue: NNNN”)
> > > Exit(0)
> > >
> > > Please suggest modifications to the above proposal so that we can
> > > make the unit test framework a rock-solid foundation for the active
> > > development of Apache MXNet (Incubating).
> > >
> > > Regards,
> > > Bhavin Thaker.
> > >
> > > On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <[email protected]> wrote:
> > >
> > > > Hi
> > > >
> > > > Some of the unit tests are extremely costly in terms of memory and
> > > > compute.
> > > >
> > > > As an example, in the gluon tests we are loading all the datasets:
> > > >
> > > > test_gluon_data.test_datasets
> > > >
> > > > We are also running huge networks like resnets in
> > > > test_gluon_model_zoo.
> > > >
> > > > This is ridiculously slow, straight up impossible on some embedded
> > > > / memory-constrained devices, and in any case it makes tests run
> > > > for longer than needed.
> > > >
> > > > Unit tests should be small and self-contained, and if possible
> > > > pure (avoiding this kind of dataset IO where possible).
> > > >
> > > > I think it would be better to split them into real unit tests and
> > > > extended integration test suites that do more intensive
> > > > computation. This would also help with feedback time for PRs and
> > > > the CI infrastructure.
> > > >
> > > > Thoughts?
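To make the Hypothesis suggestion in the quoted thread concrete, here is a minimal sketch of a property based test. The property (a relu-like clamp is non-negative and agrees with an equivalent formulation) is a hypothetical example written in plain numpy so the sketch stays self-contained; it is not a test from our suite:

    # pip install hypothesis numpy
    import numpy as np
    from hypothesis import given, strategies as st

    @given(st.lists(st.floats(allow_nan=False), min_size=1, max_size=64))
    def test_relu_properties(xs):
        # Hypothesis generates the inputs and deliberately probes corner
        # cases (0.0, -0.0, infinities, huge/tiny magnitudes); we only
        # state properties that must hold for *any* input.
        x = np.array(xs, dtype=np.float64)
        y = np.maximum(x, 0.0)
        assert (y >= 0.0).all()                    # clamp is non-negative
        assert (y == np.clip(x, 0.0, None)).all()  # formulations agree

When a property fails, Hypothesis shrinks the input to a minimal counterexample, which makes the failure easy to reproduce and report.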
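On Bhavin's point 5: with nose or plain unittest, raising SkipTest gives the effect of the Print/Exit pseudocode while keeping the disabled test visible as "skipped" in the report instead of silently passing. A sketch (the test name is a placeholder and NNNN stands for the real issue number):

    import unittest

    def test_some_flaky_operator():  # hypothetical disabled test
        raise unittest.SkipTest("Unit Test DISABLED; see GitHub issue: NNNN")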
