Re: Improving and rationalizing unit tests

2017-10-16 Thread Dick Carter
This thread is very timely, as it relates to issues I've been trying to solve with my coding of the last week. I think it simplest just to present the code and detailed PR description as the clearest form of my thinking. Happy to discuss further. Please see:

Re: Improving and rationalizing unit tests

2017-10-16 Thread Zha, Sheng
Subject: Re: Improving and rationalizing unit tests My argument is that I am actually categorically against having a requirement that the same input values be used for testing for every run. I don't personally view "convenience in reproducing" as outweighing "finding

RE: Improving and rationalizing unit tests

2017-10-16 Thread kellen sunderland
in source control. -Kellen From: Chris Olivier Sent: Monday, October 16, 2017 6:46 PM To: dev@mxnet.incubator.apache.org Subject: Re: Improving and rationalizing unit tests My argument is that I am actually categorically against having a requirement that the same input values be used for testing

Re: Improving and rationalizing unit tests

2017-10-16 Thread Chris Olivier
My argument is that I am actually categorically against having a requirement that the same input values be used for testing for every run. I don't personally view "convenience in reproducing" as outweighing "finding edge cases that I didn't think of or that haven't been tried before". On Mon,

Re: Improving and rationalizing unit tests

2017-10-16 Thread Pedro Larroy
It's always going to be deterministic one way or another unless you use random from the entropy pool such as /dev/random. I don't think it's a good practice not to seed properly and have values depend on execution order / parallelism / time or whatever, but that's just my opinion. I would want to

Re: Improving and rationalizing unit tests

2017-10-16 Thread Chris Olivier
My take on the suggestion of purely deterministic inputs (including deterministic seeding) is: "I want the same values to be used for all test runs because it is inconvenient when a unit test fails for some edge cases. I prefer that unforeseen edge case failures occur in the field and not during

Re: Improving and rationalizing unit tests

2017-10-16 Thread Chris Olivier
I'd like to respectfully dispute the assumption that it's hard to debug with random values: If a test is failing with any sort of frequency, it's easy to come up with offending values by running the test in a loop for 1 times or so. I did this just yesterday to prove a test case was also
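
A minimal sketch of the run-it-in-a-loop approach described above, in Python. The check function, the trial count, and the idea of giving every trial its own recorded seed are illustrative assumptions, not code from this thread.

```python
import random


def property_under_test(x):
    """Hypothetical stand-in for a real unit test body."""
    # Deliberately fails on a narrow band of inputs to mimic a rare edge case.
    return not (x > 0.99)


def find_failing_seeds(trials, master_seed=None):
    """Run the check `trials` times and collect (seed, input) pairs that fail."""
    seeder = random.Random(master_seed)  # master_seed=None gives fresh values per run
    failures = []
    for _ in range(trials):
        seed = seeder.randrange(2**32)    # one seed per trial, kept for replay
        x = random.Random(seed).random()  # input derived only from that seed
        if not property_under_test(x):
            failures.append((seed, x))
    return failures
```

Each recorded seed regenerates its offending input exactly, so a failure found by the loop can be replayed as a single deterministic case.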

Re: Improving and rationalizing unit tests

2017-10-16 Thread Tianqi Chen
It would be great if a few test cases could reflect these principles, so we have a concrete basis for discussion. Having seeded random numbers is good, but usually it is not the cause of non-deterministic errors (most of which are already resolved by having a relaxed tolerance level).
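
To make the "relaxed tolerance level" point concrete, here is a small Python sketch. The helper name check_close and the default tolerances are hypothetical and chosen for illustration; this is not the project's own comparison utility.

```python
import math


def check_close(actual, expected, rel_tol=1e-5, abs_tol=1e-8):
    """Raise AssertionError if any pair differs by more than the tolerances.

    Hypothetical helper: the name and default tolerances are assumptions.
    """
    if len(actual) != len(expected):
        raise AssertionError("length mismatch")
    for i, (a, e) in enumerate(zip(actual, expected)):
        if not math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol):
            raise AssertionError(f"index {i}: {a!r} is not close to {e!r}")


# Exact comparison rejects this pair because of floating-point rounding,
# while the tolerance-based check accepts it.
check_close([0.1 + 0.2], [0.3])
```

A tolerance that is too loose can hide real regressions, so the values would need to be chosen per operator and per data type.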

Re: Improving and rationalizing unit tests

2017-10-16 Thread Bhavin Thaker
For the randomness argument, I am more concerned about a unit test that exhibits different behaviors for different runs. A stochastic test, IMHO, is not a good sanity test, because the code entry quality bar is stochastic rather than deterministic, causing a lot of churn for diagnosing unit test

Re: Improving and rationalizing unit tests

2017-10-16 Thread pracheer gupta
That’s true Pedro. I assumed, in this particular context, when we say “random” numbers we mean random numbers which have not been explicitly seeded which make the intermittently failing unit tests hard to reproduce. On Oct 16, 2017, at 8:51 AM, Pedro Larroy
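
One possible way to keep fresh inputs on every run while still making failures reproducible is to choose a seed at random, but report it when a test fails. The Python sketch below assumes a hypothetical TEST_SEED environment variable and hypothetical helper names; none of these come from the thread or from the project.

```python
import os
import random


def resolve_seed(env_var="TEST_SEED"):
    """Use the seed from the environment if set, otherwise pick a fresh one."""
    value = os.environ.get(env_var)  # TEST_SEED is a hypothetical variable name
    if value is not None:
        return int(value)
    return random.SystemRandom().randrange(2**31)  # OS entropy, differs per run


def run_with_logged_seed(test_fn, env_var="TEST_SEED"):
    """Run test_fn(rng) with a seeded generator and report the seed on failure."""
    seed = resolve_seed(env_var)
    rng = random.Random(seed)
    try:
        test_fn(rng)
    except AssertionError:
        print(f"test failed; rerun with {env_var}={seed} to reproduce")
        raise
    return seed
```

With this shape, an intermittent failure prints the seed that triggered it, and setting the variable to that value replays the same inputs.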

Re: Improving and rationalizing unit tests

2017-10-16 Thread Pedro Larroy
That's not true. random() and similar functions are based on a PRNG. It can be debugged and it's completely deterministic; a good practice is to use a known seed for this. More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta
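
As a small illustration of the point about known seeds, the Python sketch below uses the standard library's random module. The function name sample_inputs and the seed values are arbitrary choices made for this example.

```python
import random


def sample_inputs(seed, n=5):
    """Draw n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, global state is untouched
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


first = sample_inputs(seed=42)
second = sample_inputs(seed=42)  # same seed, so the same sequence
other = sample_inputs(seed=43)   # different seed, so a different sequence
```

Using a private random.Random instance rather than the module-level functions keeps the values independent of test execution order.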

Re: Improving and rationalizing unit tests

2017-10-16 Thread pracheer gupta
@Chris: Any particular reason for -1? Randomness just prevents writing tests that you can rely on and/or debug later on in case of failure. On Oct 16, 2017, at 8:28 AM, Chris Olivier wrote: -1 for "must not use random numbers for input"

Re: Improving and rationalizing unit tests

2017-10-16 Thread Chris Olivier
-1 for "must not use random numbers for input" On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker wrote: > I agree with Pedro. > > Based on various observations on unit test failures, I would like to > propose a few guidelines to follow for the unit tests. Even though I use

Re: Improving and rationalizing unit tests

2017-10-16 Thread Bhavin Thaker
I agree with Pedro. Based on various observations on unit test failures, I would like to propose a few guidelines to follow for the unit tests. Even though I use the word “must” for my humble opinions below, please feel free to suggest alternatives or modifications to these guidelines: 1) 1a)

Improving and rationalizing unit tests

2017-10-16 Thread Pedro Larroy
Hi, Some of the unit tests are extremely costly in terms of memory and compute. As an example, in the gluon tests we are loading all the datasets (test_gluon_data.test_datasets), and also running huge networks like resnets in test_gluon_model_zoo. This is ridiculously slow, and straight impossible on