Hi all,

I realized that I have been doing a lot of the right things already,
according to Titus and others.  In retrospect, the reason I started this
thread was that I couldn't find much writing on this topic and wanted some
reassurance that I was on the right track.

I thought it might be useful to share what I've already been doing in my
unit tests.

1) I use the Python package "coverage".  To make things easier, I wrote a
bash script that measures coverage for the unit tests and displays the
results for the relevant source code.  I added a couple tests based on
missing coverage.
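
A rough Python equivalent of that wrapper script looks like this (a
sketch only; "mypackage" and "tests" are placeholders for my actual
layout):

    # run_coverage.py -- sketch of a coverage-measuring test runner
    import unittest

    import coverage

    cov = coverage.Coverage(source=["mypackage"])
    cov.start()
    suite = unittest.defaultTestLoader.discover("tests")
    unittest.TextTestRunner().run(suite)
    cov.stop()
    cov.save()
    cov.report(show_missing=True)  # list the lines the tests never hit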

2) My software uses a command line interface based on the Python standard
library argparse.  I have a lot of unit tests that simply check input and
output from running commands.  Many of them check for bad / invalid
arguments, or check the output for things like: if I ask for 4 harmonics,
I should get 4 columns of harmonic numbers.
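
A minimal sketch of one of these CLI tests (the tool name and flag are
placeholders, not my real command):

    import subprocess
    import unittest

    class TestCommandLine(unittest.TestCase):
        def run_tool(self, *args):
            # "mytool" is a stand-in for my actual console entry point
            return subprocess.run(
                ["mytool"] + list(args),
                stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                universal_newlines=True)

        def test_invalid_argument_rejected(self):
            # argparse exits with a nonzero status on a bad argument
            result = self.run_tool("--harmonics", "not_a_number")
            self.assertNotEqual(result.returncode, 0)

        def test_harmonic_count_matches_columns(self):
            # asking for 4 harmonics should give 4 columns of output
            result = self.run_tool("--harmonics", "4")
            first_row = result.stdout.splitlines()[0].split()
            self.assertEqual(len(first_row), 4)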

3) Since the software I'm writing is a Python port of Matlab scripts, I
check numerical results from my Python code against the output from the
Matlab scripts.  I extended the unittest TestCase class with a specialized
assert function that uses NumPy's allclose() function to compare floating
point results.  I couldn't use the NumPy testing module directly because
NumPy uses nose as its test framework, whereas I'm using unittest.
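
The subclass is small; roughly this (the tolerances shown are just
NumPy's allclose() defaults, not necessarily the right ones for you):

    import unittest

    import numpy as np

    class NumericTestCase(unittest.TestCase):
        """unittest.TestCase plus a floating point array assertion."""

        def assertAllClose(self, actual, desired, rtol=1e-5, atol=1e-8):
            # np.allclose returns a plain bool, so it slots into assertTrue
            self.assertTrue(
                np.allclose(actual, desired, rtol=rtol, atol=atol),
                msg="%r not close to %r" % (actual, desired))

My numerical tests then subclass NumericTestCase and compare the Python
output against the saved Matlab output with self.assertAllClose().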

4) I do check a few numbers output by algorithms in my unit tests, and I
wondered whether there was any point to that, but the slide in Hans's
presentation makes a good point: it's useful for checking later versions
of the code against earlier ones.  Also, since I want the software to be
cross-platform on Windows/Mac/Linux, it's a good way to check that all
platforms give consistent results.
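
These regression tests look something like the following (the
computation and values below are invented just to show the shape of the
test, not taken from my code):

    import unittest

    import numpy as np

    class TestRegression(unittest.TestCase):
        def test_known_spectrum(self):
            # "expected" was captured from an earlier trusted run and
            # pasted in; later versions (and other platforms) must match
            signal = np.array([1.0, -1.0, 1.0, -1.0])
            expected = np.array([0.0, 0.0, 4.0, 0.0])
            result = np.abs(np.fft.fft(signal))
            self.assertTrue(np.allclose(result, expected))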

5) I also have a few unit tests where the parameters to an algorithm are
changed from the default and the numbers are checked.  This is useful for
showing that the numbers change when the parameters change, though there is
no assurance that the numbers are actually "correct".  I wish I had some
beautiful example with a known analytical result like the 2D Ising model,
but I don't think I have any cases like that for the algorithms I'm using.
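
A toy version of such a test (analyze() here is a made-up stand-in for
the real routine, just so the sketch runs):

    import unittest

    import numpy as np

    def analyze(signal, smoothing=0.5):
        # stand-in whose output depends on the parameter under test
        return signal * smoothing

    class TestParameters(unittest.TestCase):
        def test_smoothing_changes_output(self):
            signal = np.array([1.0, 2.0, 3.0])
            default = analyze(signal)
            modified = analyze(signal, smoothing=0.9)
            # the numbers should change when the parameter changes
            self.assertFalse(np.allclose(default, modified))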

6) I have a couple helper functions that have simple exact answers, so I
can write unit tests to check for accurate results.  For example, I have a
helper function that rounds numbers according to the "half away from zero"
method.  Sort of like checking "2+2 = 4" as Titus mentioned.
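
That test is as simple as it sounds (this is a sketch of the helper,
not my exact implementation):

    import math
    import unittest

    def round_half_away_from_zero(x):
        """Round to the nearest integer, ties going away from zero."""
        return math.copysign(math.floor(abs(x) + 0.5), x)

    class TestRounding(unittest.TestCase):
        def test_half_away_from_zero(self):
            self.assertEqual(round_half_away_from_zero(2.5), 3.0)
            self.assertEqual(round_half_away_from_zero(-2.5), -3.0)
            self.assertEqual(round_half_away_from_zero(2.4), 2.0)
            self.assertEqual(round_half_away_from_zero(-2.4), -2.0)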

7) Anywhere that I raise an exception in the source code, I try to make
sure there's a matching unit test that deliberately tries to trigger the
exception and checks that the correct exception was raised.
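
In unittest these look like this (harmonic_numbers() is a made-up
example with the same error handling style as my real code):

    import unittest

    def harmonic_numbers(count):
        if count < 1:
            raise ValueError("count must be a positive integer")
        return list(range(1, count + 1))

    class TestErrors(unittest.TestCase):
        def test_bad_count_raises(self):
            # deliberately trigger the exception and check its type
            with self.assertRaises(ValueError):
                harmonic_numbers(0)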

Thanks everyone for your help, and hope everyone is learning useful things
from this thread.

Terri

On Tue, Jul 18, 2017 at 9:38 AM, Paul Wilson <[email protected]> wrote:

> Hi Terri,
>
> I'll briefly add that testing is asymptotic (as suggested by Titus below),
> so it may be difficult to have "every" test.  We rely on code review to
> help identify missing tests, particularly for new code - but also for older
> code.
>
> Paul
>
>
>
> On 07/17/2017 11:11 AM, C. Titus Brown wrote:
>
>> Hi Terri,
>>
>> I think lots of people in the scientific Python community write their
>> own algorithms and test them.  But it's hard to give generic advice here,
>> I think, because it's so dependent on your algorithm.
>>
>> Here's my try / our approach that has worked well for us over the last
>> decade or so.
>>
>> * first, write automated "smoke" tests that check to see if your code is
>>    basically running/working.  They should be as dumb and robust as
>>    possible (e.g. the equivalent of "check that 2+2 = 4").
>>
>>    These are by far the most important in my experience, in that they
>>    deliver the most value for the least effort.
>>
>> * set up CI on those tests.
>>
>> * check code coverage of your code base, and try to get it to 30-40%
>>    by testing the basic code paths.
>>
>> * write a series of basic tests for edge cases (divide by zero, boundary
>>    conditions, that kind of thing), trying to cover another 10-20%.
>>
>> * as your code base matures and complexifies, write tests for new
>>    functionality and try to cover old functionality as well.  Here code
>>    coverage is your friend in terms of targeting effort.
>>
>> * whenever you discover a bug, write a test against that bug before
>>    fixing it.  That way your most error prone bits will get more
>>    coverage adaptively.  I call this "stupidity driven testing."
>>
>> Lather, rinse, repeat.
>>
>> tl; dr? smoke tests, code coverage analysis, test against buggy code.
>>
>> best,
>> --titus
>>
>> On Mon, Jul 17, 2017 at 11:50:59AM -0400, Terri Yu wrote:
>>
>>> Thanks everyone, those are interesting resources for testing in general.
>>>
>>> I'm using Python's unittest framework and everything is already set up.
>>> The specific problem I need help with is what tests to write, in order to
>>> test numerical floating point output from algorithms.  Given the
>>> responses
>>> I've gotten, it seems like not many people write their own algorithms
>>> and/or test them.
>>>
>>> Terri
>>>
>>> On Sun, Jul 16, 2017 at 5:50 PM, Jeremy Gray <[email protected]>
>>> wrote:
>>>
>>>> Hi Terri,
>>>>
>>>> It might also be worth checking out the workshop from this year's
>>>> PyCon by Eric Ma, "Best Testing Practices for Data Science", on
>>>> YouTube here:
>>>> https://www.youtube.com/watch?v=yACtdj1_IxE
>>>>
>>>> The github repo is here:
>>>> https://github.com/ericmjl/data-testing-tutorial
>>>>
>>>> Cheers,
>>>> Jeremy
>>>>
>>>> On Fri, Jul 14, 2017 at 5:21 PM, Olav Vahtras <[email protected]>
>>>> wrote:
>>>>
>>>>> Dear Terri,
>>>>>
>>>>> In addition I can recommend the following resource:
>>>>>
>>>>> pythontesting.net has a podcast series on testing and more.  Check out
>>>>> the new book on pytest by the site maintainer, Brian Okken.
>>>>>
>>>>> Regards
>>>>> Olav
>>>>>
>>>>> On 14 July 2017, at 21:36, Ashwin Srinath <[email protected]> wrote:
>>>>>>
>>>>>> If you're using Python, numpy.testing has the tools you'll need:
>>>>>>
>>>>>> https://docs.scipy.org/doc/numpy/reference/routines.testing.html
>>>>>>
>>>>>> There's also pandas.testing for testing code that uses Pandas.
>>>>>>
>>>>>> Thanks,
>>>>>> Ashwin
>>>>>>
>>>>>> On Fri, Jul 14, 2017 at 3:27 PM, Terri Yu <[email protected]> wrote:
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> Are there any resources that explain how to write unit tests for
>>>>>>> scientific software?  I'm writing some software that processes audio
>>>>>>> signals and there are many parameters.  I'm wondering what's the best
>>>>>>> way to test floating point numeric results.
>>>>>>>
>>>>>>> Do I need to test every single parameter?  How can I verify accuracy
>>>>>>> of numeric results... use a different language / library?  I would
>>>>>>> like to do a good job of testing, but I also don't want to write a
>>>>>>> bunch of semi-useless tests that take a long time to run.
>>>>>>>
>>>>>>> I would appreciate any thoughts you have.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Terri
>>>>>>>
>>
> --
> -- ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ --
>
> Paul P.H. Wilson
> Grainger Professor of Nuclear Engineering
> 608-263-0807
> [email protected]
> 443 Engineering Research Bldg
> 1500 Engineering Dr, Madison, WI 53706
> calendar: http://go.wisc.edu/pphw-cal
>
> Computational Nuclear Engineering Research Group
> cnerg.engr.wisc.edu
>
>
_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/listinfo/discuss
