Thanks everyone for the discussion on this very important topic. I think we should move this discussion off-list, but I'm not sure where to move it.
We've had "best practices" blog posts and papers in the past, and I think a blog post summary would be a great conclusion. Is there a volunteer to summarize all this great info and these resources?

Sincerely,
Cam

From: Discuss [mailto:[email protected]] On Behalf Of Gamblin, Todd
Sent: Tuesday, July 18, 2017 6:42 PM
To: Terri Yu <[email protected]>
Cc: Software Carpentry Discussion <[email protected]>
Subject: Re: [Discuss] writing unit tests for scientific software

Terri: you might be interested in services like codecov: https://codecov.io/

It will automatically display coverage reports if you set up your CI service (e.g. Travis) to submit them. Travis integration is pretty simple. We use codecov for Spack, and it has a nice Chrome plugin you can use to see coverage diffs in pull requests. You can also use it to add static checks to enforce that every PR submitted has a certain amount of test coverage.

The Spack repo with a coverage badge is here: https://github.com/LLNL/spack
The corresponding codecov visualization is here: https://codecov.io/gh/LLNL/spack
Or you can look at it by file and drill down: https://codecov.io/gh/LLNL/spack/list/develop/

-Todd

On Jul 18, 2017, at 5:26 PM, Terri Yu <[email protected]> wrote:

Hi all,

I realized that I have been doing a lot of the right things already, according to Titus and others. In retrospect, the reason I started this thread was that I couldn't find much writing on this topic and wanted some reassurance that I was on the right track. I thought it might be useful to share what I've already been doing in my unit tests (rough sketches of several of these appear right after this message).

1) I use the Python package "coverage". To make things easier, I wrote a bash script that measures coverage for the unit tests and displays the results for the relevant source code. I added a couple of tests based on missing coverage.

2) My software uses a command line interface based on the Python standard library argparse. I have a lot of unit tests that simply check input and output from running commands. A lot of it is checking for bad/invalid arguments, or checking output for things like: if I asked for 4 harmonics, I should get 4 columns of harmonic numbers.

3) Since the software I'm writing is a Python port of Matlab scripts, I check numerical results from my Python code against the output of the Matlab scripts. I extended the unittest TestCase class with a specialized assert function that uses NumPy's allclose() function to compare floating point results. I couldn't use the NumPy testing module directly because NumPy uses nose as its test framework, whereas I'm using unittest.

4) I do check a few numbers output by the algorithms in my unit tests, and I was wondering if there was any point in that, but the slide in Hans's presentation makes the good point that it's useful for checking earlier versions of the code against itself. Also, since I want the software to be cross-platform on Windows/Mac/Linux, it's a good way to check that all platforms give consistent results.

5) I also have a few unit tests where the parameters to an algorithm are changed from the default and the numbers are checked. This is useful for showing that the numbers change when the parameters change, though there is no assurance that the numbers are actually "correct". I wish I had some beautiful example with a known analytical result like the 2D Ising model, but I don't think I have any cases like that for the algorithms I'm using.

6) I have a couple of helper functions that have simple exact answers, so I can write unit tests that check for accurate results. For example, I have a helper function that rounds numbers according to the "half away from zero" method. Sort of like checking "2 + 2 = 4", as Titus mentioned.

7) Anywhere I raise an exception in the source code, I try to make sure there's a matching unit test that purposely triggers it and checks that the correct exception was raised.

Thanks everyone for your help, and I hope everyone is learning useful things from this thread.

Terri
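For concreteness, here is a rough sketch of the kind of coverage run Terri describes in point 1, written with coverage.py's Python API rather than a bash script. The package name "myaudio" and the "tests" directory are made-up placeholders, not her actual layout:

    # run_tests_with_coverage.py -- run the unit tests under coverage and print a report
    import unittest
    import coverage

    cov = coverage.Coverage(source=["myaudio"])   # restrict measurement to the project package
    cov.start()

    suite = unittest.defaultTestLoader.discover("tests")
    unittest.TextTestRunner(verbosity=2).run(suite)

    cov.stop()
    cov.save()
    cov.report(show_missing=True)   # terminal summary listing uncovered line numbers
    cov.xml_report()                # writes coverage.xml, which services like codecov can ingest

The missing-line numbers in the report are what point at the extra tests worth writing, and the XML report is roughly the piece a CI job would hand to a service like codecov, as in Todd's message above.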
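A minimal sketch of the command line tests in point 2, assuming (purely for illustration) a parser with a --harmonics option; argparse signals bad arguments by exiting, which surfaces in a test as SystemExit:

    import argparse
    import unittest

    def build_parser():
        # Stand-in for the real CLI; the actual program has many more options.
        parser = argparse.ArgumentParser(prog="analyze")
        parser.add_argument("--harmonics", type=int, default=4)
        return parser

    class TestCommandLine(unittest.TestCase):
        def test_parses_valid_arguments(self):
            args = build_parser().parse_args(["--harmonics", "4"])
            self.assertEqual(args.harmonics, 4)

        def test_rejects_non_integer_harmonics(self):
            # argparse reports invalid values by calling sys.exit(), i.e. raising SystemExit.
            with self.assertRaises(SystemExit):
                build_parser().parse_args(["--harmonics", "four"])

    if __name__ == "__main__":
        unittest.main()

Checks like "asking for 4 harmonics yields 4 columns" would run the full command (for example via subprocess) and inspect its output in the same style.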
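And a sketch of the extended TestCase from point 3: a small assertAllClose helper wrapping numpy.allclose, with invented reference numbers standing in for values exported from the Matlab scripts:

    import unittest
    import numpy as np

    class NumericTestCase(unittest.TestCase):
        """unittest.TestCase plus a float-tolerant array assertion."""

        def assertAllClose(self, actual, expected, rtol=1e-7, atol=0.0):
            if not np.allclose(actual, expected, rtol=rtol, atol=atol):
                self.fail("arrays differ beyond tolerance:\n%r\n!=\n%r" % (actual, expected))

    class TestAgainstMatlabReference(NumericTestCase):
        def test_matches_reference_output(self):
            # In the real tests the reference values would come from the Matlab scripts;
            # these numbers are invented purely for illustration.
            reference = np.array([1.0, 0.5, 0.25])
            result = np.array([1.0, 0.5, 0.25]) + 1e-12
            self.assertAllClose(result, reference)

    if __name__ == "__main__":
        unittest.main()

The same assertion covers points 4 and 5: pin down a handful of outputs under default parameters for cross-platform and version-to-version regression checks, then repeat with non-default parameters to confirm the numbers actually move.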
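Finally, sketches for points 6 and 7: an exact-answer helper (half-away-from-zero rounding, which differs from Python's built-in round()) and a deliberately triggered exception. Both functions here are toy stand-ins for the real code:

    import math
    import unittest

    def round_half_away_from_zero(x):
        # Toy stand-in: 2.5 -> 3 and -2.5 -> -3, unlike banker's rounding.
        return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

    def set_harmonics(n):
        # Toy stand-in for code that validates its input and raises on bad values.
        if n < 1:
            raise ValueError("number of harmonics must be at least 1")
        return n

    class TestExactHelpers(unittest.TestCase):
        def test_half_away_from_zero(self):
            self.assertEqual(round_half_away_from_zero(2.5), 3)
            self.assertEqual(round_half_away_from_zero(-2.5), -3)
            self.assertEqual(round_half_away_from_zero(2.4), 2)

    class TestExceptions(unittest.TestCase):
        def test_rejects_zero_harmonics(self):
            # One test per raise site: trigger it on purpose, check the exception type.
            with self.assertRaises(ValueError):
                set_harmonics(0)

    if __name__ == "__main__":
        unittest.main()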
On Tue, Jul 18, 2017 at 9:38 AM, Paul Wilson <[email protected]> wrote:

Hi Terri,

I'll briefly add that testing is asymptotic (as suggested by Titus below), so it may be difficult to have "every" test. We rely on code review to help identify missing tests, particularly for new code, but also for older code.

Paul

--
Paul P.H. Wilson
Grainger Professor of Nuclear Engineering
443 Engineering Research Bldg, 1500 Engineering Dr, Madison, WI 53706
[email protected] | 608-263-0807 | calendar: http://go.wisc.edu/pphw-cal
Computational Nuclear Engineering Research Group: http://cnerg.engr.wisc.edu/

On 07/17/2017 11:11 AM, C. Titus Brown wrote:

Hi Terri,

I think lots of people in the scientific Python community write their own algorithms and test them. But it's hard to give generic advice here, I think, because it's so dependent on your algorithm. Here's my try: the approach that has worked well for us over the last decade or so.

* First, write automated "smoke" tests that check whether your code is basically running and working. They should be as dumb and robust as possible, e.g. the equivalent of "check that 2 + 2 = 4". These are by far the most important in my experience, in that they deliver the most value for the least effort. (A minimal sketch of such a smoke test appears further down this digest.)

* Set up CI on those tests.

* Check code coverage of your code base, and try to get it to 30-40% by testing the basic code paths.

* Write a series of basic tests for edge cases (divide by zero, boundary conditions, that kind of thing), trying to cover another 10-20%.

* As your code base matures and complexifies, write tests for new functionality and try to cover old functionality as well. Here code coverage is your friend in terms of targeting effort.

* Whenever you discover a bug, write a test against that bug before fixing it. That way your most error-prone bits will get more coverage adaptively. I call this "stupidity driven testing."

Lather, rinse, repeat.

tl;dr? Smoke tests, code coverage analysis, test against buggy code.

best,
--titus

On Mon, Jul 17, 2017 at 11:50:59AM -0400, Terri Yu wrote:

Thanks everyone, those are interesting resources for testing in general. I'm using Python's unittest framework and everything is already set up. The specific problem I need help with is what tests to write in order to test numerical floating point output from algorithms. Given the responses I've gotten, it seems like not many people write their own algorithms and/or test them.

Terri

On Sun, Jul 16, 2017 at 5:50 PM, Jeremy Gray <[email protected]> wrote:

Hi Terri,

It might also be worth checking out the workshop from this year's PyCon by Eric Ma, "Best Testing Practices for Data Science", on YouTube here: https://www.youtube.com/watch?v=yACtdj1_IxE

The GitHub repo is here: https://github.com/ericmjl/data-testing-tutorial

Cheers,
Jeremy

On Fri, Jul 14, 2017 at 5:21 PM, Olav Vahtras <[email protected]> wrote:

Dear Terri,

In addition I can recommend the following resource: pythontesting.net has a podcast series on testing and more. Also check out the new book on pytest by the site maintainer, Brian Okken.

Regards,
Olav
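To illustrate the "smoke test" idea from Titus's message above, here is a minimal sketch in which analyze() is a tiny stand-in for a real entry point; the test only asserts that the call completes and returns something of the expected shape:

    import unittest
    import numpy as np

    def analyze(signal, n_harmonics=4):
        # Tiny stand-in for the real entry point, just so the sketch runs.
        return np.vstack([signal * k for k in range(1, n_harmonics + 1)])

    class SmokeTests(unittest.TestCase):
        """Dumb, robust checks in the spirit of 'check that 2 + 2 = 4'."""

        def test_runs_on_tiny_input(self):
            result = analyze(np.zeros(8))
            self.assertEqual(result.shape, (4, 8))

    if __name__ == "__main__":
        unittest.main()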
On 14 July 2017 at 21:36, Ashwin Srinath <[email protected]> wrote:

If you're using Python, numpy.testing has the tools you'll need: https://docs.scipy.org/doc/numpy/reference/routines.testing.html

There's also pandas.testing for testing code that uses Pandas.

Thanks,
Ashwin

On Fri, Jul 14, 2017 at 3:27 PM, Terri Yu <[email protected]> wrote:

Hi everyone,

Are there any resources that explain how to write unit tests for scientific software? I'm writing some software that processes audio signals, and there are many parameters. I'm wondering what's the best way to test floating point numeric results. Do I need to test every single parameter? How can I verify the accuracy of numeric results... use a different language or library?

I would like to do a good job of testing, but I also don't want to write a bunch of semi-useless tests that take a long time to run. I would appreciate any thoughts you have.

Thank you,
Terri
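For reference, the numpy.testing helpers Ashwin points to look roughly like this in use (the numbers are invented). assert_allclose raises an ordinary AssertionError with a mismatch report, so it can also be called from inside unittest test methods:

    import numpy as np
    from numpy.testing import assert_allclose

    expected = np.array([0.1, 0.2, 0.3])          # invented reference values
    computed = np.array([0.1, 0.2, 0.3]) + 1e-9   # pretend output of the code under test

    # Passes if every element agrees within the relative/absolute tolerances,
    # otherwise raises an AssertionError showing where and by how much they differ.
    assert_allclose(computed, expected, rtol=1e-7, atol=1e-8)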
_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/listinfo/discuss
