I find that most people new to data analysis don’t really “get” testing.
However, if you frame it as controls, you might get through! So, maybe: “You should run your data analysis script on some data where you’re pretty sure you know the answer. This could be because you simulated the data, or because it’s from a published report*, or because you went in and tweaked the data manually so as to make fake data with a specific effect. Then make sure you see the effect you wanted to see!”

After that, most of the advice will be context-specific. But I’ve found students (and advisors!) kinda get the controls question. Good advisors will already have asked these questions of their students, of course ;).

cheers,
—titus

> On May 10, 2017, at 9:48 AM, Naupaka Zimmerman <[email protected]> wrote:
>
> Hi all,
>
> I've been thinking recently about the best way to incorporate testing
> (regression/unit/etc.) into routine data analysis scripts, both for my own
> work and when teaching (e.g. a graduate-level bioinformatics class).
>
> Conceptually it seems straightforward to incorporate tests when developing a
> package or series of functions meant for reuse, but I am wondering whether
> there is a community-endorsed best-practice way to bring this defensive
> programming mentality into more ad-hoc analyses.
>
> I'm most familiar with the defensive programming approaches in the R world
> (stopifnot, testthat, assertr), but I'm interested in general answers to
> the question.
>
> We had some discussion about this on an issue in the r-novice-gapminder repo
> a while back.
>
> Thanks in advance for any input!
>
> Best,
> Naupaka
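A minimal sketch of the “controls” idea described above, in R (matching the tools named in the thread). The analysis step, effect size, and thresholds here are all hypothetical; the point is only the shape: plant a known effect in fake data, run the real analysis, and assert that the effect comes back out.

    # Simulate a "control" dataset with a known, planted effect: group B's
    # mean is shifted by a fixed amount relative to group A.
    set.seed(42)
    known_shift <- 2.5  # hypothetical effect we planted in the fake data
    fake_data <- data.frame(
      group = rep(c("A", "B"), each = 100),
      value = c(rnorm(100, mean = 0), rnorm(100, mean = known_shift))
    )

    # Run the same analysis step the real script would use
    # (here, a plain two-sample t-test stands in for it).
    fit <- t.test(value ~ group, data = fake_data)
    estimated_shift <- unname(diff(fit$estimate))  # mean(B) - mean(A)

    # Make sure we see the effect we planted: the estimate should be close
    # to the known shift, and the difference clearly significant.
    stopifnot(abs(estimated_shift - known_shift) < 0.5)
    stopifnot(fit$p.value < 0.01)

Run on data with no planted effect, the same script makes a negative control: the assertions should then check that no effect is reported.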

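For the in-script defensive checks Naupaka asks about, base R's stopifnot() is enough for a quick sketch; the file name and column expectations below are made up for illustration, and testthat or assertr would give friendlier failure messages.

    # Hypothetical ad-hoc analysis script: assert assumptions about the
    # input data up front, so silent upstream changes fail loudly.
    counts <- read.csv("samples.csv")  # made-up file name
    stopifnot(
      all(c("sample_id", "count") %in% names(counts)),  # expected columns exist
      nrow(counts) > 0,                                 # file is not empty
      !anyDuplicated(counts$sample_id),                 # one row per sample
      all(counts$count >= 0)                            # counts are non-negative
    )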