I find that most people new to data analysis don’t really “get” testing.

However, if you frame it as controls, you might get through!

So, maybe:

“You should run your data analysis script on some data where you’re pretty sure 
you know the answer. This could be because you simulated the data, or because 
it’s from a published report*, or because you went in and tweaked the data 
manually so as to make fake data with a specific effect. Then make sure you see 
the effect you wanted to see!”
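
For example (a minimal sketch in base R; the effect size, the t-test, and 
every number here are invented purely for illustration):

    # control: simulate fake data with a known, planted effect (+2 in the
    # treatment group), then check that the analysis recovers it
    set.seed(42)
    control <- data.frame(group = "control",   value = rnorm(100, mean = 10))
    treated <- data.frame(group = "treatment", value = rnorm(100, mean = 12))
    fake    <- rbind(control, treated)

    fit <- t.test(value ~ group, data = fake)

    stopifnot(fit$p.value < 0.05)       # the planted effect is detected
    stopifnot(diff(fit$estimate) > 1)   # and its size is in the right ballpark

If the script ever stops seeing the effect you planted, you know something 
broke.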

After that, most of the advice will be context-specific. But I’ve found that 
students (and advisors!) kinda get the controls question. Good advisors will 
already have asked these questions of their students, of course ;).

cheers,
—titus

> On May 10, 2017, at 9:48 AM, Naupaka Zimmerman <[email protected]> wrote:
> 
> Hi all,
> 
> I've been thinking recently about the best way to incorporate testing 
> (regression/unit/etc.) into routine data analysis scripts, both for my own 
> work and when teaching (e.g. a graduate-level bioinformatics class).
> 
> Conceptually it seems straightforward to incorporate tests when developing a 
> package or series of functions meant for reuse, but I am wondering if there 
> is a community-endorsed best-practice way to incorporate this defensive 
> programming mentality into more ad-hoc analyses.
> 
> I'm most familiar with the defensive programming approaches in the R world 
> (stopifnot, testthat, assertr), but I'm mainly interested in general answers 
> to the question.
> 
> We had some discussion about this on an issue in the r-novice-gapminder repo 
> a while back.
> 
> Thanks in advance for any input!
> 
> Best,
> Naupaka
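
P.S. Naupaka: for the ad-hoc case, even base R's stopifnot() sprinkled at the 
points where you already know what must be true goes a long way (testthat and 
assertr give you nicer failure messages, but the idea is the same). A sketch, 
with made-up column names and toy data standing in for a real input file:

    # toy data standing in for a real input (everything here is made up)
    dat <- data.frame(sample    = c("a", "b", "c"),
                      abundance = c(12, 7, 31))

    # guard the assumptions the downstream analysis relies on
    stopifnot(nrow(dat) > 0)                 # the input wasn't empty
    stopifnot(!any(is.na(dat$abundance)))    # no missing abundances
    stopifnot(all(dat$abundance >= 0))       # counts can't be negative

    mean(dat$abundance)                      # the actual analysis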

_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/listinfo/discuss
