Hi everyone- I'm currently teaching a graduate course in statistics for linguistics using R. I have used up most of the 'authentic' data I have been able to collect for homework and demonstrations. I can think of plenty more possible data sets, but I am finding it challenging to create them, and my creations are often somewhat unrealistic (generally too 'neat' and obvious).
So, I was wondering if anyone had any tips on creating 'realistic' data sets, or links or books that describe how to do it. For a simple example, let's say I want to create a data set with students from different countries and academic departments who took an English test. I want to build in some differences (some significant, some not) and possibly even interactions among the scores by country and department. I have been doing this through various combinations of sample(), rnorm(), and jitter() to get some randomness, but things are still coming out pretty neatly. Is this the right (or a good) method? Advice?

Thanks in advance-
SFK

--
Scott F. Kiesling, PhD
Associate Professor
Department Chair
Department of Linguistics
University of Pittsburgh, 2816 CL
Pittsburgh, PA 15260
http://www.linguistics.pitt.edu
Office: +1 412-624-5916

_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
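To make the English-test example concrete, here is a minimal sketch of one way such a simulation might look. It is only an illustration under stated assumptions: the country and department labels, the sample size, every effect size, and the noise levels are invented, and the sources of "messiness" (unbalanced groups, unequal variances, a few outliers, rounding, missing values) are just one possible selection.

```r
set.seed(42)  # fixed seed so the same data set can be regenerated

# Hypothetical factor levels and sample size (assumptions, not real data)
countries   <- c("Brazil", "China", "Germany")
departments <- c("Engineering", "Linguistics", "Business")
n <- 300

# Unbalanced groups: unequal sampling probabilities give uneven cell sizes
d <- data.frame(
  country    = factor(sample(countries, n, replace = TRUE,
                             prob = c(0.5, 0.3, 0.2)), levels = countries),
  department = factor(sample(departments, n, replace = TRUE,
                             prob = c(0.4, 0.25, 0.35)), levels = departments)
)

# Invented effects on a 0-100 test scale
grand_mean     <- 70
country_eff    <- c(Brazil = 0, China = -4, Germany = 5)
# The Business effect is deliberately tiny, so it may or may not reach significance
department_eff <- c(Engineering = 0, Linguistics = 3, Business = 0.5)
# Interaction: an extra shift for one country-by-department cell only
interaction_eff <- ifelse(d$country == "Germany" &
                          d$department == "Linguistics", 6, 0)

# Unequal variances: each department gets its own noise SD
noise_sd <- c(Engineering = 8, Linguistics = 12, Business = 10)

mu <- unname(grand_mean +
             country_eff[as.character(d$country)] +
             department_eff[as.character(d$department)] +
             interaction_eff)
score <- rnorm(n, mean = mu, sd = unname(noise_sd[as.character(d$department)]))

# A handful of unusually low scores (outliers)
low <- sample(n, 5)
score[low] <- score[low] - 25

# Whole-number scores, bounded to the 0-100 scale
d$score <- pmin(100, pmax(0, round(score)))

# A few missing values, as in most collected data
d$score[sample(n, 8)] <- NA

# Check what a two-way ANOVA recovers from the noisy data
summary(aov(score ~ country * department, data = d))
```

Changing the seed, shrinking the effects, or raising the noise SDs changes which differences come out significant, so the same script can produce several homework variants of differing difficulty.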
