Dear Scott,

I have a strong preference for using real data. That said, one strategy for manufacturing data is to simulate a statistical model, either the same model that will be used to analyze the data (in which case the latter is the "true model") or a different model. For example, if you want some terms in the model to be "non-significant," you'll get that with high probability if these are omitted from the model used to generate the data. Similarly, you can generate outliers by sampling errors from a heavy-tailed or suitable mixture distribution.
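As a rough illustration of the strategy above, here is a minimal sketch in R. The country and department labels, the cell size, the effect sizes, and the mixture proportions are all hypothetical values chosen for the example, not anything taken from a real data set. The generating model includes a country effect but no department term, and the errors come from a two-component normal mixture so that a few outliers turn up.

```r
# Sketch only: every name and parameter value here is an assumption
# made for illustration.
set.seed(123)  # make the simulated data reproducible

countries   <- c("CountryA", "CountryB", "CountryC")  # hypothetical labels
departments <- c("DeptX", "DeptY", "DeptZ")           # hypothetical labels

# Balanced design: 20 students in each country-by-department cell
dat <- expand.grid(student = 1:20,
                   country = countries,
                   department = departments)
n <- nrow(dat)

# Generating ("true") model: country shifts the mean score.
# Department is deliberately left out, so it has no real effect.
country_effect <- c(CountryA = 0, CountryB = 3, CountryC = 6)
mu <- 70 + unname(country_effect[as.character(dat$country)])

# Mixture errors: most come from N(0, 5^2), roughly 5% from N(0, 20^2)
is_outlier <- rbinom(n, size = 1, prob = 0.05) == 1
err <- rnorm(n, mean = 0, sd = ifelse(is_outlier, 20, 5))
# Heavy-tailed alternative: err <- 5 * rt(n, df = 3)

dat$score <- round(mu + err, 1)

# Analyze with a model that includes the terms omitted from the generator
fit <- lm(score ~ country * department, data = dat)
anova(fit)
```

Because department and the interaction are absent from the generating model, their tests should come out non-significant in most simulated data sets, though about 5% of runs will still reject at the 0.05 level by chance. Changing the seed, the effect sizes, or the outlier proportion gives messier or cleaner versions of the same exercise.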
I hope this helps,
John

> -----Original Message-----
> From: [email protected] [mailto:r-sig-teaching-boun...@r-project.org] On Behalf Of Scott F. Kiesling
> Sent: March-24-09 4:04 PM
> To: [email protected]
> Subject: [R-sig-teaching] Creating data
>
> Hi everyone-
>
> I'm currently teaching a graduate course in statistics for linguistics
> using R. I have used up most of the 'authentic' data I have been able
> to collect for homework and demonstrations. I can think of plenty more
> possible data sets, but I am finding the creation of them challenging,
> and my creations are often somewhat unrealistic (generally, too
> 'neat' and obvious).
>
> So, I was wondering if anyone had any tips on creating 'realistic'
> data sets, or links/books that describe it.
>
> For a simple example, let's say I want to create a dataset with
> students from different countries and academic departments who took an
> English test. I want to make some differences (significant and not)
> and possibly even interactions among the scores by country and
> department. I have been doing this through various iterations of
> sample(), rnorm(), and jitter() to get some randomness, but things
> are still coming out pretty neatly. Is this the right (or a good)
> method? Advice?
>
> Thanks in advance-
>
> SFK
>
> --
> Scott F. Kiesling, PhD
>
> Associate Professor
> Department Chair
>
> Department of Linguistics
> University of Pittsburgh, 2816 CL
> Pittsburgh, PA 15260
> http://www.linguistics.pitt.edu
> Office: +1 412-624-5916
>
> _______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
