[EMAIL PROTECTED] (David C. Howell) wrote in message news:<[EMAIL PROTECTED]>...
> On one of my web pages I describe an algorithm for generating data with a
> specified correlation between X and Y. The important part of the algorithm is
>
>   * Use the normal random number function available in almost all
>     software to generate two random variables (X and Y).
>   * Standardize these variables to mean = 0, sd = 1.
>   * Calculate a = r/sqrt(1-r^2), where r is the desired correlation.
>   * Calculate Z = a*X + Y.
>   * Adjust the means and variances of X and Z to what you want them to be
>     by simple linear transformations (e.g., Xnew = Xold*NewSD + NewMean).
>   * Now the correlation between X and Z will be r.
>   * The mean of Z will be 0.00, and its standard deviation will be
>     sqrt(a^2 + 1).
>   * If you don't standardize the variables I would assume that the
>     resulting r will come from a population where rho = r, but I haven't
>     worked this out. If anyone knows for sure, I'd appreciate hearing.
>
> I have recently been asked for the source of that algorithm. It has been
> around for a long time, and I am certainly not the first to recommend it,
> but I do not know its source. Can anyone help?
>
> Also, does anyone have an opinion about the last item in that list?

Yes, omitting the preliminary standardization will give you a sample from a
population whose correlation is r; the sample correlation will generally not
equal r.
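
To illustrate, here is a minimal Python sketch of the version without the
preliminary standardization. The names correlated_pair and pearson, the seed,
and the sample sizes are my own choices for illustration; they do not come
from the web page or from any reference. The reasoning behind it: if X and Y
are independent standard normals and Z = a*X + Y, then Cov(X, Z) = a and
Var(Z) = a^2 + 1, so the population correlation is a/sqrt(a^2 + 1), which
reduces to r when a = r/sqrt(1-r^2).

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlated_pair(n, r, seed=None):
    """Draw n (X, Z) pairs from a population with correlation r.

    No preliminary standardization is done, so r is the population
    value; the sample correlation varies from draw to draw.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    rng = random.Random(seed)  # seed is only for reproducibility
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    a = r / math.sqrt(1.0 - r * r)
    z = [a * xi + yi for xi, yi in zip(x, y)]
    return x, z


if __name__ == "__main__":
    # Small samples scatter around r; large samples settle close to it.
    for n in (20, 200, 20000):
        x, z = correlated_pair(n, 0.6, seed=n)
        print(n, round(pearson(x, z), 4))
```

Rescaling X and Z afterwards (Xnew = Xold*NewSD + NewMean, with NewSD > 0)
leaves the correlation unchanged, so that step can be applied to the output
as described in the list above.
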
See Abramowitz & Stegun, sec 26.8.6.b, p 953, for a similar algorithm (that
omits the preliminary standardization).
