On Wed, 02 Apr 2003 22:28:59 GMT, "Arthur J. Kendall" <[EMAIL PROTECTED]> wrote:
> The following SPSS syntax shows how to generate a sample with any mean
> and sd and then rescale to a given mean and sd.
> I just tried 3 kinds of random variable functions from a couple dozen or
> so that are available in SPSS
>
> I can't think offhand how to adjust for skewness and kurtosis.
> Why do you want to do this?

It is my opinion that the original request was mis-aimed, and we ought to show what *should* have been asked. This is from the note that started this thread:

" Is it possible to generate a sample that has a specific mean, SD (also skewness, kurtosis)? To be more specific, I do not want to sample from a given distribution (like the normal distribution) that has a specific mean and SD. If sampled from a normal distribution with a given mean and SD, of course the mean of the sample is not the exact same as the mean of the normal distribution."

This is mainly a reply to that person - even though I have dropped the attribution, sorry.

Okay. When we are doing 10,000 randomizations to validate, we *never* would ask for the exact mean and SD. That is not wrong in an obvious way; but it is not the way it is done. Right? The *real world* does not give us exact means.... The only time to feed in one "exact" randomized set would be for some sort of, say, "calibration".

Next: The question about fixing the skewness and kurtosis reminds me that someone (probably) is trying to show how some procedure works, under various conditions.

All right: Published claims of robustness do *not* show how something works with X amount of skewness as the primary thing. Instead, they show the results for two or three *kinds* of simple distributions, or for contaminated distributions.
For example: Here is how test T works for Normal, with large N and small N; how it works for uniform; how it works with a dichotomy; how it works for data that are exponential; how it works with a Normal (0,1) that is contaminated with 10% of its cases coming from a second Normal where the variance is 10. (John Tukey liked to model "contamination" like this, claiming it was a realistic hazard for real data.)

Some Monte Carlo data once showed me that skewness that was lognormal (from taking exp(x), for normal(x)) damaged my subsequent testing *more* than the same measured skewness when it was from squaring (taking (x+C) squared). But I didn't think about the kurtosis at all.

Anyway, if you want to do randomizations that are publishable -- or comparable to other people's -- you want to come closer to matching what is in the literature.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
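
A minimal Python sketch of the kinds of test distributions listed above, plus the one-off "calibration" case of rescaling a sample to an exact mean and SD. It is not taken from the original posts, which referred to SPSS syntax. The function names, the seed, the 50/50 split for the dichotomy, the rate of 1 for the exponential, and the mean of 0 for the contaminating Normal are all assumptions made for illustration.

```python
import random
import statistics


def rescale_exact(xs, target_mean, target_sd):
    """Linearly rescale xs so its sample mean and SD (n-1 denominator)
    equal the targets, up to floating-point rounding."""
    m = statistics.mean(xs)
    s = statistics.stdev(xs)
    return [target_mean + target_sd * (x - m) / s for x in xs]


def contaminated_normal(n, rng, frac=0.10, contam_var=10.0):
    """Normal(0,1) where each case has probability `frac` of coming
    instead from a second Normal with variance `contam_var`.
    Assumption: the second Normal also has mean 0."""
    contam_sd = contam_var ** 0.5
    return [
        rng.gauss(0.0, contam_sd) if rng.random() < frac else rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


def make_samples(n, seed=12345):
    """One sample of size n from each kind of distribution in the list."""
    rng = random.Random(seed)
    return {
        "normal": [rng.gauss(0.0, 1.0) for _ in range(n)],
        "uniform": [rng.random() for _ in range(n)],
        # Assumption: equal-probability 0/1 dichotomy.
        "dichotomy": [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)],
        # Assumption: rate parameter 1.
        "exponential": [rng.expovariate(1.0) for _ in range(n)],
        "contaminated": contaminated_normal(n, rng),
    }


samples = make_samples(1000)

# "Calibration" case only: force one sample to an exact mean and SD.
calibration_set = rescale_exact(samples["normal"], 50.0, 10.0)
print(statistics.mean(calibration_set), statistics.stdev(calibration_set))
```

In an actual Monte Carlo run, make_samples would be called once per replication with a different seed, and the samples would be left unrescaled, so that the means and SDs vary the way they do in real data.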
