"Herman Rubin" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > In article <[EMAIL PROTECTED]>, > jackson marshmallow <[EMAIL PROTECTED]> wrote: > >Hello everyone, > > >I hope I can get simple answers to these questions... I need to solve a > >couple of practical problems and I'm new to statistics... > > >1) Two samples of are given and I need to compare their means and variances. > >The distribution of the population is unknown. Can I use the F-test and the > >t-test? Is it necessary that the sample _means_ have a Gaussian > >distribution? Is it sufficient? Maybe I misunderstand something here... > > If the population is not normal, the sample means CANNOT > have a normal distribution. However, it gets closer to > normal with increasing sample size. >
For practical purposes, aside from the restrictions of the central limit
theorem, I thought I could consider the distribution of sample means to be
normal.

> The t-test is approximate if the data is not normal. Is
> the error in the significance levels more important than
> using bad levels in the first place? As for the F-test,

What do you mean by bad levels?

> this is quite sensitive to the distribution. Again, just
> what are you after? Do you want means and variances at
> all? Or do you want something else?
>

Well, first of all I wanted to gain a better understanding of statistical
analysis. I'm writing some software to analyze experimental data at the
company where I work... I have implemented some (basic) statistical methods,
but more needs to be done...

In some cases I will definitely have to compare means.

One of the problems where I may need to compare variances is curve fitting --
there are some theories which predict different curves for these data... So,
to be more specific, estimated error variances may need to be compared. The
sticking point for me is that if a curve is "wrong", then I think it's
reasonable to expect that the distribution of residuals will not be normal...

Also, one of the complications is that one of the curves we need to try has
as many parameters as there are distinct values of the independent variable
in the available experimental data, so in a sense there are 0 degrees of
freedom.

From what I have read so far, there doesn't seem to be a great universal
criterion of goodness of fit...

> If you want to test equality of distributions, a nonparametic
> test might be a good idea. As you express an interest in
> comparing variances, I would suggest the Kuiper test rather
> than the Kolmogorov-Smirnov test. It is almost as good as
> a test for location, and much better for scale. For a
> decision understanding of this, I suggest my paper with
> Sethuraman in Sankhya 1965, and my paper in the Sixth
> Berkeley Symposium.

I don't need to test equality of distributions, at least as far as I
understand.

>
> >2) I need to calculate the significance of correlation between two
> >sequences. I would actually prefer to use randomization, but the sequences
> >may be too short. Another option is to perform linear regression and
> >calculate the significance of the slope using a t-test (?). When is it
> >valid?
>
> >Again, I'm looking for simple answers, if they exist... Thanks in advance!
>
> If there is not a problem of dependence between different
> points, the Spearman rank correlation or Kendall tau might
> be a good idea.
>
>

Here I think I have reinvented the wheel already... For one of the analyses
I worked out my own non-parametric method, but of course it turned out to
have been invented already -- apparently it's known as Somers' D and is a
variant of Kendall's tau.

Thank you very much for your reply!

> --
> This address is for information only. I do not claim that these views
> are those of the Statistics Department or of Purdue University.
> Herman Rubin, Department of Statistics, Purdue University
> [EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558
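
On the randomization idea in question 2), here is a minimal sketch of what such a test could look like. It is only an illustration under stated assumptions: Pearson's r is used as the test statistic (a rank statistic could be swapped in), both sequences are assumed to be non-constant, and the names pearson_r and permutation_p_value as well as the defaults max_exact, n_random and seed are made up for this example.

```python
import itertools
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, max_exact=8, n_random=10000, seed=0):
    """Two-sided randomization p-value for the correlation of x and y.

    If len(x) <= max_exact, every ordering of y is enumerated (exact test).
    Otherwise n_random random shuffles of y are used (Monte Carlo estimate).
    The default values are illustrative assumptions, not recommendations.
    """
    observed = abs(pearson_r(x, y))
    tol = 1e-12  # guards against floating-point ties

    if len(x) <= max_exact:
        count = total = 0
        for perm in itertools.permutations(y):
            total += 1
            if abs(pearson_r(x, perm)) >= observed - tol:
                count += 1
        return count / total

    rng = random.Random(seed)
    shuffled = list(y)
    count = 0
    for _ in range(n_random):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - tol:
            count += 1
    # Include the observed ordering itself so the estimate is never zero.
    return (count + 1) / (n_random + 1)
```

This also shows why very short sequences are a problem for randomization: with n points there are only n! orderings, so the exact p-value can never fall below 1/n!. With 3 points, for instance, it cannot drop below 1/6, however strong the observed correlation is.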
