"Bill Rowe" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> In article <[EMAIL PROTECTED]>,
> "jackson marshmallow" <[EMAIL PROTECTED]> wrote:
>
> > 1) Two samples are given and I need to compare their means and variances.
> > The distribution of the population is unknown. Can I use the F-test and the
> > t-test? Is it necessary that the sample _means_ have a Gaussian
> > distribution? Is it sufficient? Maybe I misunderstand something here...
>
> Both the t-test and F-test assume samples are from a normal (Gaussian)
> distribution. The t-test is reasonably robust, i.e., the sample
> distribution needs to deviate quite a bit from normal before conclusions
> based on the t-test are likely to be invalid. But, the F-test is more
> sensitive to deviations from normality and probably shouldn't be used if
> there is reason to suspect the sample distribution isn't normal.
>
First of all, thanks for your reply. I thought that normality of the
distribution of the means was enough for these tests, but now that I have
thought about this a little more I see that I was wrong... For example, as I
understand it, the t-statistic is essentially the ratio of the sample mean to
the estimated standard error of the sample mean; so, strictly speaking, for
an arbitrary distribution of values in the sample the estimated standard
error could be anything and this statistic is meaningless...

> > 2) I need to calculate the significance of correlation between two
> > sequences. I would actually prefer to use randomization, but the sequences
> > may be too short. Another option is to perform linear regression and
> > calculate the significance of the slope using a t-test (?). When is it
> > valid?
>
> Linear regression (least squares) assumes a model of the form
>
> y_n = m x_n + b + e_n
>
> where m and b are the desired regression parameters and e_n is the error
> associated with observation y_n. Further, it is assumed the e_n are from
> a normal distribution.
>
> It isn't clear to me whether this model is applicable to your problem
> with sequences.
>

Let's forget sequences for now... Suppose

y_n = f(x_n) + e'_n

where f is some monotonic function that is actually non-linear, and e' is
normally distributed. Then the e from linear regression will not be normally
distributed (?) and I cannot use the t-test to find the significance of the
correlation.
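
To make the randomization option from question 2 concrete, here is a minimal sketch of a permutation test for the correlation, using only the Python standard library. Everything in it is an assumption chosen for illustration: the synthetic data (f = exp, noise with standard deviation 1, 20 points), the function names pearson_r and permutation_p_value, and the number of permutations. It relies only on the pairs being exchangeable when x and y are unrelated, so it does not need normally distributed regression residuals.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y.

    y is shuffled relative to x, which destroys any association while
    keeping both marginal distributions unchanged.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    # Count the observed arrangement as one of the permutations so the
    # estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: monotonic but non-linear f, normal errors e'.
data_rng = random.Random(1)
x = [i / 4 for i in range(20)]
y = [math.exp(xi) + data_rng.gauss(0.0, 1.0) for xi in x]

r = pearson_r(x, y)
p = permutation_p_value(x, y)
print(f"r = {r:.3f}, permutation p-value = {p:.4f}")
```

With short sequences the smallest attainable p-value is limited by the number of distinct permutations, which is presumably the concern behind "the sequences may be too short". Replacing the values by their ranks before calling pearson_r would give a rank (Spearman-type) version, which may suit a monotonic f better.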
