Re: when to use nonparametric statistics

jackson marshmallow Wed, 03 Dec 2003 06:20:14 -0800

"Bill Rowe" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> In article <[EMAIL PROTECTED]>,
>  "jackson marshmallow" <[EMAIL PROTECTED]> wrote:
>
>
> > 1) Two samples are given and I need to compare their means and variances.
> > The distribution of the population is unknown. Can I use the F-test and the
> > t-test? Is it necessary that the sample _means_ have a Gaussian
> > distribution? Is it sufficient? Maybe I misunderstand something here...
>
> Both the t-test and F-test assume samples are from a normal (Gaussian)
> distribution.  The t-test is reasonably robust, i.e., the sample
> distribution needs to deviate quite a bit from normal before conclusions
> based on the t-test are likely to be invalid. But, the F-test is more
> sensitive to deviations from normality and probably shouldn't be used if
> there is reason to suspect the sample distribution isn't normal.
>


First of all, thanks for your reply.

I thought that normality of the distribution of means was enough for these
tests, but now that I have thought about this a little more I see that I was
wrong... For example, as I understand it, the t-statistic is essentially the
ratio of the sample mean to the estimated standard error of the sample mean;
so, strictly speaking, for an arbitrary distribution of values in the sample
the estimated standard error could be anything and this statistic is
meaningless...
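
To make that ratio concrete for the two-sample case, here is a minimal sketch
in Python (standard library only). The names welch_t and variance_ratio and
the sample values are made up for illustration, and I am assuming the
unequal-variance (Welch) form of the statistic; turning either number into a
p-value still relies on the normality assumption discussed above.

```python
import math
import statistics

def welch_t(a, b):
    """Two-sample t statistic: difference of means over its estimated standard error."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

def variance_ratio(a, b):
    """F statistic for comparing two variances: ratio of the sample variances."""
    return statistics.variance(a) / statistics.variance(b)

# Illustrative data only, not from any real experiment.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]

t = welch_t(a, b)
f = variance_ratio(a, b)
print(t, f)
```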

> > 2) I need to calculate the significance of correlation between two
> > sequences. I would actually prefer to use randomization, but the sequences
> > may be too short. Another option is to perform linear regression and
> > calculate the significance of the slope using a t-test (?). When is it
> > valid?
>
> Linear regression (least squares) assumes a model of the form
>
> y_n = m x_n + b  + e_n
>
> where m and b are the desired regression parameters and e_n is the error
> associated with observation y_n. Further, it is assumed the e_n are from
> a normal distribution.
>
> It isn't clear to me whether this model is applicable to your problem
> with sequences.
>

Let's forget sequences for now... Suppose y_n = f(x_n) + e'_n, where f is some
monotonic function that is actually non-linear and e' is normally
distributed. Then e from linear regression will not be normally distributed
(?) and I cannot use the t-test to find the significance of correlation.
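
One way around the normality question in this situation is the randomization
idea from point 2, applied to ranks. The sketch below (Python, standard
library only) is just an illustration under my own assumptions: f is taken to
be an exponential, the noise level and sample size are arbitrary, the rank
correlation is Spearman's, and the helper names are hypothetical. The ranks
helper assumes there are no ties.

```python
import math
import random

def pearson_r(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Ranks 1..n of the values in v; assumes no ties."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman_rho(x, y):
    """Rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

def permutation_p(x, y, stat, n_perm=2000, seed=0):
    """Two-sided permutation p-value: shuffle y and recompute the statistic."""
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(stat(x, y_perm)) >= observed:
            hits += 1
    # Counting the observed arrangement keeps the p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Simulated data: monotonic but non-linear f, normally distributed noise.
rng = random.Random(1)
x = [i / 2 for i in range(1, 21)]
y = [math.exp(xi / 2) + rng.gauss(0, 0.5) for xi in x]

rho = spearman_rho(x, y)
p = permutation_p(x, y, spearman_rho)
print(rho, p)
```

The p-value here depends only on the ordering of the values, so it does not
change if y is replaced by any increasing transformation of y. With very short
sequences the number of distinct permutations limits how small p can get.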

