Re: when to use nonparametric statistics

jackson marshmallow Wed, 03 Dec 2003 06:21:36 -0800

"Herman Rubin" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> In article <[EMAIL PROTECTED]>,
> jackson marshmallow <[EMAIL PROTECTED]> wrote:
> >Hello everyone,
>
> >I hope I can get simple answers to these questions... I need to solve a
> >couple of practical problems and I'm new to statistics...
>
> >1) Two samples are given and I need to compare their means and variances.
> >The distribution of the population is unknown. Can I use the F-test and the
> >t-test? Is it necessary that the sample _means_ have a Gaussian
> >distribution? Is it sufficient? Maybe I misunderstand something here...
>
> If the population is not normal, the sample means CANNOT
> have a normal distribution.  However, it gets closer to
> normal with increasing sample size.
>


For practical purposes, aside from the restrictions of the central limit
theorem, I thought I could consider the distribution of sample means to be
normal.
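
A minimal sketch of that point, using only the Python standard library. It assumes an exponential population (a hypothetical choice, picked only because it is clearly skewed) and reports the skewness of the simulated sample means for a few sample sizes. The function names, seed, and repetition count are illustrative.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=m)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

def sample_mean_skewness(n, reps=5000, seed=0):
    """Skewness of the means of `reps` samples of size n from an exponential population."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return skewness(means)

# A normal distribution has skewness 0; watch how quickly the value shrinks.
for n in (2, 10, 50):
    print(n, round(sample_mean_skewness(n), 2))
```

For this population the skewness of the mean falls off like 2/sqrt(n), so how large a sample is "large enough" depends on how skewed the underlying data are.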

> The t-test is approximate if the data is not normal.  Is
> the error in the significance levels more important than
> using bad levels in the first place?  As for the F-test,

What do you mean by bad levels?

> this is quite sensitive to the distribution.  Again, just
> what are you after?  Do you want means and variances at
> all?  Or do you want something else?
>

Well, first of all I wanted to gain a better understanding of statistical
analysis. I'm writing some software to analyze experimental data at the
company where I work... I have implemented some (basic) statistical methods,
but more needs to be done...

In some cases I will definitely have to compare means.
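
One distribution-free way to compare two means is a randomization (permutation) test. The sketch below is a Monte Carlo version under the assumption that the two groups are exchangeable under the null hypothesis; the function name and the data are made up for illustration and do not come from the experiments discussed here.

```python
import random
import statistics

def permutation_test_means(a, b, reps=10000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (reps + 1)

# Made-up measurements, for illustration only.
group_a = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1]
group_b = [5.8, 6.1, 5.6, 6.4, 5.9, 6.2]
print(permutation_test_means(group_a, group_b))
```

With very small samples the number of distinct relabellings is limited, which bounds how small the p-value can get.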

One of the problems where I may need to compare variances is curve
fitting -- there are some theories which predict different curves for these
data... So, to be more specific, estimated error variances may need to be
compared. The sticking point for me is that if a curve is "wrong", then I
think it's reasonable to expect that the distribution of residuals will not
be normal...

Also, one of the complications is that one of the curves we need to try has
as many parameters as there are distinct values of the independent variable
from available experimental data, so in a sense there are 0 degrees of
freedom.
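
If replicate measurements exist at each value of the independent variable (an assumption; the data layout is not described here), the curve with one parameter per distinct value simply reproduces the group means, and the scatter around those means gives a "pure error" variance. A simpler candidate curve can then be compared against it with the classical lack-of-fit F ratio. The sketch below only computes the ratio and its degrees of freedom; the function name, data, and candidate curve are hypothetical.

```python
import statistics

def lack_of_fit_F(data, curve, n_params):
    """Lack-of-fit F ratio for an already-fitted candidate curve.

    data: dict mapping each x value to a list of replicate y values.
    curve: fitted function of x; n_params: number of parameters it used.
    Returns (F, df_lack_of_fit, df_pure_error).
    """
    n_total = sum(len(ys) for ys in data.values())
    n_levels = len(data)
    # Scatter of replicates around their own group mean (pure error).
    ss_pure = sum((y - statistics.fmean(ys)) ** 2
                  for ys in data.values() for y in ys)
    # Distance of the group means from the candidate curve (lack of fit).
    ss_lof = sum(len(ys) * (statistics.fmean(ys) - curve(x)) ** 2
                 for x, ys in data.items())
    df_pure = n_total - n_levels
    df_lof = n_levels - n_params
    return (ss_lof / df_lof) / (ss_pure / df_pure), df_lof, df_pure

# Made-up replicated data and a deliberately poor one-parameter curve.
data = {1: [2.1, 1.9], 2: [4.2, 3.8], 3: [6.1, 5.9], 4: [8.3, 7.7]}
print(lack_of_fit_F(data, lambda x: 5.0, n_params=1))
```

The ratio would be referred to an F distribution with the returned degrees of freedom, and that reference distribution itself relies on roughly normal errors, so the concern about non-normal residuals still applies.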

From what I have read so far, there doesn't seem to be a great universal
criterion of goodness of fit...


> If you want to test equality of distributions, a nonparametric
> test might be a good idea.  As you express an interest in
> comparing variances, I would suggest the Kuiper test rather
> than the Kolmogorov-Smirnov test.  It is almost as good as
> a test for location, and much better for scale.  For a
> decision understanding of this, I suggest my paper with
> Sethuraman in Sankhya 1965, and my paper in the Sixth
> Berkeley Symposium.

I don't need to test equality of distributions, at least as far as I
understand.

>
> >2) I need to calculate the significance of correlation between two
> >sequences. I would actually prefer to use randomization, but the sequences
> >may be too short. Another option is to perform linear regression and
> >calculate the significance of the slope using a t-test (?). When is it
> >valid?
>
> >Again, I'm looking for simple answers, if they exist... Thanks in advance!
>
> If there is not a problem of dependence between different
> points, the Spearman rank correlation or Kendall tau might
> be a good idea.
>
>

Here I think I have reinvented the wheel already... For one of the analyses
I worked out my own non-parametric method, but of course it turned out to
have been invented already -- apparently it's known as Somers' D and is a
variant of Kendall's tau.
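
For reference, a minimal sketch of how the two statistics relate, written from the textbook pair-counting definitions rather than from the method mentioned above. Both use the difference between concordant and discordant pairs; Kendall's tau-a divides by all pairs, while Somers' D of y given x divides only by the pairs that are not tied on x. Function names and data are illustrative.

```python
from itertools import combinations

def concordance_counts(x, y):
    """Return (concordant, discordant, pairs not tied on x)."""
    concordant = discordant = untied_x = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # pair tied on x: ignored by Somers' D(y|x)
        untied_x += 1
        sign = (x2 - x1) * (y2 - y1)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 here means a tie on y only: neither concordant nor discordant
    return concordant, discordant, untied_x

def kendall_tau_a(x, y):
    """Kendall's tau-a: (C - D) divided by the total number of pairs."""
    c, d, _ = concordance_counts(x, y)
    n = len(x)
    return (c - d) / (n * (n - 1) / 2)

def somers_d(x, y):
    """Somers' D of y given x: (C - D) divided by pairs not tied on x."""
    c, d, untied_x = concordance_counts(x, y)
    return (c - d) / untied_x

# Made-up data with ties on x, where the two statistics differ.
x = [1, 1, 2, 2]
y = [1, 2, 2, 3]
print(kendall_tau_a(x, y), somers_d(x, y))
```

Without ties on x the two values coincide, which is why Somers' D is usually described as an asymmetric variant of Kendall's tau.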


Thank you very much for your reply!

> --
> This address is for information only.  I do not claim that these views
> are those of the Statistics Department or of Purdue University.
> Herman Rubin, Department of Statistics, Purdue University
> [EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558


